Commit d46dc34

Merge pull request #3621 from AI-Hypercomputer:fix-math-rendering
PiperOrigin-RevId: 897257278
2 parents 38112ca + ae1c4e3 commit d46dc34

1 file changed

Lines changed: 1 addition & 1 deletion

docs/reference/performance_metrics.md

@@ -48,7 +48,7 @@ MFU = \\frac{\\text{theoretically optimal step time}}
 {\\text{measured step time}}
 $$
 
-Finally, we can also look at throughput utilization. In each training step, the model processes $(batch_size x seq_length)$ tokens. Since the (optimal or measured) number of tokens per second is just the number of tokens per step divided by step time (optimal or measured, respectively), we get that:
+Finally, we can also look at throughput utilization. In each training step, the model processes $\text{batch\_size} \times \text{seq\_length}$ tokens. Since the (optimal or measured) number of tokens per second is just the number of tokens per step divided by step time (optimal or measured, respectively), we get that:
 
 $$
 MFU = \\frac{\\text{theoretically optimal step time}}
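
The relationship described in the changed paragraph can be checked numerically. The sketch below is illustrative only: the function names and the sample numbers are hypothetical and do not come from the repository. It assumes that tokens per second equals tokens per step divided by step time, as the paragraph states, and shows that the ratio of optimal to measured step time then equals the ratio of measured to optimal tokens per second.

```python
def tokens_per_second(batch_size: int, seq_length: int, step_time_s: float) -> float:
    """Tokens per second: tokens per step (batch_size * seq_length) divided by step time."""
    if step_time_s <= 0:
        raise ValueError("step_time_s must be positive")
    return (batch_size * seq_length) / step_time_s


def step_time_ratio(optimal_step_time_s: float, measured_step_time_s: float) -> float:
    """Theoretically optimal step time divided by measured step time."""
    if measured_step_time_s <= 0:
        raise ValueError("measured_step_time_s must be positive")
    return optimal_step_time_s / measured_step_time_s


# Hypothetical values, chosen only to make the arithmetic easy to follow.
batch_size, seq_length = 8, 2048
optimal_step_time_s, measured_step_time_s = 0.5, 1.0

measured_tps = tokens_per_second(batch_size, seq_length, measured_step_time_s)
optimal_tps = tokens_per_second(batch_size, seq_length, optimal_step_time_s)

# The tokens-per-step factor cancels, so both ratios agree.
mfu_from_time = step_time_ratio(optimal_step_time_s, measured_step_time_s)
mfu_from_throughput = measured_tps / optimal_tps
```

Because the tokens-per-step term appears in both the numerator and the denominator, the result does not depend on the particular batch size or sequence length chosen.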
