A few fixes to the MFU/MBU code #1108
base: main
Conversation
Force-pushed from c2b547f to a68ba0d
Added back all docstrings and inline comments that were removed during the sliding window implementation. These comments explain the assumptions, calculations, and design decisions in the FLOP and memory bandwidth estimation code.

Changes:
- Restored docstrings for all ModelDims methods (attn_flops, mlp_flops, prefill_flops, decode_flops, flops, weight_memory_bytes, kv_cache_write_bytes, kv_cache_read_bytes, prefill_memory_bytes, decode_memory_bytes, memory_bytes)
- Restored inline comments explaining calculation details
- Kept all functionality changes (sliding window support, A100 bandwidth fix)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Changed attn_flops signature from using a boolean use_sliding_window flag
to accepting the sliding_window value directly as an Optional[int]. This
makes the API cleaner and more explicit.
Changes:
- attn_flops now takes sliding_window: Optional[int] = None instead of
use_sliding_window: bool = False
- Uses kv_len = min(kv_len, sliding_window or float("inf")) to handle
None case elegantly
- Updated all call sites in prefill_flops and decode_flops to pass
sliding_window=None for full attention layers and
sliding_window=self.sliding_window for sliding window layers
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
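The `or float("inf")` trick described above can be sketched as follows. Everything here is illustrative (the simplified FLOP formula, head dims folded into `hidden_size`), not the PR's exact code:

```python
from typing import Optional

def attn_flops(num_tokens: int, kv_len: int, hidden_size: int,
               sliding_window: Optional[int] = None) -> int:
    """Sketch of the new signature: pass the window size directly
    instead of a use_sliding_window boolean. The FLOP formula is a
    simplified stand-in for the real ModelDims.attn_flops."""
    # None means full attention; `or float("inf")` makes min() a no-op.
    kv_len = min(kv_len, sliding_window or float("inf"))
    # Two matmuls (QK^T and PV), 2 FLOPs per multiply-accumulate.
    return int(2 * 2 * num_tokens * kv_len * hidden_size)

# Call sites then pass sliding_window=None for full-attention layers
# and sliding_window=self.sliding_window for windowed layers.
```

One quirk of the `or` idiom: `sliding_window=0` is also treated as full attention, which is harmless since a zero-size window is meaningless.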
```python
prefill_flops = model_dims.flops([sequence_length], None)
decode_flops = total_flops - prefill_flops
decode_flops_in_gflops = decode_flops / 1e9
self.assertAlmostEqual(decode_flops_in_gflops, 27.92, delta=0.01)
```
This comes from some sanity checking that I did manually for the olmo3 paper.
```python
total_bytes *= 2
memory_in_gb = total_bytes / 1e9
self.assertAlmostEqual(memory_in_gb, 3.926, delta=0.01)
```
This comes from some sanity checking that I did manually for the olmo3 paper.
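For context on how such byte counts are typically assembled, here is a hedged sketch of per-token KV-cache byte accounting. The function name, arguments, and the bf16 dtype default are my assumptions, not the PR's code:

```python
def kv_cache_bytes_per_token(num_layers: int, num_kv_heads: int,
                             head_dim: int, dtype_bytes: int = 2) -> int:
    """Bytes written to the KV cache per generated token: K and V
    (factor 2) for every layer, assuming fp16/bf16 (2 bytes/element)."""
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes

# e.g. a 32-layer model with 8 KV heads of dim 128 in bf16:
per_token = kv_cache_bytes_per_token(32, 8, 128)
```

Multiplying such a per-token figure by the token count and dividing by 1e9 gives a decimal-GB number like the 3.926 asserted above.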
Previously, we would occasionally get >100%.
Fixes #1098. Also adds logging so we can reproduce calculations if there are future issues with MFU/MBU calculations.
Fixes were:
Note
Refactors and fixes MFU/MBU computations to account for sliding-window attention and multi-engine setups, updates GPU specs, integrates new APIs in training/benchmark paths, and adds targeted tests with reproduction cases.
- Adds `ModelDims.calculate_mfu`/`calculate_mbu`, `calculate_actor_utilization`, and `calculate_learner_utilization`, and uses them in `grpo_fast.py` and `benchmark_generators.py`.
- Adds `memory_bytes(prompt_lengths, num_engines, num_gpus_per_engine, ...)` to normalize MBU across parallel engines.
- Adds a `check_calculation` warning with repro JSON when MFU/MBU > 100%.
- `ModelDims.from_vllm_config` pulls heads directly from `hf_text_config` and captures sliding-window layer counts.
- Updates the A100 memory bandwidth spec to 2.0e12 B/s.
- Training/benchmark paths use the new `ModelDims` APIs and derive `num_engines`/`num_gpus_per_engine` for correct normalization.
- Adds reproduction cases in `open_instruct/test_data/mbu_reproduction_cases.json`.
- Adds tests in `test_utils.py` for FLOPs/memory, MFU/MBU (incl. multi-engine), and `from_vllm_config` parity.

Written by Cursor Bugbot for commit b839f17. This will update automatically on new commits.
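The multi-engine normalization and the >100% warning can be sketched together. This is a hypothetical illustration: the function and argument names are mine, and only the 2.0e12 B/s A100 bandwidth figure comes from the PR:

```python
import json
import logging

A100_BW_BYTES_PER_S = 2.0e12  # A100 HBM bandwidth per the PR's spec update

def calculate_mbu_sketch(bytes_moved: float, elapsed_s: float,
                         num_engines: int, num_gpus_per_engine: int) -> float:
    """Hedged sketch: normalize achieved bandwidth by the peak of ALL
    GPUs across all engines, which is what keeps MBU <= 100% in
    multi-engine setups instead of comparing against a single GPU."""
    peak = num_engines * num_gpus_per_engine * A100_BW_BYTES_PER_S
    mbu = bytes_moved / elapsed_s / peak
    if mbu > 1.0:
        # Mirrors the PR's idea of logging a repro payload on >100%
        # so the calculation can be replayed later.
        logging.warning("MBU > 100%%; repro: %s", json.dumps({
            "bytes_moved": bytes_moved,
            "elapsed_s": elapsed_s,
            "num_engines": num_engines,
            "num_gpus_per_engine": num_gpus_per_engine,
        }))
    return mbu
```

Dividing by the aggregate peak is the key fix: with the old single-GPU denominator, a 2-engine run moving 2 GPUs' worth of bytes would report roughly double the true utilization.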