Inconsistent p.ndim usage in Muon optimizer

There is an inconsistency in how `p.ndim` is used in `optimize.py`. In the `get_optimizer` function, parameters are selected for Muon with the condition `p.ndim >= 2`, but the `Muon` class constructor contains the assertion `assert p.ndim == 2, p.ndim`, which only allows parameters that are exactly 2D.

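As a result, any parameter with more than two dimensions (for example, a 4D convolution weight) that is not excluded by name passes the filter in `get_optimizer` and then fails the assertion in the `Muon` constructor with an `AssertionError`.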
```
# In get_optimizer
muon_params = [
    p
    for name, p in model.named_parameters()
    if p.ndim >= 2 and "classifiers" not in name and "embedding" not in name
]
```
```
# In Muon class constructor
for p in muon_params:
    # Use Muon for every parameter in muon_params which is >= 2D and doesn't look like an embedding or head layer
    assert p.ndim == 2, p.ndim
    self.state[p]["use_muon"] = True
```
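
To make the mismatch concrete, here is a minimal sketch that reproduces it without the repository code. `FakeParam`, `select_muon_params`, and `check_muon_params` are hypothetical stand-ins I wrote for illustration (only `.ndim` matters), not functions from `optimize.py`. It also shows one possible fix, tightening the filter to `p.ndim == 2`, under the assumption that Muon is meant to handle only 2D weights.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class FakeParam:
    """Hypothetical stand-in for a model parameter; only the shape matters here."""
    shape: Tuple[int, ...]

    @property
    def ndim(self) -> int:
        return len(self.shape)


def select_muon_params(
    named_params: Iterable[Tuple[str, FakeParam]], exact_2d: bool
) -> List[FakeParam]:
    """Mimics the filter in get_optimizer.

    exact_2d=False uses the current condition (p.ndim >= 2);
    exact_2d=True uses the stricter condition (p.ndim == 2).
    """
    selected = []
    for name, p in named_params:
        if "classifiers" in name or "embedding" in name:
            continue
        dim_ok = (p.ndim == 2) if exact_2d else (p.ndim >= 2)
        if dim_ok:
            selected.append(p)
    return selected


def check_muon_params(params: Iterable[FakeParam]) -> None:
    """Mimics the assertion in the Muon constructor."""
    for p in params:
        assert p.ndim == 2, p.ndim


named = [
    ("conv.weight", FakeParam((16, 3, 3, 3))),   # 4D, e.g. a convolution kernel
    ("linear.weight", FakeParam((32, 16))),      # 2D
    ("linear.bias", FakeParam((32,))),           # 1D, never selected
    ("embedding.weight", FakeParam((100, 16))),  # excluded by name
]

loose = select_muon_params(named, exact_2d=False)
strict = select_muon_params(named, exact_2d=True)

# The current filter lets the 4D parameter through, so the assertion fails.
try:
    check_muon_params(loose)
    loose_ok = True
except AssertionError:
    loose_ok = False

# With the stricter filter, the assertion passes.
check_muon_params(strict)
```

With the stricter filter, parameters with more than two dimensions would have to be handled by whichever optimizer covers the remaining parameters. The alternative would be to relax the assertion and reshape such parameters to 2D inside Muon before the update; which of the two is intended is exactly what I am unsure about.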

Could you give any suggestions on which behavior is intended?
