The report raises the following question:
For cases where p.ndim > 2, what are m and n? The example code flattens the tensor but still uses the first and second dimension sizes from the original parameter shape to adjust the learning rate, rather than the flattened tensor’s size. Is there a reason for this?
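To make the ambiguity concrete, here is a minimal sketch of the two readings. The scaling rule `sqrt(max(1, m / n))` is used purely for illustration and is an assumption, not necessarily the paper's exact adjustment; the point is only that `n` differs depending on whether it is taken from the original shape or from the flattened 2-D view.

```python
import math

# Hypothetical illustration (not the paper's actual code) of the two ways
# m and n could be chosen when p.ndim > 2. The scaling formula
# sqrt(max(1, m / n)) is an assumed placeholder for the real adjustment.

def adjust_lr_original_dims(shape, lr):
    # Reads m and n from the *original* parameter shape,
    # even though the tensor itself is flattened elsewhere.
    m, n = shape[0], shape[1]
    return lr * math.sqrt(max(1.0, m / n))

def adjust_lr_flattened_dims(shape, lr):
    # Alternative reading: flatten to 2-D first, so n is the
    # product of all trailing dimensions.
    m = shape[0]
    n = math.prod(shape[1:])
    return lr * math.sqrt(max(1.0, m / n))

shape = (64, 3, 3, 3)  # e.g. a conv weight with p.ndim == 4
print(adjust_lr_original_dims(shape, 0.02))   # uses n = 3
print(adjust_lr_flattened_dims(shape, 0.02))  # uses n = 27
```

For a 4-D parameter the two conventions give noticeably different effective learning rates, which is presumably what the question is probing.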