Open
Labels
Priority: Important & not Urgent
code-quality (improving code quality/consistency)
good first issue (Good for newcomers)
Description
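When DDP and gradient accumulation are used together, we should wrap the non-final micro-batches of each accumulation window in the DDP.no_sync() context manager (torch.nn.parallel.DistributedDataParallel.no_sync). This skips the gradient all-reduce on those steps, so gradients are communicated once per optimizer step instead of once per backward pass, which gives an essentially free training speedup.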
https://chatgpt.com/share/e/68efa990-9ee8-800c-99da-b078c3d2ac7b
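
A rough sketch of what the change could look like, not a drop-in patch for this repo. The names `sync_context`, `train_epoch`, `loss_fn`, `batches`, and `accum_steps` are hypothetical; the only real API relied on is `no_sync()`, which `DistributedDataParallel` provides. The helper falls back to a no-op context when the model has no `no_sync()` method, so the same loop also works in single-process runs.

```python
from contextlib import nullcontext


def sync_context(model, should_sync):
    """Return a context that skips gradient all-reduce when should_sync is False.

    Falls back to a no-op context for models without no_sync(), e.g. a plain
    nn.Module that is not wrapped in DistributedDataParallel.
    """
    if should_sync or not hasattr(model, "no_sync"):
        return nullcontext()
    return model.no_sync()


def train_epoch(model, optimizer, loss_fn, batches, accum_steps):
    """Gradient accumulation that only synchronizes at window boundaries."""
    num_batches = len(batches)
    for step, (inputs, targets) in enumerate(batches, start=1):
        # Sync when an accumulation window closes, or on the final batch so
        # that a trailing partial window is still reduced and applied.
        # (Loss scaling for that partial window is left as-is in this sketch.)
        is_boundary = step % accum_steps == 0 or step == num_batches
        with sync_context(model, should_sync=is_boundary):
            # The forward pass must be inside the context as well as backward,
            # otherwise DDP will still synchronize gradients.
            loss = loss_fn(model(inputs), targets) / accum_steps
            loss.backward()
        if is_boundary:
            optimizer.step()
            optimizer.zero_grad()
```

With `accum_steps=4`, three of every four backward passes run without communication, and the fourth all-reduces the accumulated gradients before `optimizer.step()`.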
