Training loss oscillates in training_perception #36

@Ahnhojin1223

Description

Hi, thank you for sharing GoalFlow!

I'm training the perception model with run_goalflow_training_perception.sh,
but the training loss keeps oscillating and does not converge, even after 40 epochs.
(The attached TensorBoard screenshot shows the agent box/class and BEV semantic losses.)

Environment:

  • 4× V100 (32GB, NVIDIA DGX)
  • Batch size: 15 (to fully use VRAM)
  • Epochs: 40
  • Other configs: default

Questions:

  1. Is this oscillating loss behavior expected, or am I missing a training setting (learning rate, warmup, gradient clipping, etc.)?
  2. Why does the provided script use batch_size=2?
    Does GoalFlow assume a specific global batch size or gradient-accumulation setting?
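
For context on question 2, this is how I'm currently reasoning about matching the script's effective batch size. The helper below applies the linear LR scaling rule (lr proportional to effective batch size); the function name and the reference LR are my own placeholders, not values from the GoalFlow repo:

```python
# Hypothetical helper: given the per-GPU batch size used by the provided
# script and my actual per-GPU batch size, compute the effective batch
# size and a linearly scaled learning rate. This reflects my own
# assumption that GoalFlow's default LR was tuned for batch_size=2.

def match_effective_batch(ref_batch, ref_lr, actual_batch, accum_steps=1):
    """Linear scaling rule: lr grows in proportion to effective batch size."""
    effective = actual_batch * accum_steps
    scaled_lr = ref_lr * effective / ref_batch
    return effective, scaled_lr

# Example: script reference batch 2 at a placeholder lr of 1e-4;
# my run uses batch 15 per GPU with no gradient accumulation,
# so the LR would be scaled up by 7.5x under this rule.
eff, lr = match_effective_batch(ref_batch=2, ref_lr=1e-4, actual_batch=15)
```

Is this the right way to think about it, or does the repo expect the default LR regardless of batch size?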

Thanks for any advice!
