Hi, thanks for your great work on D-FINE!
I’m currently using the D-FINE-S model as a baseline and noticed a significant discrepancy in the reported number of parameters:
- Running `tools/benchmark/get_info.py` gives ~10M parameters.
- Running `train.py --test-only` (which prints model info during initialization) reports ~2M parameters.
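For what it's worth, a way to cross-check both numbers independently of either script is to count the parameters directly on the instantiated model. This is just a generic PyTorch sketch, not D-FINE's own code; `nn.Linear` below is a placeholder for wherever the D-FINE-S model would actually be constructed:

```python
import torch.nn as nn

def count_parameters(model: nn.Module, trainable_only: bool = False) -> int:
    """Count parameters; optionally restrict to those with requires_grad=True."""
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )

# Placeholder module; in practice, build D-FINE-S here and compare the two counts.
model = nn.Linear(10, 5)
print(count_parameters(model))                      # 10*5 weights + 5 biases = 55
print(count_parameters(model, trainable_only=True)) # same here, since nothing is frozen
```

If the total and trainable-only counts differ on the real model, that could hint at one script counting only a submodule or only unfrozen parameters.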
I’m a bit confused about this large difference. Could you kindly clarify:
1. What might be causing this discrepancy?
2. Which value should be considered the "true" parameter count when reporting results (e.g., in a research paper)?
Any insight would be greatly appreciated! Thank you in advance for your time and support.