Thank you for your comments! The current version of our code, which we internally refer to as v1, is a previous iteration of our codebase. We will be releasing version v2 soon, featuring better performance and new models such as Equiformer-LSRM. Please stay tuned!
I attempted to run the LSRM model using the exact configuration provided in the README:
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port=1230 run_ddp.py \
    --model=Visnorm_shared_LSRMNorm2_2branchSerial \
    --molecule AT_AT_CG_CG \
    --group_builder rdkit \
    --num_interactions=4 --long_num_layers=2 \
    --learning_rate=0.0004 --rho_tradeoff 0.001 \
    --dropout=0 --hidden_channels 128 \
    --gradient_clip \
    --calculate_meanstd --otfcutoff 4 \
    --short_cutoff_upper 4 --long_cutoff_lower 0 --long_cutoff_upper 9 \
    --early_stop --early_stop_patience 500 \
    --no_broadcast --batch_size 16 \
    --ema_decay 0.999 --dropout 0.1
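One detail of this command that may be worth checking: `--dropout` is passed twice, first as `--dropout=0` and later as `--dropout 0.1`. If `run_ddp.py` parses its arguments with Python's standard `argparse` and the default store action (an assumption, not verified against the repository), the last occurrence wins, so the run would use a dropout of 0.1. A minimal sketch with a hypothetical stand-in parser:

```python
import argparse


def parse_dropout(argv):
    """Parse only --dropout, mimicking a plain argparse 'store' option.

    Hypothetical stand-in: the real run_ddp.py parser may be defined differently.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--dropout", type=float, default=0.0)
    return parser.parse_args(argv).dropout


# The flag appears twice, in the same order as in the command above.
effective = parse_dropout(["--dropout=0", "--dropout", "0.1"])
print(f"effective dropout: {effective}")
```

If the intended setting is a dropout of 0, removing the trailing `--dropout 0.1` would make the command unambiguous.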
The training doesn't seem to converge to the reported performance metrics. According to Table 2 in the paper, for the AT-AT-CG-CG molecule (diameter ~24Å, 118 atoms), the expected results are:
Energy prediction (MAE in kcal/mol):
ViSNet-LSRM: 0.1135
ViSNet: 0.1995
Force prediction (MAE in kcal/mol/Å):
ViSNet-LSRM: 0.1063
ViSNet: 0.1563
However, my runs produce significantly worse results (energy MAE / force MAE, in the same units as above):
ViSNet-LSRM: 0.24141 (energy) / 0.17699 (force)
ViSNet: 0.25187 (energy) / 0.18616 (force)
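To put the gap in numbers, the snippet below divides each reproduced MAE by the corresponding value reported in Table 2. It uses only the figures quoted above; the dictionary and function names are illustrative.

```python
# (energy MAE in kcal/mol, force MAE in kcal/mol/Å) for each model.
reported = {"ViSNet-LSRM": (0.1135, 0.1063), "ViSNet": (0.1995, 0.1563)}
reproduced = {"ViSNet-LSRM": (0.24141, 0.17699), "ViSNet": (0.25187, 0.18616)}


def relative_gap(model):
    """Return (energy, force) ratios of reproduced MAE to reported MAE."""
    return tuple(mine / ref for mine, ref in zip(reproduced[model], reported[model]))


for model in reported:
    energy_ratio, force_ratio = relative_gap(model)
    print(f"{model}: energy x{energy_ratio:.2f}, force x{force_ratio:.2f}")
```

The ViSNet-LSRM errors come out at roughly 2.1 times (energy) and 1.7 times (force) the reported values, while the ViSNet baseline is off by roughly 1.3 times and 1.2 times, so the discrepancy is larger for the LSRM variant than for the baseline.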
Any guidance or clarification would be greatly appreciated. Thank you!