Unable to reproduce results for spacetimeformer #93

Open
TayyabaZainab0807 opened this issue Feb 15, 2024 · 3 comments

@TayyabaZainab0807

I am using the following command, which is provided by this repo (the only difference is the batch_size):

python train.py spacetimeformer solar_energy --context_points 168 --target_points 24 --d_model 100 --d_ff 400 --enc_layers 5 --dec_layers 5 --l2_coeff 1e-3 --dropout_ff .2 --dropout_emb .1 --d_qk 20 --d_v 20 --n_heads 6 --run_name spatiotemporal_al_solar --batch_size 3 --class_loss_imp 0 --initial_downsample_convs 1 --decay_factor .8 --warmup_steps 1000

I am getting the following results (whereas I am expecting MSE ~7.75):
test/acc -1.0
test/class_loss 0.0
test/forecast_loss 0.08704246580600739
test/loss 0.08704246580600739
test/mae 1.7290080221756612
test/mape 21375719.51865129
test/mse 9.604532779042728
test/norm_mae 0.1794128092716004
test/norm_mse 0.0870416207133817
test/recon_loss -1.0
test/smape 1.4066449396255207

@jakegrigsby
Member

There is a big difference between a batch size of 32 and a batch size of 3!

This is probably the most memory-intensive dataset in the paper. It should be possible to get closer to 7.75 using a larger batch size but a smaller model with more of the memory-saving tricks.
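
As a rough sketch of that suggestion (the reduced --d_model, --d_ff, --d_qk/--d_v, layer counts, and --n_heads below are illustrative guesses, not settings from the paper), something along these lines trades model capacity for a batch size of 32 using only flags that already appear in the commands above:

python train.py spacetimeformer solar_energy --context_points 168 --target_points 24 --d_model 64 --d_ff 256 --enc_layers 3 --dec_layers 3 --l2_coeff 1e-3 --dropout_ff .2 --dropout_emb .1 --d_qk 16 --d_v 16 --n_heads 4 --run_name spatiotemporal_al_solar_small --batch_size 32 --class_loss_imp 0 --initial_downsample_convs 1 --decay_factor .8 --warmup_steps 1000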

@TayyabaZainab0807
Author

Thanks, that works for the spacetimeformer model.
I am not able to find the right commands to replicate the other results for lstm, lstnet, etc. Are you able to provide those as well?
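
For what it's worth, train.py takes the model name as its first positional argument, so the baselines presumably launch the same way; a minimal sketch (the exact model names and any model-specific flags are assumptions to verify against train.py's --help):

python train.py lstm solar_energy --context_points 168 --target_points 24 --batch_size 32 --run_name lstm_solar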

@HariniS2506

I used the same command with the GPUs set:

python train.py spacetimeformer solar_energy --context_points 168 --target_points 24 --d_model 100 --d_ff 400 --enc_layers 5 --dec_layers 5 --l2_coeff 1e-3 --dropout_ff .2 --dropout_emb .1 --d_qk 20 --d_v 20 --n_heads 6 --run_name spatiotemporal_al_solar --batch_size 32 --class_loss_imp 0 --initial_downsample_convs 1 --decay_factor .8 --warmup_steps 1000 --gpus 0 1

and I still see the MSE at ~9. Were there any other changes you had to make?

test/acc : -1.0
test/class_loss : 0.0
test/forecast_loss : 0.08345500379800797
test/loss : 0.08345500379800797
test/mae : 1.7061005777056486
test/mape : 20300980.869295612
test/mse : 9.204660098343183
test/norm_mae : 0.17708396412653812
test/norm_mse : 0.08345417754154151
test/recon_loss : -1.0
test/smape : 1.403714211316733
