
How to train using main.py using multiple GPUs? #10

Open
alvations opened this issue May 28, 2019 · 3 comments

Comments

@alvations

@yikangshen @shawntan Is there an easy way to train the model with main.py on multiple GPUs to replicate the experiments?

When using model = nn.DataParallel(model) before train(), initialization descends into the LSTM stack and then into the ONLSTM cell to return the weights, but it throws an error.

We also tried applying model = nn.DataParallel(model) after hidden = model.init_hidden(args.batch_size), and it seems the LinearDropConnect layer can't access its .weight tensors (both attempts are sketched below).
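
For reference, a minimal sketch of the two attempts, reconstructed from the description above (`model`, `train()`, and `args` come from main.py; the `nn.DataParallel` lines are the only additions):

```python
import torch.nn as nn

# Attempt 1: wrap before calling train(); initialization then descends
# into the ON-LSTM stack and errors out while collecting the weights.
model = nn.DataParallel(model)

# Attempt 2: initialize the hidden state first, then wrap; after this
# LinearDropConnect can no longer see its .weight tensors.
hidden = model.init_hidden(args.batch_size)
model = nn.DataParallel(model)
```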

@yikangshen
Owner

We didn't try training the model on multiple GPUs. You may need to rewrite the code for the LinearDropConnect function.
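
For anyone attempting this, here is a minimal sketch of what such a rewrite could look like, assuming the failure comes from the DropConnect mask being cached on the module outside forward() (state that nn.DataParallel replicas don't carry). This is an assumption about the failure mode, not the repo's official fix:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearDropConnectDP(nn.Linear):
    """Hypothetical DataParallel-friendly DropConnect linear layer:
    the weight mask is sampled inside forward(), so each replica
    builds its own mask instead of reading state cached before the
    model was wrapped."""

    def __init__(self, in_features, out_features, bias=True, dropout=0.5):
        super().__init__(in_features, out_features, bias)
        self.dropout = dropout

    def forward(self, input):
        if self.training and self.dropout > 0.:
            # Sample a fresh drop mask per call. Note: the original layer
            # samples one mask per batch, so this changes the sampling
            # granularity.
            mask = torch.bernoulli(
                torch.full_like(self.weight, self.dropout)).bool()
            weight = self.weight.masked_fill(mask, 0.)
        else:
            # At eval time, rescale by the keep probability instead.
            weight = self.weight * (1. - self.dropout)
        return F.linear(input, weight, self.bias)
```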

@BuaaAlban

Another question: there seems to be no speed-up on GPU compared with CPU. Have you met the same problem?
Both take 280–290 s per epoch.
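
Not an answer, but a quick sanity check worth running before timing epochs, in case the model or data never actually reach the GPU (a generic snippet; `model` is the one built in main.py):

```python
import torch

print(torch.cuda.is_available())          # is CUDA visible at all?
print(next(model.parameters()).is_cuda)   # did the parameters move to the GPU?

# CUDA kernels launch asynchronously, so synchronize before reading the
# clock, otherwise the measured epoch time can be misleading.
torch.cuda.synchronize()
```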

@Shiweiliuiiiiiii

> @yikangshen @shawntan Is there an easy way to train the model with main.py on multiple GPUs to replicate the experiments?
>
> When using model = nn.DataParallel(model) before train(), initialization descends into the LSTM stack and then into the ONLSTM cell to return the weights, but it throws an error.
>
> We also tried applying model = nn.DataParallel(model) after hidden = model.init_hidden(args.batch_size), and it seems the LinearDropConnect layer can't access its .weight tensors.

Hi,
Just wanted to ask: have you figured this out?
Best
