Failed to launch: Invalid wckey specification #1758

Open
rskwesterman opened this issue Dec 19, 2023 · 1 comment

Comments

@rskwesterman

I am trying to train the DINO model using run_with_submitit.py from https://github.com/facebookresearch/dino

But I run into the following error:
sbatch: error: Batch job submission failed: Invalid wckey specification
subprocess.CalledProcessError: Command '['sbatch', '/mydir/checkpoint/experiments/submission_file_611ca66d3a6a43f69bab82264c3d6afc.sh']' returned non-zero exit status 1.
[...]
submitit.core.utils.FailedJobError: sbatch: error: Batch job submission failed: Invalid wckey specification

However, removing the wckey directive from the sbatch file and submitting it manually with sbatch results in a different error:
srun: error: task 0 launch failed: Slurmd could not connect IO
srun: error: task 1 launch failed: Slurmd could not connect IO

Any insights or solutions would be greatly appreciated.
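
For anyone who wants to reproduce the manual step described above (dropping the wckey directive from the generated submission file before calling sbatch by hand), here is a minimal sketch in plain Python. It assumes the directive appears as a line of the form `#SBATCH --wckey=...`; the helper name `strip_wckey` and the example script contents are hypothetical and are not taken from submitit or from the actual generated file. Note that in the report above this step only changed the error (to "Slurmd could not connect IO"), so it is a way to repeat the experiment, not a confirmed fix.

```python
import re

# Matches an sbatch directive line that sets a wckey,
# e.g. "#SBATCH --wckey=submitit" (assumed form of the directive).
WCKEY_DIRECTIVE = re.compile(r"^\s*#SBATCH\s+--wckey(=|\s|$)")


def strip_wckey(script_text: str) -> str:
    """Return the script text with every '#SBATCH --wckey...' line removed."""
    kept = [
        line
        for line in script_text.splitlines(keepends=True)
        if not WCKEY_DIRECTIVE.match(line)
    ]
    return "".join(kept)


# Hypothetical submission script, for illustration only.
example = (
    "#!/bin/bash\n"
    "#SBATCH --job-name=dino\n"
    "#SBATCH --wckey=submitit\n"
    "#SBATCH --nodes=1\n"
    "srun echo hello\n"
)

cleaned = strip_wckey(example)
print(cleaned)
```

To apply it to a real file, one could read the submission script, pass its text through `strip_wckey`, write the result to a new file, and submit that file with sbatch.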

@yinkaaiwu

Same problem here, and setting slurm_wckey='' didn't work.
