I am using a SLURMCluster object to run some simple Python functions in parallel on an HPC cluster. When I run the script by manually passing each parameter to the SLURMCluster object, the jobs are submitted, connect, run, and return properly. However, when I move those parameters to a dask.yaml file (in ~/.config/dask/dask.yaml), the jobs are submitted but never connect, finish, or return; instead they hang until I kill the running Python process and cancel the subsequently submitted jobs. Both ways yield the same job script with identical options specified.
What could be causing this?
Below are copies of my dask.yaml file, as well as the SLURMCluster object with the parameters I use when I specify everything manually:
from dask_jobqueue import SLURMCluster

CORES = 2

#### This works
cluster = SLURMCluster(name='worker_bee',
                       queue='normal',
                       project='TG-EAR180014',
                       processes=1,
                       cores=CORES,
                       memory='2GB',
                       interface='ib0',
                       header_skip=['--mem', '--cpus-per-task='],
                       job_extra=['-N {}'.format(CORES)])

#### This doesn't work
cluster = SLURMCluster()
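As a quick check of the claim that both constructions yield the same job script, the generated submission script can be printed for each case. This is a minimal sketch; the manual_cluster/config_cluster names are only illustrative, and the keyword values mirror the ones above:

from dask_jobqueue import SLURMCluster

# Job script built from explicitly passed parameters
manual_cluster = SLURMCluster(name='worker_bee', queue='normal', project='TG-EAR180014',
                              processes=1, cores=2, memory='2GB', interface='ib0',
                              header_skip=['--mem', '--cpus-per-task='],
                              job_extra=['-N 2'])
print(manual_cluster.job_script())

# Job script built from the values in ~/.config/dask/dask.yaml
config_cluster = SLURMCluster()
print(config_cluster.job_script())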
dask.yaml
jobqueue:
  slurm:
    name: worker-bee
    project: TG-EAR180014
    queue: normal
    cores: 2
    memory: 2GB
    processes: 1
    interface: ib0
    death-timeout: 60        # Number of seconds to wait if a worker can not find a scheduler
    local-directory: null    # Location of fast local storage like /scratch or $TMPDIR

    # SLURM resource manager options
    shebang: "#!/usr/bin/env bash"
    walltime: '00:30'
    extra: []
    env-extra: []
    ncpus: null
    header-skip: ['--mem', '--cpus-per-task=']
    job-extra: ['-N 2']
    log-directory: null
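A related sketch, assuming only that the file above sits in ~/.config/dask/, is to ask dask which values it actually loaded before constructing the cluster; dask.config.get returns the merged configuration that SLURMCluster() falls back on:

import dask
import dask.config

# The whole jobqueue.slurm section as dask sees it
print(dask.config.get('jobqueue.slurm'))

# Individual keys can be inspected the same way, e.g. the network interface
print(dask.config.get('jobqueue.slurm.interface'))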