I am trying to run a multi-node, multi-host SSH cluster on Windows. To simplify things for now, I am running both the scheduler and the workers on localhost. Following the Dask documentation, I set up public-key SSH access, in this case from localhost to localhost. I encountered this issue and resolved it with the fix recommended in the same link. I then hit the next issue: the command being run exceeds the character limit imposed by Windows.
The above line from "distributed\deploy\ssh.py" generates a string of 9000+ characters, which seems to be the problem.
The next line of code builds the command "cmd", and the following line starts the process: self.proc = await self.connection.create_process(cmd)
The line below then reads this error, 'The command line is too long.\r\n': line = await self.proc.stderr.readline()
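To make the size problem concrete, here is a minimal stdlib-only sketch that checks a command string against the 8191-character limit documented for cmd.exe. It assumes the config is shipped as URL-safe base64 of its JSON text, which is how I understand dask.config.serialize to work. The helper names, the padded config, and the exact command layout are illustrative assumptions, not the code in ssh.py.

```python
import base64
import json

# Maximum command-line length documented for cmd.exe.
CMD_EXE_MAX_CHARS = 8191


def serialize_config(config: dict) -> str:
    """Stand-in for dask.config.serialize (assumption): base64 of the JSON text."""
    return base64.urlsafe_b64encode(json.dumps(config).encode()).decode()


def fits_cmd_exe(command: str, limit: int = CMD_EXE_MAX_CHARS) -> bool:
    """Return True if the command is short enough for cmd.exe to accept."""
    return len(command) <= limit


# Hypothetical config, padded to roughly the size of a full default config.
fake_config = {"section%d" % i: {"option": "x" * 60} for i in range(100)}
payload = serialize_config(fake_config)

# Illustrative command layout: the serialized config is passed via an
# environment variable in front of the actual Python invocation.
command = (
    "set DASK_INTERNAL_INHERIT_CONFIG=%s && python -m distributed.cli.dask_spec"
    % payload
)

print(len(command), fits_cmd_exe(command))
```

Note that base64 inflates the JSON text by about a third, so a config that looks comfortably small as JSON can still push the command over the limit.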
To reduce the size of the serialized config, I tried removing the Kubernetes key from dask.config.global_config and re-adding it with an empty dict as its value, since I am using SSHCluster rather than KubeCluster and should not need the Kubernetes settings. With that change the serialized config is under the limit, and sure enough I get past the 'The command line is too long' error, but I then get stuck on the error below instead:
2023-08-28 21:10:06,883 - distributed.deploy.ssh - INFO - raise JSONDecodeError("Expecting value", s, err.value) from None
2023-08-28 21:10:06,883 - distributed.deploy.ssh - INFO - json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
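The config-trimming step described above can be sketched on a plain dict as follows. This is only an illustration: blank_sections is a hypothetical helper, and the nested config is made up rather than taken from the real dask.config.global_config.

```python
import copy
import json


def blank_sections(config: dict, unused: list) -> dict:
    """Return a deep copy of config with each unused top-level section set to {}."""
    trimmed = copy.deepcopy(config)
    for key in unused:
        if key in trimmed:
            trimmed[key] = {}
    return trimmed


# Hypothetical config: a small section that is needed and a large one that is not.
config = {
    "distributed": {"scheduler": {"work-stealing": True}},
    "kubernetes": {"worker-template": {"spec": ["x" * 50] * 100}},
}

trimmed = blank_sections(config, ["kubernetes"])
print(len(json.dumps(config)), "->", len(json.dumps(trimmed)))
```

Keeping the key with an empty dict, rather than deleting it, mirrors what was tried in the issue and leaves the original config untouched.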
I am on Windows right now and am considering installing a Linux VM to try this out. Has anyone else had this issue on Windows, and what can be done to work around it?
I encountered the same issue. I found a fix for the JSON issue and also a way to shorten the command somewhat. With the changes below (the 3 commented lines), I no longer see the issue:
cmd = " ".join(
    [
        # set_env,  -> removed to shorten cmd; it is now run before cmd to preserve functionality
        self.remote_python,
        "-m",
        "distributed.cli.dask_spec",
        "--spec",
        # swapped the ' and " at the start of the format string to fix the JSON issue
        '"%s"' % dumps({"cls": "distributed.Scheduler", "opts": self.kwargs}).replace('"', '\\"'),
    ]
)
await self.connection.run(set_env)  # added because set_env was removed above
self.proc = await self.connection.create_process(cmd)
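For anyone wondering why swapping the quotes helps: as far as I understand, Windows does not treat single quotes as quoting characters, so a single-quoted JSON spec gets split on spaces and the first argument starts with a literal '. The sketch below is stdlib-only. It reproduces the JSONDecodeError on such a fragment and checks the double-quoted form against subprocess.list2cmdline, an undocumented helper that applies the Microsoft C runtime quoting rules. The broken_argument value is my assumption of what the remote process receives, and quote_for_windows is a hypothetical name for the one-line change above.

```python
import json
import subprocess


def quote_for_windows(text: str) -> str:
    """Wrap text in double quotes and backslash-escape the inner double quotes."""
    return '"%s"' % text.replace('"', '\\"')


# Empty opts keep the example small; the real spec carries the scheduler kwargs.
spec = json.dumps({"cls": "distributed.Scheduler", "opts": {}})
quoted = quote_for_windows(spec)

# Assumed first argument after Windows splits a single-quoted spec on spaces:
# the leading ' survives and the inner double quotes are consumed.
broken_argument = "'{cls:"
try:
    json.loads(broken_argument)
    error_message = ""
except json.JSONDecodeError as exc:
    error_message = str(exc)

print(quoted)
print(error_message)
```

One caveat: the manual replace only handles double quotes. If the kwargs ever contain backslashes directly before a quote, the escaping rules get more involved, and that case is not covered here.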