"ConnectTimeout" error on HPC cluster. #376
Comments
We do observe sporadic errors in our CI similar to what you're seeing, but we haven't been able to pin them down yet. Any more information people can send to help us reproduce these issues would be greatly appreciated.
@jacob-votava if you end up working toward a solution, could you post an update here? I vaguely recall that setting memory limits for various steps helps. Accumulating these lessons in public can be really helpful for other users (including future you and future me).
Sorry for the long response @mattwthompson. I got it to (temporarily) work by limiting the memory and by explicitly selecting an open port on my university's cluster. (I think I mentioned that in the Slack channel.) However, I just tried it on a different system to see if it would work, and the error persists even when all I do is change the SMILES strings. I'm unsure what is causing it. Many of the issues seem to involve the redis/https components, but I'm not really sure how any of that works. Thank you for the responses, and I apologize that I couldn't be more helpful.
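For anyone else trying the "explicitly select an open port" workaround, here is a minimal sketch of how a free port could be found on a compute node using only the Python standard library. The helper names `find_free_port` and `port_is_open` are hypothetical and are not part of this project's API; how the chosen port is then passed to the services is not shown, since that depends on your setup.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a TCP port that is currently unused on `host`.

    Hypothetical helper: binding to port 0 lets the OS pick a free port.
    The port is released on return, so another process could still
    claim it before you use it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    port = find_free_port()
    print(f"Free port chosen by the OS: {port}")
    print(f"Anything listening there yet? {port_is_open('127.0.0.1', port)}")
```

Running a check like `port_is_open` from inside the SLURM job, against the host and port the services are expected to use, may help tell a blocked or unbound port apart from a service that is simply slow to start.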
Hi!
I'm currently trying to use the Python API to parameterize a set of fairly large (~150 atom) molecules from their SMILES strings. I'm doing it on my cluster using the following function.
fn.txt
I execute this through a SLURM command using an environment built exactly as suggested in the documentation. When I run this script on a set of small molecules (["CCCC","CCCCO"]), the optimization (sometimes*) works. However, when I scale it up to my system of much larger molecules (for example**), my job fails at the qcgeneration step with the attached connection errors:
err.txt
I was wondering if anyone in the community has had a similar issue, and if so, how did you fix it? This could be my script, but I was trying to closely follow the tutorials listed in the documentation. Is it maybe an issue with how my cluster is configured?
Thanks in advance!
~Jacob
*When I tried rerunning with these molecules I got the following error:
err2.txt
**("O=C(OC1=CC=C(OCCCCCCCCN2C=C(CCCCOC3=CC=C(C(OC4=CC=C(OCCCCCCCCCl)C=C4)=O)C=C3)N=N2)C=C1)C5=CC=C(OCCCCC#C)C=C5C" )