PARM does not accept the full set of IUPAC nucleotide codes. For example, the ambiguity codes 'R' (purine) and 'Y' (pyrimidine) are not supported and cause PARM to crash.
It would be preferable if PARM checked that the FASTA input complies with its requirements before starting, rather than crashing unexpectedly partway through a run (about four hours in, in the example below).
I would suggest that PARM convert every letter other than A, T, G, C and N to 'N' and emit a warning. Alternatively, PARM could discard the non-compliant FASTA entries, again with a warning. A third option would be to raise an error up front and prevent PARM from running at all.
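To make the first and third options concrete, here is a minimal sketch of a pre-check that could run on each sequence before prediction. It is not PARM's code: the function name `sanitize_sequence`, its `name` and `strict` parameters, and the decision to upper-case the input first are all assumptions made for illustration, and I have not checked whether PARM accepts lower-case bases.

```python
import re
import warnings

# Assumption: the only letters PARM can encode are A, C, G, T and N.
_UNSUPPORTED = re.compile(r"[^ACGTN]")


def sanitize_sequence(seq, name="<unnamed>", strict=False):
    """Hypothetical helper: replace unsupported letters with 'N'.

    With strict=True, raise ValueError instead of replacing anything.
    """
    # Assumption: soft-masked (lower-case) bases are upper-cased, not replaced.
    seq = seq.upper()
    cleaned, n_replaced = _UNSUPPORTED.subn("N", seq)
    if n_replaced and strict:
        raise ValueError(
            f"{name}: {n_replaced} character(s) outside A/C/G/T/N"
        )
    if n_replaced:
        warnings.warn(
            f"{name}: replaced {n_replaced} unsupported character(s) with 'N'"
        )
    return cleaned
```

Running a check like this over all FASTA entries before the prediction loop would surface the problem in seconds instead of hours. The second option (discarding entries) could reuse the same regular expression and skip any sequence for which it matches.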
Example error:
20%|█████▌ | 1026933/5122413 [4:03:00<14:14:37, 79.87it/s]Traceback (most recent call last):
File "/foo/bin/parm", line 9, in <module>
sys.exit(main())
File "/foo/lib/python3.10/site-packages/PARM/__main__.py", line 70, in main
args.func(args)
File "/foo/lib/python3.10/site-packages/PARM/__main__.py", line 132, in predict
PARM_predict(
File "/foo/lib/python3.10/site-packages/PARM/PARM_predict.py", line 121, in PARM_predict
get_prediction(tmp.sequence.to_list(), model)
File "/foo/lib/python3.10/site-packages/PARM/PARM_predict.py", line 322, in get_prediction
np.float32(sequence_to_onehot(sequence, L_max=len(sequence[0])))
File "/foo/lib/python3.10/site-packages/PARM/PARM_predict.py", line 354, in sequence_to_onehot
x = np.array([letter_to_vector[s] for s in seq])
File "/foo/lib/python3.10/site-packages/PARM/PARM_predict.py", line 354, in <listcomp>
x = np.array([letter_to_vector[s] for s in seq])
KeyError: 'R'
20%|█████▌ | 1026936/5122413 [4:03:00<16:09:09, 70.43it/s]