Currently, in src/dataset.py, the parsing function and the symmetry mate functions keep only the intersection of water molecules, i.e. the waters that are found when the structure is parsed by both pymol and biotite (in the case where we include symmetry mates).
We should plan to account for the fact that some waters that appear far from the protein may actually be close to another copy of the protein related by symmetry, or may be offset by one or more unit cell lengths. One approach could be to apply the symmetry operations and "correct" the positions of such waters, i.e. replace each water with its symmetry-equivalent image that lies closest to the protein.
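A rough sketch of what such a correction could look like, using only NumPy. This is not the current code in src/dataset.py: the function name `nearest_image_waters`, its arguments, and the convention that `cell` holds the lattice vectors as rows are all assumptions made for illustration. The symmetry operations are assumed to be given as (rotation, translation) pairs in fractional coordinates; how they are obtained from pymol or biotite is left open here.

```python
import itertools

import numpy as np


def nearest_image_waters(water_xyz, protein_xyz, cell, sym_ops=None):
    """Move each water to the symmetry/lattice image closest to the protein.

    water_xyz   : (N, 3) Cartesian water oxygen coordinates.
    protein_xyz : (M, 3) Cartesian protein atom coordinates.
    cell        : (3, 3) matrix whose ROWS are the lattice vectors (assumed convention).
    sym_ops     : list of (R, t) pairs in fractional coordinates; identity only if None.
    Returns (corrected_xyz, min_distance_to_protein).
    """
    water_xyz = np.asarray(water_xyz, dtype=float)
    protein_xyz = np.asarray(protein_xyz, dtype=float)
    cell = np.asarray(cell, dtype=float)
    if sym_ops is None:
        sym_ops = [(np.eye(3), np.zeros(3))]

    inv_cell = np.linalg.inv(cell)
    frac = water_xyz @ inv_cell                      # Cartesian -> fractional
    center = protein_xyz.mean(axis=0) @ inv_cell     # protein centroid, fractional
    # Neighbouring cells to try around the image nearest the centroid.
    shifts = np.array(list(itertools.product((-1, 0, 1), repeat=3)), dtype=float)

    best_xyz = water_xyz.copy()
    best_d = np.full(len(water_xyz), np.inf)
    for rot, trans in sym_ops:
        image = frac @ np.asarray(rot, dtype=float).T + np.asarray(trans, dtype=float)
        # Remove whole-cell offsets so the image sits within half a cell of the centroid.
        image = image - np.round(image - center)
        for shift in shifts:
            cand = (image + shift) @ cell            # fractional -> Cartesian
            # Distance from every candidate water to its nearest protein atom.
            # Builds an (N, M) matrix; a KD-tree would scale better for large inputs.
            diff = cand[:, None, :] - protein_xyz[None, :, :]
            d = np.linalg.norm(diff, axis=-1).min(axis=1)
            closer = d < best_d
            best_xyz[closer] = cand[closer]
            best_d[closer] = d[closer]
    return best_xyz, best_d
```

The returned distances could then feed the existing "far from the protein" filtering, so that a water is only discarded if none of its images lies near the protein. Whether the corrected waters still match between the pymol and biotite parses would need to be checked separately.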