diff --git a/docker-entrypoint.sh b/docker-entrypoint.sh
index 957c351..547e003 100755
--- a/docker-entrypoint.sh
+++ b/docker-entrypoint.sh
@@ -42,7 +42,7 @@ OPTIONS:
 
 ENVIRONMENTS:
  boltz For boltz1 and boltz2 models
- protenix For protenix model
+ protenix For protenix model
  rf3 For RF3 model
 
 EXAMPLES:
diff --git a/pyproject.toml b/pyproject.toml
index 613a784..3a3b2a7 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -182,11 +182,6 @@ include = ["src/sampleworks/eval/bond_angle_and_length_outlier_eval_script.py"]
 possibly-missing-attribute = "ignore"
 
 [tool.ty.rules]
-# Pre-existing type issues across the codebase; warn instead of error
-# so ty runs in CI without blocking PRs while the team fixes them.
-unresolved-import = "ignore"
-unknown-argument = "warn"
-unresolved-attribute = "warn"
 invalid-argument-type = "warn"
 invalid-assignment = "warn"
 invalid-method-override = "warn"
@@ -195,6 +190,11 @@ no-matching-overload = "warn"
 not-iterable = "warn"
 not-subscriptable = "warn"
 too-many-positional-arguments = "warn"
+unknown-argument = "warn"
+unresolved-attribute = "warn"
+# Pre-existing type issues across the codebase; warn instead of error
+# so ty runs in CI without blocking PRs while the team fixes them.
+unresolved-import = "ignore"
 unsupported-operator = "warn"
 unused-ignore-comment = "warn"
 unused-type-ignore-comment = "warn"
diff --git a/scripts/eval/EVALUATION.md b/scripts/eval/EVALUATION.md
index c129c26..8fe3c5f 100644
--- a/scripts/eval/EVALUATION.md
+++ b/scripts/eval/EVALUATION.md
@@ -9,33 +9,33 @@ where you run SampleWorks. The script scripts/eval/run_and_process_tortoize.py w
 `tortoize` executable before running and will raise an error if it is not available.
 
 ## phenix
-Information about the phenix package can be found at https://phenix-online.org/. Phenix requires a
+Information about the phenix package can be found at https://phenix-online.org/. Phenix requires a
 license which is free to academic users. Others may have to pay a fee.
 
 Sampleworks makes use of the phenix.clashscore command and `run_and_process_phenix_clashscore.py`
 will check for it before running, raising an error if it is not available.
 
 # Running the evaluations
 ## Preparing the output CIF files
-As of this writing, Sampleworks outputs CIF files that primarily contain the output atomic
-coordinates, and not the additional information that many programs, like `tortoize` and
-`phenix.clashscore`, require. Furthermore, many protein structure predictors effectively
-renumber residues. Since our metrics are frequently calculated by comparing selections of atoms or
-residues, we must align to the original _sequence_ of the protein as well. Future versions of
+As of this writing, Sampleworks outputs CIF files that primarily contain the output atomic
+coordinates, and not the additional information that many programs, like `tortoize` and
+`phenix.clashscore`, require. Furthermore, many protein structure predictors effectively
+renumber residues. Since our metrics are frequently calculated by comparing selections of atoms or
+residues, we must align to the original _sequence_ of the protein as well. Future versions of
 Sampleworks will handle these issues automatically. For now, you should run the script
-`scripts/patch_output_cif_files.py`. This will use the original PDB inputs to reconstruct proper
+`scripts/patch_output_cif_files.py`. This will use the original PDB inputs to reconstruct proper
 output CIF files that are numbered correctly and have all necessary metadata to reconstruct the
 protein structure correctly.
 
 You can run the following command, which assumes:
-- your sampleworks output is stored in `/home/ubuntu/grid_search_results`,
+- your sampleworks output is stored in `/home/ubuntu/grid_search_results`,
 - the output is organized by RCSB PDB ID in directories like `/home/ubuntu/grid_search_results/1VME/...`,
  see the `--rcsb-pattern` argument which is a regex to match the RCSB PDB ID
 - the input PDB cif files are stored in `/home/ubuntu/grid_search_inputs` as required for running the
  the grid search (see GRID_SEARCH.md)
 - the input PDB cif files are stored in `/home/ubuntu/grid_search_inputs` as required for running the
- the grid search (see GRID_SEARCH.md). The files will have paths like, e.g.,
- `/home/ubuntu/grid_search_inputs/1VME/1VME_original.cif`. See also the `--input-pdb-pattern`
- argument, which is a python format string which must use the `pdb_id` variable to refer to the
+ the grid search (see GRID_SEARCH.md). The files will have paths like, e.g.,
+ `/home/ubuntu/grid_search_inputs/1VME/1VME_original.cif`. See also the `--input-pdb-pattern`
+ argument, which is a python format string which must use the `pdb_id` variable to refer to the
  RCSB PDB ID.
 
 ```shell
@@ -53,7 +53,7 @@ argument. It will output a patched CIF files named `refined-patched.cif` along e
 file. These `refined-patched.cif` files can be used as input to the remaining evaluation scripts.
 
 ## Running the scripts
-The evaluation scripts have a common interface defined by the method
+The evaluation scripts have a common interface defined by the method
 `sampleworks.eval.grid_search_eval_utils.parse_eval_args`. The general form of these commands is:
 
 ```shell
@@ -66,7 +66,7 @@ pixi run -e analysis python scripts/eval/