Restarting the searches #105

Closed
llmlrnr opened this issue Jun 7, 2024 · 8 comments


@llmlrnr

llmlrnr commented Jun 7, 2024

Hello,

Many thanks for the great software and the detailed project description!

I am using xiSEARCH on a Linux cluster and run out of available computing time when performing very complex searches. Is there a way to restart a search that was cancelled at some point, or is the only way to make it work to restrict the search space as much as possible?

@lutzfischer
Member

Not directly. The best I could suggest is to filter the peaklist so that it only contains spectra without a result. xiSEARCH even comes with a tool for that:

java -cp /path/to/xiSearch.jar rappsilber.gui.localapplication.ScanFilter

You could load the search result into the third tab (ScanFilter) and select "Exclude Selected". On the first tab, give your original mgf file and a path to a new one, then press run.
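For readers who prefer a script, the same idea (dropping spectra that already have a result) can be sketched in plain Python. This is not the ScanFilter tool itself. The function name `filter_mgf_text` is hypothetical, and it assumes spectra can be matched by their `TITLE=` line; how to collect the matched titles from a xiSEARCH result file is not covered here.

```python
def filter_mgf_text(text, matched_titles):
    """Return MGF text without the spectra whose TITLE is in matched_titles.

    Assumption: each spectrum sits between BEGIN IONS and END IONS and
    carries a TITLE= line. Anything outside those blocks is dropped.
    """
    kept = []
    block = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "BEGIN IONS":
            block = [stripped]
        elif block is not None:
            if stripped:
                block.append(stripped)
            if stripped == "END IONS":
                # Look up the TITLE of the spectrum that just ended.
                title = next(
                    (l.split("=", 1)[1] for l in block if l.startswith("TITLE=")),
                    None,
                )
                if title not in matched_titles:
                    kept.append("\n".join(block))
                block = None
    return "\n\n".join(kept) + ("\n" if kept else "")
```

Reading the original mgf file into a string, passing it through this function together with a set of already-identified titles, and writing the result to a new file would give a peaklist for the follow-up search.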

@lutzfischer
Member

How many variable modifications are you searching?

@llmlrnr llmlrnr closed this as completed Jun 7, 2024
@llmlrnr
Author

llmlrnr commented Jun 7, 2024

Thanks a lot for the suggestion! I had cysteine-carbamidomethylation and methionine-oxidation as two variable modifications but it definitely makes sense to set Cys-carbamidomethylation as a fixed modification!

I think the searches also take too long because I include the NonCovalentBound crosslinker along with my desired crosslinker, but that is the only way the search can be performed, right?

@lutzfischer
Member

Unless your crosslinker reacts with cysteine, you should probably set cysteine-carbamidomethylation as fixed. Even then, other ways would be better, like having an additional crosslinker that is lighter than the actual crosslinker by the mass of carbamidomethylation and that links to carbamidomethylated cysteine.

Long run times could also come from a lack of memory. If you can, try to give it more memory.

@grandrea
Contributor

grandrea commented Jun 7, 2024 via email

@llmlrnr
Author

llmlrnr commented Oct 11, 2024

Hello again,

Is there a way to perform peaklist splitting before starting the search?
So instead of filtering the peaklist after the initial searches, I would like to create 2-3 peak files out of one, search them in parallel, and combine the results afterwards. Are you aware of any tools that can do that?

@grandrea
Contributor

grandrea commented Oct 11, 2024

Hi,
Yes, you can use msconvert for this (subset by time or scan event), or simply write a script using the pyteomics Python package to split the mgf files (chatgpt4 knows the pyteomics syntax quite well). You can then use the scripts in hpc_scripts in this repo to run in parallel on a cluster (or whatever script you like, of course!) and then combine_searches.py or the like to recombine the results.
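As a rough illustration of the splitting step, here is a minimal sketch in plain Python. It deliberately avoids pyteomics so that it runs with the standard library only. The function names and the `.partK.mgf` output naming are hypothetical, and it assumes every spectrum sits between `BEGIN IONS` and `END IONS`; any lines outside those blocks are dropped.

```python
def read_mgf_blocks(lines):
    """Yield each spectrum as a list of lines, from BEGIN IONS to END IONS."""
    block = None
    for line in lines:
        stripped = line.strip()
        if stripped == "BEGIN IONS":
            block = [stripped]
        elif stripped == "END IONS" and block is not None:
            block.append(stripped)
            yield block
            block = None
        elif block is not None and stripped:
            block.append(stripped)


def split_mgf_text(text, n_parts):
    """Distribute spectra round-robin over n_parts MGF strings."""
    parts = [[] for _ in range(n_parts)]
    for index, block in enumerate(read_mgf_blocks(text.splitlines())):
        parts[index % n_parts].append("\n".join(block))
    return ["\n\n".join(p) + ("\n" if p else "") for p in parts]


def split_mgf_file(in_path, n_parts):
    """Write the spectra of in_path to <in_path>.partK.mgf; return the paths."""
    with open(in_path) as handle:
        chunks = split_mgf_text(handle.read(), n_parts)
    out_paths = []
    for k, chunk in enumerate(chunks, start=1):
        out_path = f"{in_path}.part{k}.mgf"  # hypothetical naming scheme
        with open(out_path, "w") as out:
            out.write(chunk)
        out_paths.append(out_path)
    return out_paths
```

Round-robin assignment is one possible choice: each part then samples the whole run rather than one block of scans, which may keep the parallel searches closer in run time. Splitting into contiguous chunks would work just as well for recombining the results.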

@llmlrnr
Author

llmlrnr commented Oct 11, 2024

I guess this is exactly what I needed, thank you!!
