This issue was moved to a discussion.

using mgf files created by ProteomeDiscoverer 2.3 #49

Closed
RicZen opened this issue Dec 12, 2019 · 3 comments

RicZen commented Dec 12, 2019

Dear XiSearch team,
I usually like to create mgf files with the PD-specific node because I can easily filter by S/N threshold and it is considerably faster than msconvert. Furthermore, if I have many fractions I can merge them directly into a single mgf file, which is easier to send and store.
The problem is that the resulting mgf files look, for example, like this:
BEGIN IONS
TITLE=35319
PEPMASS=678.67432 554741.87500
CHARGE=3+
RTINSECONDS=3213
SCANS=15601

The TITLE is a running number from 1 to the last spectrum in the mgf file, while SCANS holds the original scan number from the raw file.

Would it be possible to use this type of mgf file in xiSearch?
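
For readers who want to check which value sits in which field, here is a minimal Python sketch that reads the header lines of each spectrum block. It is not part of xiSEARCH or ProteomeDiscoverer; the function name `parse_mgf_headers` is hypothetical, and the sketch assumes plain `KEY=value` header lines as in the excerpt above.

```python
def parse_mgf_headers(text):
    """Return one dict of KEY -> value per BEGIN IONS block (peak lines are skipped)."""
    spectra, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            current = {}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None and "=" in line:
            # Header lines look like KEY=value; peak lines contain no "=".
            key, value = line.split("=", 1)
            current[key] = value
    if current is not None:
        # Block was never closed, e.g. a truncated excerpt like the one above.
        spectra.append(current)
    return spectra


example = """BEGIN IONS
TITLE=35319
PEPMASS=678.67432 554741.87500
CHARGE=3+
RTINSECONDS=3213
SCANS=15601
"""

headers = parse_mgf_headers(example)[0]
print("running index (TITLE):", headers["TITLE"])
print("raw-file scan (SCANS):", headers["SCANS"])
```

In this layout, TITLE carries no run name, only the running index.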

lutzfischer (Member) commented Dec 16, 2019

It should work if you set:

RUN_RE=TITLE=.*
SCAN_RE=TITLE=.*

The scan number will be overwritten by the SCANS entry, but you will lose the information about which run/raw file a spectrum originated from.

RicZen (Author) commented Dec 16, 2019

Should I change the BasicConfig.conf file then?

At the moment the mgf section looks like this:

#########################################
## we need the run name and scan number for a spectrum
## but mgf-files have that info (if at all) in the TITLE line
## and it is not exactly defined how that is stored
## some mgf-files that we have encountered are already recognized for others
## the following to regular expressions can be defined to read out scan number and run
## if both are supplied these will be first tried before the internal automatic will be used
## the scan number and the raw file need to be in the first capturing group
## Example:
## the mgf contains headers like:
## TITLE= Elution from: -1.0 to -1.0 period:  experiment:  cycles:  precIntensity: -1.0 RawFile: myrawfile FinneganScanNumber: 14846
## then the regular expressions should be defined as
## SCAN_RE: .*FinneganScanNumber:\s*([0-9]*)\s*
## RUN_RE: .*RawFile:\s*(.*)FinneganScanNumber:
##
## xiSEARCH comes with a range of know regular mgf-title formats but there are a lot 
## more formats out there. So if you encounter an error that the file is not known try this.
## 
#SCAN_RE:
#RUN_RE:

Should I replace the last 2 lines with:

#RUN_RE=TITLE=.*
#SCAN_RE=TITLE=.*
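
As a side note, the "first capturing group" rule from the config comments can be tried outside xiSEARCH. The sketch below uses Python's `re` module as a stand-in; xiSEARCH itself uses its own regex engine, so this only assumes that these two simple patterns behave the same way there. Whether xiSEARCH matches against the whole line or only the part after `TITLE=` is also an assumption; the leading `.*` makes the patterns work in both cases.

```python
import re

# Patterns copied from the example in the BasicConfig.conf comments.
SCAN_RE = re.compile(r".*FinneganScanNumber:\s*([0-9]*)\s*")
RUN_RE = re.compile(r".*RawFile:\s*(.*)FinneganScanNumber:")

# Example header from the same comments.
title_line = (
    "TITLE= Elution from: -1.0 to -1.0 period:  experiment:  cycles:  "
    "precIntensity: -1.0 RawFile: myrawfile FinneganScanNumber: 14846"
)

# The value of interest is whatever the first capturing group matched.
scan = SCAN_RE.match(title_line).group(1)
run = RUN_RE.match(title_line).group(1)

print(repr(scan))  # the scan number as text
print(repr(run))   # the run name; this pattern also captures the trailing space
```

The run pattern captures the space before `FinneganScanNumber:` as well, so a consumer of this value would probably want to trim it.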

lutzfischer (Member) commented

Almost. Everything in there that starts with a # is a comment and is ignored by xiSEARCH, so you need to remove the # before RUN_RE and SCAN_RE.
Otherwise it does not really matter where you put these lines.

If you care about which file an identification comes from, I would not combine the raw files into a single mgf file. With one mgf file per raw file you can at least reconstruct where the spectra come from based on the name of the mgf file.
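
To illustrate the point about keeping one mgf file per raw file, here is a small Python sketch that pairs each SCANS value with the name of the file it was read from. It is an illustration only, not xiSEARCH code; the function name `scans_by_run` and the file names `fraction_01` and `fraction_02` are hypothetical.

```python
from pathlib import Path
import tempfile


def scans_by_run(mgf_paths):
    """Return (run, scan) pairs, using the mgf file name without extension as the run."""
    pairs = []
    for path in map(Path, mgf_paths):
        for line in path.read_text().splitlines():
            if line.startswith("SCANS="):
                pairs.append((path.stem, int(line.split("=", 1)[1])))
    return pairs


# Demo: two hypothetical fraction files that happen to share a scan number.
with tempfile.TemporaryDirectory() as tmp:
    files = []
    for name in ("fraction_01", "fraction_02"):
        mgf = Path(tmp) / f"{name}.mgf"
        mgf.write_text("BEGIN IONS\nTITLE=1\nSCANS=15601\nEND IONS\n")
        files.append(mgf)
    result = scans_by_run(files)

print(result)
```

The same scan number appears in both files, and only the file name tells the two spectra apart. In a merged file without run information in TITLE, that distinction is gone.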

@Rappsilber-Laboratory Rappsilber-Laboratory locked and limited conversation to collaborators Nov 11, 2022
@lutzfischer lutzfischer converted this issue into discussion #69 Nov 11, 2022
