using mgf files created by ProteomeDiscoverer 2.3 #49

RicZen · 2019-12-12T15:30:36Z

Dear xiSEARCH team,
I usually create mgf files with the PD-specific node because I can easily filter by S/N threshold and it is considerably faster than msconvert. Furthermore, when I run many fractions I can merge them directly into a single mgf file, which is easier to send and store.
The problem is that the resulting mgf files look like this, for example:
BEGIN IONS
TITLE=35319
PEPMASS=678.67432 554741.87500
CHARGE=3+
RTINSECONDS=3213
SCANS=15601

The TITLE is a number that increases from 1 to the last scan in the mgf file, while SCANS refers to the original scan number in the raw file.

Would it be possible to use this type of mgf file in xiSEARCH?
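To make the TITLE/SCANS distinction concrete, here is a minimal Python sketch that reads the header fields of each spectrum block. It is not how xiSEARCH parses mgf files; the function name `read_mgf_headers` is hypothetical, and the `END IONS` line is added only to close the excerpt shown above.

```python
def read_mgf_headers(text):
    """Yield one dict of KEY=VALUE header fields per BEGIN IONS block."""
    headers = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line == "BEGIN IONS":
            headers = {}
        elif line == "END IONS":
            if headers is not None:
                yield headers
            headers = None
        elif headers is not None and "=" in line:
            # Peak lines ("m/z intensity") contain no "=", so they are skipped.
            key, value = line.split("=", 1)
            headers[key] = value


# Header excerpt from the post above; END IONS added to close the block.
sample = """BEGIN IONS
TITLE=35319
PEPMASS=678.67432 554741.87500
CHARGE=3+
RTINSECONDS=3213
SCANS=15601
END IONS
"""

spectra = list(read_mgf_headers(sample))
first = spectra[0]
print("TITLE (running index in the mgf):", first["TITLE"])
print("SCANS (scan number in the raw file):", first["SCANS"])
```

Note that nothing in these headers names the raw file, which is why a merged file cannot be traced back to its runs.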

lutzfischer · 2019-12-16T12:46:43Z

It should work if you set:

RUN_RE=TITLE=.*
SCAN_RE=TITLE=.*

The scan number will be overwritten by the SCANS entry, but you will lose the information about which run/raw file each spectrum originated from.

RicZen · 2019-12-16T18:20:08Z

Should I change the BasicConfig.conf file then?

Currently the mgf section looks like this:

#########################################
## we need the run name and scan number for a spectrum
## but mgf-files have that info (if at all) in the TITLE line
## and it is not exactly defined how that is stored
## some mgf-files that we have encountered are already recognized for others
## the following to regular expressions can be defined to read out scan number and run
## if both are supplied these will be first tried before the internal automatic will be used
## the scan number and the raw file need to be in the first capturing group
## Example:
## the mgf contains headers like:
## TITLE= Elution from: -1.0 to -1.0 period:  experiment:  cycles:  precIntensity: -1.0 RawFile: myrawfile FinneganScanNumber: 14846
## then the regular expressions should be defined as
## SCAN_RE: .*FinneganScanNumber:\s*([0-9]*)\s*
## RUN_RE: .*RawFile:\s*(.*)FinneganScanNumber:
##
## xiSEARCH comes with a range of know regular mgf-title formats but there are a lot 
## more formats out there. So if you encounter an error that the file is not known try this.
## 
#SCAN_RE:
#RUN_RE:

Should I replace the last two lines with:

#RUN_RE=TITLE=.*
#SCAN_RE=TITLE=.*
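For readers unsure what "first capturing group" means in the config comments quoted above, the following Python sketch applies the two example expressions to the example title. It assumes the expressions are matched against the text after `TITLE=`; xiSEARCH itself is a Java program, and how it applies the expressions internally is not shown in this thread, so treat this only as an illustration of the regular expressions themselves.

```python
import re

# Example title from the config comments (the text after "TITLE=").
title = (" Elution from: -1.0 to -1.0 period:  experiment:  cycles:  "
         "precIntensity: -1.0 RawFile: myrawfile FinneganScanNumber: 14846")

# The two example expressions from the config comments.
SCAN_RE = re.compile(r".*FinneganScanNumber:\s*([0-9]*)\s*")
RUN_RE = re.compile(r".*RawFile:\s*(.*)FinneganScanNumber:")

scan_match = SCAN_RE.match(title)
run_match = RUN_RE.match(title)

# Group 1 is the part inside the first pair of parentheses.
scan = scan_match.group(1)
# The run group also captures the space before "FinneganScanNumber:",
# so it is stripped here for display.
run = run_match.group(1).strip()

print("scan:", scan)
print("run:", run)
```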

lutzfischer · 2019-12-17T08:55:36Z

Almost. Every line in there that starts with a # is a comment and is ignored by xiSEARCH, meaning you need to remove the # before RUN_RE and SCAN_RE.
Otherwise it does not really matter where you put these lines.

If you care about which file an identification comes from, I would not combine the raw files into a single mgf file. Keeping them separate at least lets you reconstruct where the spectra come from based on the name of the mgf file.
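As a small illustration of that last point, the sketch below derives a run label from each mgf file name when every raw file is exported to its own mgf file. The file names and the helper `run_names` are hypothetical and are not part of xiSEARCH.

```python
from pathlib import Path


def run_names(mgf_paths):
    """Map each mgf path to a run label taken from its file name."""
    return {str(p): Path(p).stem for p in mgf_paths}


# Hypothetical per-fraction exports, one mgf file per raw file.
files = ["fractions/fraction_01.mgf", "fractions/fraction_02.mgf"]
runs = run_names(files)

for path, run in runs.items():
    print(path, "->", run)
```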

Rappsilber-Laboratory locked and limited conversation to collaborators Nov 11, 2022

lutzfischer converted this issue into discussion #69 Nov 11, 2022
