
mzIdentML Conformance to MIAPE

This table lists each point in the MIAPE guidelines and states the xPath/CV term available to provide conformance. Do not edit this page directly, because the editor on psidev.info is useless for tables. The source for this page is under svn here. Edit the source HTML and copy/paste it to here. The MIAPE document is available here, and general information about MIAPE is here.
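As a rough illustration of how the xPath column can be used, the sketch below checks whether a handful of the listed elements and attributes are present in a document. The sample fragment, the `CHECKS` list and the `check_conformance` function are hypothetical names made up for this example. The fragment also omits the XML namespace that real mzIdentML files declare, so this is a minimal sketch rather than a validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free fragment that only borrows element and
# attribute names from the xPath column; real files contain far more.
SAMPLE = """
<mzIdentML creationDate="2009-08-18T17:59:55">
  <AnalysisCollection>
    <SpectrumIdentification activityDate="2009-08-18T17:59:55"/>
  </AnalysisCollection>
  <DataCollection>
    <Inputs>
      <SpectraData location="file:///data/run1.mzML"/>
    </Inputs>
  </DataCollection>
</mzIdentML>
"""

# (MIAPE item, element path relative to the root, attribute name or None)
CHECKS = [
    ("Date stamp", ".", "creationDate"),
    ("Spectrum identification date",
     "AnalysisCollection/SpectrumIdentification", "activityDate"),
    ("Location of files", "DataCollection/Inputs/SourceFile", None),
    ("MS data", "DataCollection/Inputs/SpectraData", None),
]


def check_conformance(root):
    """Map each MIAPE item to True if its element (and attribute) is present."""
    report = {}
    for item, path, attribute in CHECKS:
        elements = root.findall(path)
        if attribute is None:
            report[item] = bool(elements)
        else:
            report[item] = any(e.get(attribute) is not None for e in elements)
    return report


report = check_conformance(ET.fromstring(SAMPLE))
for item, present in report.items():
    print(f"{item}: {'present' if present else 'missing'}")
```

In this fragment the `SourceFile` element is deliberately left out, so that item is reported as missing while the other three are reported as present.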
MIAPE Section Item xPath (under mzIdentML) Notes a b c d e f g h i j
1 Date stamp (as YYYY-MM-DD) creationDate (attribute) The creation date of the document itself. xsd:dateTime
AnalysisCollection/SpectrumIdentification/activityDate (attribute) Date spectrum identification performed. xsd:dateTime
AnalysisCollection/ProteinDetection/activityDate (attribute) Date protein inferencing performed. xsd:dateTime na na
Responsible person (or institutional role if   more appropriate); provide name, affiliation and stable contact information Provider/ContactRole An   institutional email address can generally satisfy this requirement.    
Software name, version and   manufacturer AnalysisSoftwareList/AnalysisSoftware/name  
AnalysisSoftwareList/AnalysisSoftware/version    
AnalysisSoftwareList/AnalysisSoftware/ContactRole  
Customisations made to that software AnalysisSoftwareList/AnalysisSoftware/Customizations No customisations in some examples for illustration. In the other cases this is just not applicable (na). na na na na na na
Availability of that software AnalysisSoftwareList/AnalysisSoftware/URI The   references of the vendor or public url if a publicly available version has   been used.    
Location of the files generated; parameter files, spectral data (input/output) DataCollection/Inputs/SourceFile The location of the data generated. If made available in a public repository,
describe the URI (for instance a URL, or the URL of the repository and the information on how to retrieve the data).
If not made available for public access, describe the contact person reference or source and the internal coordinates of the data, e.g. Sequest .out, Mascot .dat.
[Note to MIAPE Authors: This is confusing because of overlap with the next section, so we just consider Inputs/SourceFile here and not the .dta files etc.]
 
2 Input data – Description and type of MS data DataCollection/Inputs/SpectraData Provide   a short description that can refer to the data in the experiment (e.g. LC-MS   run1).
[Refer to mzML source file for information – outside scope of   mzIdentML]
                   
DataCollection/Inputs/SpectraData/fileFormat        
Input data – Availability of MS data (source of   data) DataCollection/Inputs/SpectraData Location   (URI) of input data file  
Input parameters – Databases queried; description and versions (including number of entries searched) DataCollection/Inputs/SearchDatabase/DatabaseName
and/or
DataCollection/Inputs/SearchDatabase/location
DataCollection/Inputs/SearchDatabase/version      
DataCollection/Inputs/SearchDatabase/numDatabaseSequences   na
Input parameters – Taxonomical restrictions applied AnalysisProtocolCollection/SpectrumIdentificationProtocol/DatabaseFilters Specify the … subset of the databank(s) (for instance, “mammals”, an NCBI TaxId,
a list of accession numbers).
na na na na na na na na
DataCollection/AnalysisData/SpectrumIdentificationList/numSequencesSearched Specify   the number of entries searched. na na na na na na na na
Input parameters – Description of tool and   scoring scheme AnalysisProtocolCollection/SpectrumIdentificationProtocol/AdditionalSearchParams/cvParam Descriptor   of the scoring algorithm in the search engine (such as ESI-TRAP in Mascot,   ESI…
[Note to MIAPE authors: These example parameters are a little search-engine specific]
Input parameters – Specified cleavage agent(s) AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes Describe   the cleavage agent as available on the search engine.
If the cleavage agent rules have been defined by the user, describe the cleavage rules.
Input parameters – Allowed number of missed cleavages AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes/Enzyme/missedCleavages Allowed maximum number of cleavage sites missed by the specified agent during the
in-silico cleavage process. For a no-enzyme search, use the “No Enzyme” CV term,
and omit the number of missed cleavages.
Input parameters – Additional parameters related   to cleavage AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes The   Enzymes section is flexible. Example ‘a’ shows a case of a mixed enzyme. na na na na na na na na na
Input parameters – Permissible amino acids   modifications AnalysisProtocolCollection/SpectrumIdentificationProtocol/ModificationParams/SearchModification Using   the PSI-MS names available from Unimod na na na na
Input parameters – Precursor-ion   and fragment ion mass tolerance for tandem MS (when applicable) AnalysisProtocolCollection/SpectrumIdentificationProtocol/FragmentTolerance  
na na
AnalysisProtocolCollection/SpectrumIdentificationProtocol/ParentTolerance
Input parameters – Mass tolerance for PMF (when   applicable) AnalysisProtocolCollection/SpectrumIdentificationProtocol/ParentTolerance   na na na na na na na na na
Input parameters – Thresholds;   minimum scores for peptides, proteins (probabilities, number of hits, other   metrics) AnalysisProtocolCollection/SpectrumIdentificationProtocol/AdditionalSearchParams/cvParam                    
AnalysisProtocolCollection/ProteinDetectionProtocol/AnalysisParams/cvParam   na     na
Input parameters – Any other relevant parameters AnalysisProtocolCollection/SpectrumIdentificationProtocol/AdditionalSearchParams/cvParam  
3 Identified   proteins – Accession code in the queried database SequenceCollection/DBSequence/accession  
Identified proteins – Protein description SequenceCollection/DBSequence/cvParam accession="MS:1001088"   na     na
Identified proteins – Protein scores DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam   na na na
Identified proteins – Validation status DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam accession="MS:1001060"
For all protein hits in the search, specify if accepted without post-processing of search engine/de-novo interpretation (accept raw output of identification software) or if manually accepted as valid or as rejected (false positive).
na       na
Identified proteins – Number of different peptide sequences (without considering modifications) assigned to the protein DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam accession="MS:1001097"
na     na
Identified proteins – Percent peptide coverage of protein DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam accession="MS:1001093"
na     na
Identified proteins – Identity of supporting   peptides DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/PeptideHypothesis   na na
Identified proteins – In the case of PMF, number of matched/unmatched peaks DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam
accession="MS:1001097" name="distinct peptide sequences"
accession="MS:1001362" name="number of unmatched peaks"
na na na na na na na na na
For identified peptides – Sequence (indicate any   deviation from the expected protein cleavage specificity) SequenceCollection/Peptide/peptideSequence  
For identified peptides – Peptide scores DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/cvParam   na
For identified peptides – Chemical modifications (artefactual) and post-translational modifications (naturally occurring);
sequence polymorphisms with experimental evidence (particularly for isobaric modifications)
SequenceCollection/Peptide/Modification   na na na
For identified peptides – Corresponding spectrum   locus DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/start and end      
For identified peptides – Charge   assumed for identification and a measurement of peptide mass error DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/chargeState    
DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/calculatedMassToCharge
- DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/experimentalMassToCharge
   
For identified peptides – Other additional   information, when used for evaluation of confidence DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/cvParam  
Quantitation for selected ions – Quantitation approach (e.g. 4plex-iTRAQ, ICAT, cICAT, COFRADIC) Out of scope Planned for mzQuantML
Quantitation for selected ions – Quantity measurement (e.g. integration of signals, use of signal intensity) Out of scope Planned for mzQuantML
Quantitation for selected ions – Data transformation and normalisation technique (description of method and software) Out of scope Planned for mzQuantML
Quantitation for selected ions – Number of replicates (biological and technical) Out of scope Planned for mzQuantML
Quantitation for selected ions – Acceptance criteria (including measure of errors) Out of scope Planned for mzQuantML
Quantitation for selected ions – Estimates of uncertainty and the methods for the error analysis, including the treatment of relevant systematic error effects and the treatment of random error issues. Results from controls (when described) Out of scope Planned for mzQuantML
4 Assessment and confidence given to the identification and quantitation (description of methods, thresholds, values, etc.) AnalysisProtocolCollection/SpectrumIdentificationProtocol/Threshold For example, MS:1001316, mascot:SigThreshold
and
ProteinDetectionProtocol/Threshold
4 Results   of statistical analysis or determination of false positive rate in case of   large scale experiments                    
4 Inclusion/exclusion of the output of the software is provided (description of what part of the output has been kept, what part has been rejected)
DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/@passThreshold
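
The rows for charge state, peptide mass error and @passThreshold all point at attributes of SpectrumIdentificationItem, so they can be read together. The sketch below is one possible way to do that: it keeps only the items whose passThreshold is true and reports the m/z error as calculatedMassToCharge minus experimentalMassToCharge, as in the table. The sample fragment, its ids and values, and the `accepted_items` function are hypothetical. The fragment leaves out the XML namespace that real mzIdentML files declare.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free fragment; ids and values are made up.
SAMPLE = """
<mzIdentML>
  <DataCollection>
    <AnalysisData>
      <SpectrumIdentificationList id="SIL_1">
        <SpectrumIdentificationResult id="SIR_1">
          <SpectrumIdentificationItem id="SII_1_1" chargeState="2"
              calculatedMassToCharge="500.25"
              experimentalMassToCharge="500.75"
              passThreshold="true"/>
          <SpectrumIdentificationItem id="SII_1_2" chargeState="2"
              calculatedMassToCharge="501.0"
              experimentalMassToCharge="500.75"
              passThreshold="false"/>
        </SpectrumIdentificationResult>
      </SpectrumIdentificationList>
    </AnalysisData>
  </DataCollection>
</mzIdentML>
"""

ITEM_PATH = (
    "DataCollection/AnalysisData/SpectrumIdentificationList/"
    "SpectrumIdentificationResult/SpectrumIdentificationItem"
)


def accepted_items(root):
    """Return (id, charge, m/z error) for items whose passThreshold is true."""
    rows = []
    for item in root.findall(ITEM_PATH):
        # xsd:boolean allows both "true" and "1" as true values.
        if item.get("passThreshold") not in ("true", "1"):
            continue
        calculated = float(item.get("calculatedMassToCharge"))
        experimental = float(item.get("experimentalMassToCharge"))
        rows.append(
            (item.get("id"), int(item.get("chargeState")),
             calculated - experimental)
        )
    return rows


for row in accepted_items(ET.fromstring(SAMPLE)):
    print(row)
```

Only the first item is kept here, because the second has passThreshold set to false. The error is in m/z units; converting it to a mass error or to ppm would need the charge state and is left out of this sketch.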