The Minimum Information About a Proteomics Experiment (MIAPE)

Reporting guidelines for proteomics

The context-sensitive nature of transcriptome, metabolome and proteome data necessitates the capture of a richer set of metadata (data about the data) than is required for basic genetic sequence, where usually knowing the organism of origin will suffice. The use of paper citations as proxies for actual metadata hinders the reassessment of data sets, and obstructs non-standard searching (e.g. by the order in which different liquid chromatography columns were coupled). The requirements of the various journals also differ, so important detail may be lacking in some cases, or presented in an esoteric fashion.

There is then a need for public repositories that contain information from whole proteomics experiments, making explicit both where samples came from and how analyses of them were performed. It is therefore appropriate to attempt to define the minimum set of information about a proteomics experiment that would be required by such a repository.

Up-to-date released documents and working drafts

| Acronym | Topic | Version | Status | Comment |
|---|---|---|---|---|
| MIAPE | MIAPE Principles document | 1.0 | release | |
| MIAPE-MS | Mass Spectrometry | 2.98 | release | replaces v2.24 |
| MIAPE-MSI | Mass Spectrometry Informatics | 1.1 | release | |
| MIAPE-Quant | Mass Spectrometry Quantification | 1.0 | release | |
| MIAPE-GE | Gel Electrophoresis | 1.4 | release | |
| MIAPE-GI | Gel Informatics | 1 | release | |
| MIAPE-CC | Column Chromatography | 1.1 | release | |
| MIAPE-CE | Capillary Electrophoresis | 0.9.3 | release | |
| MIMIx | Molecular Interactions | 1-1-2 | release | |

More details and history of the documents are available from the MIAPE-docs page. 

Published components of MIAPE:

The 'MIAPE Principles' document:
The minimum information about a proteomics experiment (MIAPE), Nature Biotechnology 25, 887-893 (2007) -- DOI -- PubMed.
(Free PDF/HTML download direct from Nature Biotechnology.)

Reporting guidelines for Molecular Interactions experiments:
The minimum information required for reporting a molecular interaction experiment (MIMIx), Nature Biotechnology 25, 894-898 (2007) -- DOI -- PubMed.
(Free PDF/HTML download direct from Nature Biotechnology.)

Reporting guidelines for Mass Spectrometry experiments:
Guidelines for reporting the use of mass spectrometry in proteomics, Nature Biotechnology 26, 860-861 (2008) -- DOI -- PubMed.
(Free PDF/HTML download direct from Nature Biotechnology.)

Reporting guidelines for Mass Spectrometry Informatics experiments:
Guidelines for reporting the use of mass spectrometry informatics in proteomics, Nature Biotechnology 26, 862 (2008) -- DOI -- PubMed.
(Free PDF/HTML download direct from Nature Biotechnology.)

Reporting guidelines for Gel Electrophoresis experiments:
Guidelines for reporting the use of gel electrophoresis in proteomics, Nature Biotechnology 26, 863 (2008) -- DOI -- PubMed.
(Free PDF/HTML download direct from Nature Biotechnology.)

Reporting guidelines for Gel Informatics experiments:
Guidelines for reporting the use of gel image informatics in proteomics, Nature Biotechnology 28, 655-656 (2010) -- DOI -- PubMed.
(Free PDF/HTML download direct from Nature Biotechnology.)

Reporting guidelines for Column Chromatography experiments:
Guidelines for reporting the use of column chromatography in proteomics, Nature Biotechnology 28, 654 (2010) -- DOI -- PubMed.
(Free PDF/HTML download direct from Nature Biotechnology.)

Reporting guidelines for Capillary Electrophoresis experiments:
Guidelines for reporting the use of capillary electrophoresis in proteomics, Nature Biotechnology 28, 654-655 (2010) -- DOI -- PubMed.
(Free PDF/HTML download direct from Nature Biotechnology.)

Guidelines for reporting quantitative mass spectrometry based experiments in proteomics, Journal of Proteomics, 2013 Mar 14 -- DOI -- PubMed.

Up-to-date released documents and working drafts are available from this site.

MIAPE is now registered with the MIBBI Project (Minimum Information for Biological and Biomedical Investigations), which promotes and coordinates the development and management of Minimum Information (MI) specifications from across the biological and biomedical sciences.


ProteoRed: see bioinformatics team page.


mzIdentML Conformance to MIAPE

This table lists each point in the MIAPE guidelines and states the XPath or CV term available in mzIdentML to provide conformance.

The MIAPE document is available here, and general information about MIAPE is here.

| MIAPE Section | Item | xPath (under mzIdentML) | Notes | Examples a-j |
|---|---|---|---|---|
| 1 | Date stamp (as YYYY-MM-DD) | creationDate (attribute) | The creation date of the document itself. xsd:dateTime | |
| | | AnalysisCollection/SpectrumIdentification/activityDate (attribute) | Date spectrum identification performed. xsd:dateTime | |
| | | AnalysisCollection/ProteinDetection/activityDate (attribute) | Date protein inferencing performed. xsd:dateTime | na in some |
| | Responsible person (or institutional role if more appropriate); provide name, affiliation and stable contact information | Provider/ContactRole | An institutional email address can generally satisfy this requirement. | |
| | Software name, version and manufacturer | AnalysisSoftwareList/AnalysisSoftware/name | | |
| | Customisations made to that software | AnalysisSoftwareList/AnalysisSoftware/Customizations | No customisations in some examples for illustration. In the other cases this is just not applicable (na). | |
| | Availability of that software | AnalysisSoftwareList/AnalysisSoftware/URI | The references of the vendor or public URL if a publicly available version has been used. | |
| | Location of the files generated; parameter files, spectral data (input/output) | DataCollection/Inputs/SourceFile | The location of the data generated. If made available in a public repository, describe the URI (for instance a URL, or the URL of the repository and the information on how to retrieve the data). If not made available for public access, describe the contact person reference or source and the internal coordinates of the data, e.g. Sequest .out, Mascot .dat. [Note to MIAPE Authors: This is confusing because of overlap with next section, so we just consider Inputs/SourceFile here and not the .dta files etc.] | |
| 2 | Input data – Description and type of MS data | DataCollection/Inputs/SpectraData | Provide a short description that can refer to the data in the experiment (e.g. LC-MS run1). [Refer to mzML source file for information - outside scope of mzIdentML] | |
| | Input data – Availability of MS data (source of data) | DataCollection/Inputs/SpectraData | Location (URI) of input data file | |
| | Input parameters - Databases queried; description and versions (including number of entries searched) | DataCollection/Inputs/SearchDatabase/DatabaseName | | |
| | | DataCollection/Inputs/SearchDatabase/numDatabaseSequences | | na in some |
| | Input parameters - Taxonomical restrictions applied | AnalysisProtocolCollection/SpectrumIdentificationProtocol/DatabaseFilters | Specify the ... subset of the databank(s) (for instance, “mammals”, a NCBI TaxId, a list of accession numbers). | na in some |
| | | DataCollection/AnalysisData/SpectrumIdentificationList/numSequencesSearched | Specify the number of entries searched. | na in some |
| | Input parameters - Description of tool and scoring scheme | AnalysisProtocolCollection/SpectrumIdentificationProtocol/AdditionalSearchParams/cvParam | Descriptor of the scoring algorithm in the search engine (such as ESI-TRAP in Mascot, ESI... [Note to MIAPE authors: These example parameters are a little search engine specific] | |
| | Input parameters - Specified cleavage agent(s) | AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes | Describe the cleavage agent as available on the search engine. If the cleavage agent rules have been defined by the user, describe the cleavage rules. | |
| | Input parameters - Allowed number of missed cleavages | AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes/Enzyme/missedCleavages | Allowed maximum number of cleavage sites missed by the specified agent during the in-silico cleavage process. For a no-enzyme search, use the "No Enzyme" CV term, and omit the number of missed cleavages. | |
| | Input parameters - Additional parameters related to cleavage | AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes | The Enzymes section is flexible. Example 'a' shows a case of a mixed enzyme. | na in some |
| | Input parameters - Permissible amino acids modifications | AnalysisProtocolCollection/SpectrumIdentificationProtocol/ModificationParams/SearchModification | Using the PSI-MS names available from Unimod | na in some |
| | Input parameters - Precursor-ion and fragment ion mass tolerance for tandem MS (when applicable) | AnalysisProtocolCollection/SpectrumIdentificationProtocol/FragmentTolerance | | |
| | Input parameters - Mass tolerance for PMF (when applicable) | AnalysisProtocolCollection/SpectrumIdentificationProtocol/ParentTolerance | | na in some |
| | Input parameters - Thresholds; minimum scores for peptides, proteins (probabilities, number of hits, other metrics) | AnalysisProtocolCollection/SpectrumIdentificationProtocol/AdditionalSearchParams/cvParam | | |
| | | AnalysisProtocolCollection/ProteinDetectionProtocol/AnalysisParams/cvParam | | na in some |
| | Input parameters - Any other relevant parameters | AnalysisProtocolCollection/SpectrumIdentificationProtocol/AdditionalSearchParams/cvParam | | |
| 3 | Identified proteins - Accession code in the queried database | SequenceCollection/DBSequence/accession | | |
| | Identified proteins - Protein description | SequenceCollection/DBSequence/cvParam accession="MS:1001088" | | na in some |
| | Identified proteins - Protein scores | DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam | | na in some |
| | Identified proteins - Validation status | DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam accession="MS:1001060" | For all protein hits in the search, specify if accepted without post-processing of search engine/de-novo interpretation (accept raw output of identification software) or if manually accepted as valid or as rejected (false positive). | na in some |
| | Identified proteins - Number of different peptide sequences (without considering modifications) assigned to the protein | DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam | | na in some |
| | Identified proteins - Percent peptide coverage of protein | DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam | | na in some |
| | Identified proteins - Identity of supporting peptides | DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/PeptideHypothesis | | na in some |
| | Identified proteins - In the case of PMF, number of matched/unmatched peaks | DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam accession="MS:1001097" name="distinct peptide sequences"; accession="MS:1001362" name="number of unmatched peaks" | | |
| | For identified peptides - Sequence (indicate any deviation from the expected protein cleavage specificity) | SequenceCollection/Peptide/peptideSequence | | |
| | For identified peptides - Peptide scores | DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/cvParam | | na in some |
| | For identified peptides - Chemical modifications (artefactual) and post-translational modifications (naturally occurring); sequence polymorphisms with experimental evidence (particularly for isobaric modifications) | SequenceCollection/Peptide/Modification | | na in some |
| | For identified peptides - Corresponding spectrum locus | DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/start and end | | |
| | For identified peptides - Charge assumed for identification and a measurement of peptide mass error | DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/chargeState | | |
| | | DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/calculatedMassToCharge - DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/experimentalMassToCharge | | |
| | For identified peptides - Other additional information, when used for evaluation of confidence | DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/cvParam | | |
| | Quantitation for selected ions - Quantitation approach (e.g. 4plex-iTRAQ, ICAT, cICAT, COFRADIC) | Out of scope | Planned for mzQuantML | |
| | Quantitation for selected ions - Quantity measurement (e.g. integration of signals, use of signal intensity) | Out of scope | Planned for mzQuantML | |
| | Quantitation for selected ions - Data transformation and normalisation technique (description of method and software) | Out of scope | Planned for mzQuantML | |
| | Quantitation for selected ions - Number of replicates (biological and technical) | Out of scope | Planned for mzQuantML | |
| | Quantitation for selected ions - Acceptance criteria (including measure of errors) | Out of scope | Planned for mzQuantML | |
| | Quantitation for selected ions - Estimates of uncertainty and the methods for the error analysis, including the treatment of relevant systematic error effects and the treatment of random error issues. Results from controls (when described) | Out of scope | Planned for mzQuantML | |
| 4 | Assessment and confidence given to the identification and quantitation (description of methods, thresholds, values, etc.) | AnalysisProtocolCollection/SpectrumIdentificationProtocol/Threshold | For example, MS:1001316, mascot:SigThreshold | |
| 4 | Results of statistical analysis or determination of false positive rate in case of large scale experiments | | | |
| 4 | Inclusion/exclusion of the output of the software are provided (description of what part of the output has been kept, what part has been rejected) | DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/ @passThreshold | | |
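
The mapping in the table can be used as a rough presence check on a file. The following is a minimal sketch in Python, not an official validator: it tests only whether a handful of the listed elements exist, and says nothing about whether their content is adequate. The `MIAPE_PATHS` selection, the function names, the `urn:example:mzid` namespace and the toy `SAMPLE` document are all assumptions made for illustration. A real mzIdentML file uses the schema's own namespace and contains many more elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical subset of the table: MIAPE item -> element path below the root.
# Only element presence is checked; attribute-level rows are not covered here.
MIAPE_PATHS = {
    "Software name": "AnalysisSoftwareList/AnalysisSoftware",
    "Responsible person": "Provider/ContactRole",
    "Input spectra": "DataCollection/Inputs/SpectraData",
    "Cleavage agent": "AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes",
    "Threshold": "AnalysisProtocolCollection/SpectrumIdentificationProtocol/Threshold",
}

# Toy document for illustration only; the namespace is a placeholder and the
# content is far from a complete or schema-valid mzIdentML file.
SAMPLE = """<MzIdentML xmlns="urn:example:mzid" creationDate="2011-03-25T13:16:49">
  <AnalysisSoftwareList>
    <AnalysisSoftware id="sw1" name="ExampleEngine"/>
  </AnalysisSoftwareList>
  <Provider id="p1"><ContactRole contact_ref="c1"/></Provider>
  <DataCollection>
    <Inputs><SpectraData id="sd1" location="file:///run1.mzML"/></Inputs>
  </DataCollection>
</MzIdentML>"""


def strip_namespaces(root):
    """Drop the '{namespace}' prefix from every tag so plain paths can be used."""
    for elem in root.iter():
        if "}" in elem.tag:
            elem.tag = elem.tag.split("}", 1)[1]
    return root


def check_conformance(xml_text):
    """Return {MIAPE item: True/False} for the paths listed in MIAPE_PATHS."""
    root = strip_namespaces(ET.fromstring(xml_text))
    report = {item: root.find(path) is not None
              for item, path in MIAPE_PATHS.items()}
    # The date stamp row maps to the creationDate attribute on the root element.
    report["Date stamp"] = "creationDate" in root.attrib
    return report


report = check_conformance(SAMPLE)
for item, present in report.items():
    print(f"{item}: {'present' if present else 'missing'}")
```

Stripping namespaces keeps the paths identical to those in the table, at the cost of ignoring which schema version the file declares. A stricter check would validate against the mzIdentML schema itself.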

