Formal version 1.0.0 release
Direct Links to current documents:
- mzQuantML 1.0.0 Schema (xsd)
- Additional semantic validation rules for the different techniques are HERE, under Schema_rules_X
- PSI-MS Controlled Vocabulary (OBO File)
- Mapping file is HERE
- Specification Document (Microsoft Word format)
- Twenty minute guide to mzQuantML (introductory tutorial)
- Mapping to MIAPE Quant
- Mapping to MCP guidelines
Find Example Instance Documents HERE
The mzQuantML standard format is intended to store the systematic description of workflows quantifying molecules (principally peptides and proteins) by mass spectrometry. A large number of different software packages are available that produce output in a variety of different formats. It is intended that mzQuantML will provide a common format for the export of quantification results from any software package. The format was originally developed under the name AnalysisXML as a format for several types of computational analyses performed over mass spectra in the proteomics context. Development was later split into two formats: mzIdentML for peptide and protein identification, and mzQuantML (described here), covering quantitative proteomic data derived from MS.
The development of mzQuantML is driven by some general principles, specific use cases and the goal of supporting specific techniques, as listed below. These were discussed and agreed at the development meeting in Tübingen in July 2011.
General principles, the format SHOULD support:
- Journal requirements for the reporting of quantitative proteomic data from mass spectrometry.
- Reporting according to MIAPE-MSI (and the emerging MIAPE-Quant document).
- Submission of quantitative data to public databases.
- Data exchange between software tools, where data are defined as values about features (defined here as regions on MS1 mass spectra that report on a single peptide or small molecule), feature matches across different spectra or within spectra, peptides, proteins and protein groups.
- Import of data into statistical processing tools.
- The ability to reprocess or recreate the analysis workflow using the same parameters, assuming no manual steps have taken place.
Use cases, the format SHOULD capture:
- Final abundance values (relative or absolute) for peptides, proteins and protein groups where protein inference cannot be performed in an unambiguous manner.
- Quantification values about peptide/protein modifications, such as post-translational modifications.
- Abundance values at the level of a single run (called an assay in this context) and of logical groupings of runs (called study variables in this context), for which the user wishes, for example, to report relative values.
- The evidence trail for how final abundance values were calculated, such as the features used for quantifying peptides and proteins.
- Relationships between features either on different regions of the same spectrum or on different spectra that report on the same peptide or small molecule. These are particularly required for relative quantification approaches.
- Details about pre-fractionation sufficient to describe the combination of multiple input data files (e.g. raw files) into a single assay where this has been performed.
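The relationship between assays and study variables described above can be sketched in a few lines of Python. This is only an illustration of the concepts, not the mzQuantML schema or any official API: the class names `Assay` and `StudyVariable` echo the terms used in the list, while the fields, the `mean_abundance` method, the `ratio` helper and the accession `P12345` are hypothetical and chosen for this example. Averaging over assays is likewise just one possible way to summarise a study variable.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Assay:
    """A single run (hypothetical minimal model, not the schema element)."""
    assay_id: str
    # Maps a protein accession to the abundance reported for this run.
    abundances: dict[str, float] = field(default_factory=dict)


@dataclass
class StudyVariable:
    """A logical grouping of assays, e.g. the replicates of one condition."""
    name: str
    assays: list[Assay]

    def mean_abundance(self, accession: str) -> float:
        """Average abundance over the assays in which the protein was quantified."""
        values = [a.abundances[accession] for a in self.assays
                  if accession in a.abundances]
        if not values:
            raise KeyError(f"{accession} not quantified in study variable {self.name}")
        return mean(values)


def ratio(numerator: StudyVariable, denominator: StudyVariable, accession: str) -> float:
    """Relative abundance of one protein between two study variables."""
    return numerator.mean_abundance(accession) / denominator.mean_abundance(accession)


# Example: two replicate runs per condition (made-up values).
control = StudyVariable("control", [Assay("c1", {"P12345": 100.0}),
                                    Assay("c2", {"P12345": 120.0})])
treated = StudyVariable("treated", [Assay("t1", {"P12345": 220.0}),
                                    Assay("t2", {"P12345": 220.0})])
treated_vs_control = ratio(treated, control, "P12345")
```

Keeping per-run values on the assay and deriving group-level values from them mirrors the use case above, in which abundance values are reported both for single runs and for groupings of runs.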
More documentation is available in the mzQuantML Google code project at http://code.google.com/p/mzquantml/.
The format supports the following specific techniques used in proteomics (as shown in the example files):
- MS1 label-free intensity
- MS1 label-based e.g. SILAC and metabolic labelling such as 15N
- MS2 tag-based e.g. iTRAQ / TMT
- MS2 spectral counting
We expect that the format MAY also be able to cover the following techniques adequately, although these have not been tested in great detail at this stage, and we encourage further input from users of these techniques:
- Quantification by selected reaction monitoring (SRM)
- Absolute quantification based on averaging the intensities of features e.g. Waters Hi3 technique
- Small molecule quantification (in metabolomics)
- MS2 intensity-based approaches
- MS2 label-based approaches
The standard was submitted to the PSI document process in August 2011. The specifications were subsequently updated through versions 1-rc2 and 1-rc3, leading to the release of version 1.0.0 in February 2013.
Major changes in versions (also see versioned schema documents on Google Code):
- rc1 to rc2: Introduction of mapping rules/semantic rules for different techniques.
- Schema change log: https://code.google.com/p/mzquantml/source/list?path=/trunk/schema/mzQuantML_1_0_0-rc2.xsd&start=243
- rc2 to rc3: Minor updates in response to reviewer comments from journal review, and fixes for cardinalities, internal references etc.
- Schema change log: https://code.google.com/p/mzquantml/source/list?path=/trunk/schema/mzQuantML_1_0_0-rc3.xsd&start=352
- rc3 to version 1.0.0: No changes except the update to the version number.
The overall resource change log can also be consulted here: https://code.google.com/p/mzquantml/source/list