mzML

History

From 2005 to 2008, two separate XML formats existed for encoding raw spectrometer output: mzData, developed by the PSI, and mzXML, developed at the Seattle Proteome Center at the Institute for Systems Biology (ISB). Having two formats for essentially the same purpose caused confusion and required extra programming effort. The PSI, with full participation by ISB, therefore developed a new format, mzML, that combines the best aspects of both predecessors. mzML is intended to replace the two earlier formats, which are now deprecated, although older software sometimes still uses them.

On 2008-06-01, mzML 1.0.0 was released. In early 2009, several implementation efforts identified a few minor shortcomings in mzML 1.0.0. Since no vendors had yet released software supporting mzML 1.0, the working group decided to release an update in June 2009. It is expected that all software will support mzML 1.1, rather than 1.0, as the long-term stable format. Below is the available documentation for mzML 1.1.0 and related information. Please send feedback to psidev-ms-dev@lists.sourceforge.net.

Status

mzML 1.1.0 was released on 2009-06-01 and has been stable ever since. There were initial plans for a 1.2 release to support ion mobility mass spectrometry (IM-MS) and data-independent acquisition (DIA) MS. However, as of 2022-11-21, it appears that support for IM-MS and DIA can be achieved without a schema change, using only some additional controlled vocabulary terms. Please contact psidev-ms-dev@lists.sourceforge.net for more information.


Proposed mzML best practices for encoding IM-MS and DIA data in mzML

(updated 2022-11-21)

The following proposal for adding IM-MS and DIA support has not yet entered the PSI Document Process. Comments sent to psidev-ms-dev@lists.sourceforge.net or filed as issues in the GitHub repo are welcome. Some parts of the proposal have already been implemented in ProteoWizard.


    mzMLb

    One drawback of mzML is that while it compresses well, compressing greatly reduces random access performance. We propose an alternative based upon HDF5 that embeds an mzML XML document but stores each type of binary data array as separate datasets. HDF5’s chunked compression method makes random access to compressed data orders of magnitude faster than for compressed mzML. See Bhamber et al. 2021 for more details.
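The random-access benefit of chunked compression can be illustrated with a minimal sketch. This is not the mzMLb layout and does not use the HDF5 API; it only shows the general idea with Python's standard library. The chunk size and the function names `compress_chunked` and `read_value` are assumptions made for illustration.

```python
import struct
import zlib

CHUNK_VALUES = 1024  # values per chunk; an arbitrary choice for this illustration


def compress_chunked(values):
    """Compress fixed-size chunks of 64-bit floats independently of each other."""
    chunks = []
    for start in range(0, len(values), CHUNK_VALUES):
        block = values[start:start + CHUNK_VALUES]
        raw = struct.pack("<%dd" % len(block), *block)  # little-endian doubles
        chunks.append(zlib.compress(raw))
    return chunks


def read_value(chunks, index):
    """Fetch one value by decompressing only the chunk that contains it."""
    chunk_no, offset = divmod(index, CHUNK_VALUES)
    raw = zlib.decompress(chunks[chunk_no])
    return struct.unpack_from("<d", raw, offset * 8)[0]


intensities = [float(i) for i in range(10000)]
chunks = compress_chunked(intensities)
value = read_value(chunks, 7777)  # touches one chunk, not the whole array
```

If the same array were compressed as a single stream, reading one value would require decompressing everything that precedes it, which is the cost the paragraph above attributes to compressed mzML.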

    References

    • Bhamber, R. S., Jankevics, A., Deutsch, E. W., Jones, A. R., & Dowsey, A. W. (2021). mzMLb: A Future-Proof Raw Mass Spectrometry Data Format Based on Standards-Compliant mzML and Optimized for Speed and Storage Requirements. Journal of Proteome Research, 20(1), 172–183. https://doi.org/10.1021/acs.jproteome.0c00192

    mzML Release Schedule

    (updated 2022-11-21)

        • 2008-06-01 mzML 1.0.0 released

        • 2009-06-01 mzML 1.1.0 released

        • 2010-06-01 mzML index wrapper schema updated to 1.1.1

        • 2022-11 Minor updates to CV still occur, but no new schema changes are planned at this time

        • 2023-03-17 mzML schema adjusted to 1.1.1 to allow the schema to tolerate common identifiers better


      mzML 1.1.0 Finished Specification

      (updated 2024-02-09)

      The information and documents in this subsection are related to mzML 1.1.0, revised after going through the PSI document process on May 19, 2009. Everyone is encouraged to implement mzML 1.1.0. It is hoped that mzML 1.1.0 will remain stable for a long time.

      NOTE: On 2010-06-01, the mzML index schema was updated from 1.1.0 to 1.1.1. There was no functional change, but rather the addition of an enumeration constraint to an attribute to prevent creative, unintended values. This could cause some files that previously validated to no longer validate. However, any such files should never have successfully validated in the first place.

      XML schema definition files:

      – mzML1.1.1.xsd (main schema)

      – mzML1.1.3_idx.xsd (separate and optional index)

      – Latest mapping file, which defines where certain controlled vocabulary terms may be used in a document.

      – Latest version of the controlled vocabulary (CV) in OBO 1.2 format.  (OBO-Edit)

      – Latest version of the controlled vocabulary (CV) in OWL format.

      Documentation files:

      – Full Specification Document: mzML1.1.0_specificationDocument.doc

      – HTML schema documentation for mzML 1.1.0

      – HTML schema documentation for mzML 1.1.0 index wrapper schema

      Validation of mzML files

      Although at one time there were on-line mzML validators, these have fallen into disrepair and are no longer functional.

      You can download and run a local validator.

      – The OpenMS validator can be installed locally by downloading and installing OpenMS.

      – The Java-based validator can be downloaded from GitHub
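Before running a full validator, a quick well-formedness check can catch truncated or mislabeled files. The sketch below uses only Python's standard library and checks that the document parses and that its root element is `mzML` or `indexedmzML` in the mzML namespace. It does not validate against the XSD or the CV mapping file, so it does not replace the validators listed above. The function name `quick_check` is hypothetical.

```python
import xml.etree.ElementTree as ET

MZML_NS = "http://psi.hupo.org/ms/mzml"  # XML namespace of mzML 1.1 documents


def quick_check(xml_text):
    """Return (ok, message) after a well-formedness and root-element check."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return False, "not well-formed XML: %s" % exc
    expected = {"{%s}mzML" % MZML_NS, "{%s}indexedmzML" % MZML_NS}
    if root.tag not in expected:
        return False, "unexpected root element: %s" % root.tag
    return True, "root element is %s" % root.tag.split("}")[1]


# A deliberately empty document, used here only to exercise the check.
sample = '<mzML xmlns="http://psi.hupo.org/ms/mzml" version="1.1.0"></mzML>'
ok, message = quick_check(sample)
```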

      Sample instance documents for all relevant formats:

      All documents are meant to contain equivalent information in the various formats.

      – tiny1.mzML1.1.0.mzML
      – tiny1.mzData1.05.xml

      – tiny1.mzXML2.0.mzXML
      – tiny1.mzXML3.0.mzXML

      Sample files generated by the ProteoWizard:

      – small.RAW (a small Thermo RAW file with LTQ-FT data)

      – small.pwiz.1.1.mzML (converted from small.RAW by msconvert)

      – small_miape.pwiz.1.1.mzML (converted by msconvert, with example MIAPE fields added programmatically)

      – small_zlib.pwiz.1.1.mzML (converted by msconvert, with zlib compression and 32-bit precision)
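The "zlib compression and 32-bit precision" options in the last sample file refer to how mzML stores numeric arrays such as m/z and intensity values: the numbers are packed as little-endian floats, optionally compressed with zlib, and then base64-encoded as text. The following is a minimal sketch of that round trip using only Python's standard library. The function names are hypothetical, and in a real file the precision and compression are declared by CV terms on each binary data array rather than passed as arguments.

```python
import base64
import struct
import zlib


def encode_binary_array(values, precision_bits=32, compress=True):
    """Pack floats little-endian, optionally zlib-compress, then base64-encode."""
    code = "f" if precision_bits == 32 else "d"
    raw = struct.pack("<%d%s" % (len(values), code), *values)
    if compress:
        raw = zlib.compress(raw)
    return base64.b64encode(raw).decode("ascii")


def decode_binary_array(text, precision_bits=32, compressed=True):
    """Reverse the steps: base64-decode, optionally decompress, unpack floats."""
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)
    item_size = 4 if precision_bits == 32 else 8
    code = "f" if precision_bits == 32 else "d"
    count = len(raw) // item_size
    return list(struct.unpack("<%d%s" % (count, code), raw))


# These values are exactly representable in 32-bit floats, so they round-trip.
mz_values = [100.0, 200.5, 300.25]
encoded = encode_binary_array(mz_values)
decoded = decode_binary_array(encoded)
```

Choosing 32-bit precision halves the uncompressed size relative to 64-bit, at the cost of rounding values that are not exactly representable in single precision.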

      Other sample files:

      – PDA example file (created by Steffen Neumann)

      – Sample files generated by the Proteios Software Environment

      Other relevant websites:

      – HUPO-PSI GitHub mzML

      – General PSI guidelines for creating controlled vocabularies

      Current and future support for mzML:

      (updated 2013-02-19)

      Product | Source | Contact | Support comments
      ProteoWizard | USC | Parag Mallick | Full mzML support today
      TPP | ISB | Eric Deutsch | Full mzML support today (including embedded X!Tandem)
      Insilicos Viewer | Insilicos | Erik Nilsson | Full mzML support today
      X!Tandem | GPM | Ron Beavis | Full mzML support today
      Myrimatch | Vanderbilt | Matt Chambers | Full mzML support today
      InSilicoSpectro | SIB | Alex Masselot | Full mzML support today
      Proteios SE | Univ Lund | Fredrik Levander | Full mzML support today
      NCBI C++ toolkit | NCBI | Douglas Slotta | Available in next release
      OpenMS/TOPP | Univ Tübingen | Marc Sturm | Full mzML support today
      Phenyx | GeneBio | Pierre-Alain Binz | Full mzML support today
      Mascot | Matrix Science | David Creasy | Full mzML support today
      Mascot Distiller | Matrix Science | David Creasy | Full mzML support today
      jmzML | Ghent/ EMBL-EBI | Lennart Martens | Full mzML support today
      Conversion tool in Proteomics Toolbox | Thermo Scientific | Jim Shofstahl | Beta testing
      ReAdW (.RAW converter) | ISB | Eric Deutsch | Replaced by ProteoWizard msconvert
      mzWiff (.wiff converter) | ISB | Eric Deutsch | Replaced by ProteoWizard msconvert
      MassWolf (.raw/ converter) | ISB | Eric Deutsch | Replaced by ProteoWizard msconvert
      Trapper (Agilent data converter) | ISB | Eric Deutsch | Replaced by ProteoWizard msconvert
      mzML_Exporter | ABI | Sean Seymour | Beta testing
      CompassXport | Bruker | ? | ?
      PEAKS | Bioinformatics Solutions Inc | Kevin Zhang | Beta testing
      PRIDE database | EMBL-EBI | Juan A. Vizcaino | Ongoing
      PRIDE Inspector | EMBL-EBI | Juan A. Vizcaino | Full mzML support
      MIAPE MS Extractor | ProteoRed | Salvador Martinez-Bartolome | Full mzML support
      mzR | Bioconductor | Bernd Fischer, Steffen Neumann, Laurent Gatto | Full mzML support
      pymzML | Univ Münster | Christian Fufezan | Full mzML support
      Crux | University of Washington | W. Noble | Full mzML support


      Released mzML 1.0.0 Specification

      (updated 2009-02-10)

      The information and documents below relate to mzML 1.0.0, which is now obsolete. Do not use it.

      XML schema definition files (.xsd):

      – mzML1.0.0.xsd (main schema)

      – mzML1.0.0_idx.xsd (separate and optional index)

      Documentation files:

      – Full Specification Document: mzML1.0.0_specificationDocument.doc

      – HTML schema documentation for mzML 1.0.0

      – HTML schema documentation for mzML 1.0.0 index wrapper schema

      – ASMS June 2008 Poster (3MB PDF)
