PSI-MS: Mass Spectrometry Standards Working Group
The PSI-MS working group defines community data formats and controlled vocabulary terms that facilitate data exchange and archiving in the field of proteomics mass spectrometry.
Current projects are:
- The mzML format, which merges the mzData format (see below) and another similar format mzXML. mzML 1.1.0 was released on June 1, 2009 and has been stable since then. Everyone is encouraged to implement mzML 1.1.0 in their software. See the mzML information page for the full specification, other documentation and examples.
- The TraML format has been developed as a standardized format for the exchange and transmission of transition lists for selected reaction monitoring (SRM) experiments. This specification has been accepted through the PSI document process and is complete. Please email the list firstname.lastname@example.org with your questions, comments, and suggestions. See the TraML information page for the full specification, other documentation and examples.
Past achievements are:
- The mzData standard, which captures mass spectrometry output data. mzData's aim is to unite the large number of current formats (pkl, dta, mgf, ...) into a single format. mzData has been released and is stable at version 1.05. It is now deprecated in favor of mzML.
Note that the mzIdentML (formerly AnalysisXML) format now comes under the aegis of the Proteomics Informatics Working Group.
Group Structure (2009)
|Chair||Eric Deutsch, Institute for Systems Biology|
Pierre-Alain Binz, Centre Universitaire Hospitalier Vaudois CHUV
Henry Lam, Hong Kong University of Science and Technology
Andrew Dowsey, University of Bristol
|MIAPE||Pierre-Alain Binz, Centre Universitaire Hospitalier Vaudois CHUV|
Gerhard Mayer, Ruhr-University Bochum
Other Working Group Members:
Steffen Neumann, IPB Halle
Florian Reisinger, European Bioinformatics Institute
David Creasy, Matrix Science Ltd.
Wilfred Tang, Applied Biosystems
Pete Souda, University of California, Los Angeles
Angel Pizarro, University of Pennsylvania
Sean Seymour, Applied Biosystems
As of 2006 there existed two separate XML formats for encoding raw spectrometer output: mzData, developed by the PSI, and mzXML, developed at the Seattle Proteome Center at the Institute for Systems Biology. It was recognized that the existence of two separate formats for essentially the same purpose generated confusion and extra programming effort. Therefore the PSI, with full participation by ISB, developed a new format intended to replace the previous two by merging the best ideas from each. This new format is called mzML. See the information page for mzML, which includes current documentation, example files and other related materials. We encourage everyone to implement mzML in their software and workflows and to cease using mzXML and mzData as soon as possible.
Controlled Vocabulary development: The PSI-MS CV
The PSI-MS Controlled Vocabulary is developed jointly with the PSI-Proteomics Informatics group. It consists of a large collection of structured terms covering the description and use of mass spectrometry instrumentation as well as protein identification and quantitation software. The terms come from multiple sources: the vocabulary and definitions in chapter 12 of the IUPAC nomenclature book, instrument and software vendors and developers, and other user-submitted terms. Although its structure and use are linked to mzML, mzIdentML and mzQuantML, it is dynamically maintained in OBO format.
The latest version of the PSI-MS CV is available here
More information on PSI CVs can be found here
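Because the CV is maintained in OBO format, software can read it as plain text made of `[Term]` stanzas with `tag: value` lines. The following is a minimal sketch of such a reader, not an official PSI tool. The function name `parse_obo_terms` is hypothetical, and the sample terms use a made-up `EX:` prefix rather than real PSI-MS entries. It ignores OBO features such as escaped characters and trailing modifiers, so a dedicated OBO library is preferable for real use.

```python
def parse_obo_terms(text):
    """Return {term id: {tag: [values]}} for every [Term] stanza in OBO text."""
    terms = {}
    current = None  # tag dict of the stanza being read, or None outside [Term]
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            continue
        if current is None or not line or line.startswith("!"):
            continue  # header lines, other stanza types, blanks, comments
        tag, sep, value = line.partition(":")
        if not sep:
            continue
        # Simplification: treat " ! " as the start of a trailing comment.
        value = value.split(" ! ")[0].strip()
        tag = tag.strip()
        current.setdefault(tag, []).append(value)
        if tag == "id":
            terms[value] = current
    return terms


# Made-up sample for illustration only (not real PSI-MS CV content).
SAMPLE = """
format-version: 1.2

[Term]
id: EX:0000001
name: example instrument attribute

[Term]
id: EX:0000002
name: example resolution
is_a: EX:0000001 ! example instrument attribute

[Typedef]
id: part_of
name: part_of
"""

if __name__ == "__main__":
    for term_id, tags in parse_obo_terms(SAMPLE).items():
        print(term_id, tags["name"][0], tags.get("is_a", []))
```

Each tag maps to a list because OBO allows tags such as `is_a` to appear more than once in a stanza.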
The mzData standard
Status: version 1.05. Deprecated.
mzData in a nutshell
mzData is a data format for capturing peak list information. Its aim is to unite the large number of current formats (pkl, dta, mgf, ...) into one: mzData. mzData is NOT a substitute for the raw file formats of the instrument vendors. Some vendors, if not all, will provide software that converts their raw files to mzData. A number of programs can already use mzData. To limit the file size of mzData, m/z and intensity information is stored as base64-encoded binary data.
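The base64 storage of m/z and intensity values can be sketched with the Python standard library. This is only an illustration of the general technique (pack floating-point numbers into bytes, then base64-encode them); it does not read or write mzData files. The function names and the `precision` and `endian` parameters are assumptions made for this example, so check the mzData schema for how precision and byte order are actually declared.

```python
import base64
import struct


def _struct_format(count, precision, endian):
    """Build a struct format string such as '<3f' (assumed conventions)."""
    order = "<" if endian == "little" else ">"
    code = "f" if precision == 32 else "d"  # 32-bit float or 64-bit double
    return f"{order}{count}{code}"


def encode_values(values, precision=32, endian="little"):
    """Pack numbers into binary floats and return them as base64 text."""
    raw = struct.pack(_struct_format(len(values), precision, endian), *values)
    return base64.b64encode(raw).decode("ascii")


def decode_values(text, precision=32, endian="little"):
    """Reverse encode_values: base64 text back to a list of floats."""
    raw = base64.b64decode(text)
    count = len(raw) // (precision // 8)  # bytes per value: 4 or 8
    return list(struct.unpack(_struct_format(count, precision, endian), raw))


if __name__ == "__main__":
    mz = [100.0, 200.5, 300.25]  # hypothetical m/z values
    encoded = encode_values(mz)
    print(encoded)
    print(decode_values(encoded))
```

The reader must use the same precision and byte order as the writer, which is why a format of this kind has to record both alongside the encoded data.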
Technical description and other resources
Questions? Consult the PSI-MS email discussion list
Moderated email discussion list
Last edited by: Pierre-Alain Binz 2014-04-09