Proteomics Standards Initiative
Molecular Interaction XML Format Documentation
Version 2.5
Released 2005, Last maintenance update to version 2.5.4
Version 3.0
Available for use now, estimated formal release – Spring 2016
Table of Contents
- Introduction
- Purpose of the PSI-MI XML format
- Purpose of this document
- Directory structure
- Release schedule
- Changes from PSI-MI 1.0 to 2.5
- Maintenance releases
- Detailed Documentation
- Use of external controlled vocabularies
- List of planned features
- How to comment
- Available data
- Tools
- Data submission
- Further information and relevant links
Introduction
The Proteomics Standards Initiative (PSI) aims to define community standards for data representation in proteomics to facilitate data comparison, exchange and verification. For detailed information on all PSI activities, please see PSI Home Page.
The PSI-MI interchange format and accompanying controlled vocabularies were originally designed by a consortium of molecular interaction data providers from both academia and industry, including BIND, DIP, IntAct, MINT, MIPS, GlaxoSmithKline, CellZome, Hybrigenics, Universities of Bielefeld, Bordeaux, Cambridge, and others. They are maintained and kept fit for purpose by the Molecular Interaction workgroup of the HUPO PSI. Please contact us at psi-mi@ebi.ac.uk if you wish to become involved or have any questions.
Purpose of the PSI MI XML format
The PSI MI format is a data exchange format for molecular interactions. It is not a proposed database structure.
Purpose of this document
The purpose of this document is to describe the general structure of the PSI MI XML specification in a more user-friendly manner than the specification does itself. For the detailed and most up-to-date description of PSI-MI XML2.5 please see the molecular interaction data exchange format, level 2.5. For the publication describing the development and use of this format, please look here. For documentation of the previous level 1.0 please see Version 1.0 Documentation. For level 3.0 please go to Version 3.0 documentation. This documentation will also provide additional information, e.g. sample data.
The XML schema is located at https://rawgit.com/MICommunity/psidev/master/psi/mi/rel25/doc/MIF254.html
Directory structure
This document is in the root directory of the PSI-MI XML2.5 release. The subdirectories are:
- doc/ : Auto-generated documentation of the PSI-MI XML schema
- src/ : Source code for schema and related software
- data/ : Controlled vocabularies
- tools/ : Data management tools
Release schedule
- Level 3.0 will be published in 2016.
- Level 2.5 was released 5 December 2005. It is the format most commonly supported by PSI-compliant databases and tools. It will continue to be supported by the MI workgroup after the publication of PSI-MI XML3.0.
- Level 1.0 support was discontinued in 2007.
Changes from PSI-MI XML1.0 to 2.5
Changes in the PSI-MI XML format and controlled vocabularies from version 1.0 to 2.5 are documented in this page.
PSI-MI XML2.5 Maintenance releases
- 2.5.4:
  Minor change to header.
- 2.5.3:
  Minor updates as a result of the PSI spring meeting in San Francisco, April 2006:
  - Updated entrySet@minorVersion to 3.
  - bioSourceType@taxId now mandatory. This was inadvertently made non-mandatory with 2.5.2.
  - featureType@id now mandatory. This was inadvertently made non-mandatory with 2.5.2.
  - Optional attribute parameter/uncertainty added.
  - Added participant/parameterList and participant/attributeList to allow more complex modelling of participants.
  - Deleted XML constraints on entry level. They were not working due to syntax errors, and few XML validators can check them. This validation level will be performed by the PSI XML validator in the future.
- 2.5.2:
  There were some inconsistencies in the naming of complex types in 2.5.1. These have been fixed. This has no impact on the XML data files. The only impact for users is that working with code generators becomes easier. Concrete changes:
  - Renamed complex type interactorType to interactorElementType.
  - Renamed complex type interactionType to interactionElementType.
  - Renamed complex type featureType to featureElementType.
  - featureType@id has been moved into the complex type featureElementType and typed as xs:int.
  - bioSourceType@ncbiTaxId has been moved into the complex type bioSourceType and typed as xs:int.
  - Updated entrySet@minorVersion to 2.
- 2.5.1:
  At the PSI meeting in Geneva, September 2005, it was discussed that participant/experimentalFormList/experimentalForm should allow a position to be assigned, e.g. to describe an N-terminal protein modification in an experiment. It was decided to implement this using the existing featureType. The controlled vocabulary was updated accordingly, but the change in the XML schema was not implemented. This required the maintenance release 2.5.1, with the following changes:
  - Added entrySet@minorVersion, fixed to 1.
  - Deleted participant/experimentalFormList/experimentalForm.
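Because each maintenance release changes entrySet@minorVersion, a reader of a data file can use that attribute to tell which 2.5.x release the file was written against. The following is a minimal sketch using only the Python standard library. The attribute minorVersion is taken from the release notes above; the attributes level and version, the namespace net:sf:psidev:mi, and the sample string are assumptions made for illustration and should be checked against the schema.

```python
import xml.etree.ElementTree as ET

# Hand-written sample root element. The namespace and the "level" and
# "version" attributes are assumptions, not taken from this document.
SAMPLE_ENTRY_SET = (
    '<entrySet xmlns="net:sf:psidev:mi" level="2" version="5" minorVersion="4"/>'
)


def _to_int(value):
    """Convert an attribute string to int, keeping None for a missing attribute."""
    return int(value) if value is not None else None


def schema_version(xml_text):
    """Return (level, version, minorVersion) read from the entrySet root element."""
    root = ET.fromstring(xml_text)
    # ElementTree reports namespaced tags as "{namespace}name"; keep the name only.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "entrySet":
        raise ValueError("root element is %r, expected entrySet" % local_name)
    # minorVersion was introduced with 2.5.1, so older files may not carry it.
    return (
        _to_int(root.get("level")),
        _to_int(root.get("version")),
        _to_int(root.get("minorVersion")),
    )


if __name__ == "__main__":
    print(schema_version(SAMPLE_ENTRY_SET))
```

A missing minorVersion is reported as None rather than guessed, so the caller can decide how to treat files that predate 2.5.1.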
Detailed Documentation
See https://github.com/MICommunity/psimi/blob/wiki/PsimiXMLSpecifications.md
Use of external controlled vocabularies
Where possible, external controlled vocabularies are referenced from PSI MI. External controlled vocabularies are used in two forms:
- Open controlled vocabularies: We think that no existing controlled vocabulary provides all necessary terms for the given attribute in the PSI MI format. In this case, it is up to the data provider to choose a controlled vocabulary, or to provide a free text string if no appropriate controlled vocabulary exists.
- Closed controlled vocabularies: We think that there is a controlled vocabulary which appropriately covers all necessary terms for the given attribute. In this case, only terms from the defined vocabulary should be used.
Data
The closed controlled vocabularies referenced by PSI MI are listed in the table below. All vocabularies are contained in a single file in OBO flat file format: psi-mi25.obo. They can be browsed at the EBI OLS (Ontology Lookup Service). The correctness of references to external controlled vocabularies is currently not enforced by the PSI MI schema. It is the responsibility of the data provider to ensure that only terms that exist in an up-to-date data source are referenced.
PSI MI XML schema elements and OBO major terms
PSI MI XML level 2.5 data element | term name | PSI-MI identifier |
experimentType/participantIdentificationMethod | participant identification method | MI:0002 |
experimentType/interactionDetectionMethod | interaction detection method | MI:0001 |
interactionElementType/interactionType | interaction type | MI:0190 |
interactionElementType/participantList/participant/biologicalRole | biological role (Example: enzyme) | MI:0500 |
interactionElementType/participantList/participant/experimentalPreparationList/experimentalPreparation | experimental preparation | MI:0346 |
interactionElementType/experimentalRoleList/experimentalRole | experimental role (Example: bait) | MI:0495 |
featureType/featureDetectionMethod | feature detection method | MI:0003 |
featureType/featureType | feature type | MI:0116 |
featureType/featureRangeList/featureRange/baseLocationType/startStatus/ and ../endStatus/ | feature range status | MI:0333 |
interactorType/interactorType | interactor type | MI:0313 |
xrefType/*/dbAc | database citation | MI:0444 |
xrefType/*/refTypeAc | refType | MI:0353 |
namesType/alias/typeAc | alias type | MI:0300 |
attributeListType/attribute/nameAc | attribute name | MI:0590 |
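Since the schema does not enforce references to the controlled vocabularies, a data provider may want to check that a term used in a given element actually sits below the root term listed for that element in the table above. The sketch below illustrates one way to do this by following is_a links in OBO stanzas. The two root identifiers come from the table; the sample stanzas are hand-written for illustration (not copied from psi-mi25.obo), the identifiers shown for bait and enzyme are assumptions to verify against the file, and the function names are hypothetical.

```python
# Root terms for two elements, taken from the table above (a subset only).
ROOT_FOR_ELEMENT = {
    "experimentalRole": "MI:0495",
    "biologicalRole": "MI:0500",
}

# Hand-written sample in OBO flat-file style; NOT copied from psi-mi25.obo.
# The child term identifiers are assumptions for illustration.
SAMPLE_OBO = """\
[Term]
id: MI:0495
name: experimental role

[Term]
id: MI:0496
name: bait
is_a: MI:0495 ! experimental role

[Term]
id: MI:0500
name: biological role

[Term]
id: MI:0501
name: enzyme
is_a: MI:0500 ! biological role
"""


def parse_parents(obo_text):
    """Return {term id: set of direct is_a parent ids} for every [Term] stanza."""
    parents = {}
    current = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest here.
            in_term = line == "[Term]"
            current = None
        elif in_term and line.startswith("id:"):
            current = line[len("id:"):].strip()
            parents.setdefault(current, set())
        elif in_term and current and line.startswith("is_a:"):
            # Drop the trailing "! name" comment and keep the identifier.
            parents[current].add(line[len("is_a:"):].split("!")[0].strip())
    return parents


def reference_is_valid(element, term_id, parents):
    """True if term_id exists and is the element's root term or a descendant of it."""
    if term_id not in parents:
        return False
    root_id = ROOT_FOR_ELEMENT[element]
    seen, stack = set(), [term_id]
    while stack:
        node = stack.pop()
        if node == root_id:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return False


if __name__ == "__main__":
    parents = parse_parents(SAMPLE_OBO)
    print(reference_is_valid("experimentalRole", "MI:0496", parents))
    print(reference_is_valid("experimentalRole", "MI:0501", parents))
```

The check treats an identifier that is absent from the parsed file as invalid, which matches the requirement that only existing terms be referenced.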
Obsolete terms
The OBO format has a special class “obsolete”, to which all obsolete PSI MI terms are assigned.
Mapping from OBO to MIF25 format
We recommend the following mapping from the file psi-mi25.obo to PSI MI 2.5 XML files:
OBO format element | PSI MI 2.5 XML file element |
id | cvType/xref/primaryRef/id |
name | cvType/names/fullName |
exact_synonym | cvType/names/shortLabel |
synonym | cvType/names/alias |
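The mapping in the table above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function name is hypothetical, the input is assumed to be an OBO term already parsed into a dictionary keyed by the OBO tags from the table, the sample values are made up for illustration, and other attributes that the schema may require on primaryRef (for example the database name) are left out.

```python
import xml.etree.ElementTree as ET

# Illustrative term, keyed by the OBO tags used in the mapping table.
# The values are sample data for this sketch, not an excerpt of psi-mi25.obo.
SAMPLE_TERM = {
    "id": "MI:0018",
    "name": "two hybrid",
    "exact_synonym": "2 hybrid",
    "synonym": ["Y2H"],
}


def obo_term_to_cv_element(tag, term):
    """Build an XML element of type cvType named `tag` from a parsed OBO term."""
    cv = ET.Element(tag)

    names = ET.SubElement(cv, "names")
    # exact_synonym -> cvType/names/shortLabel
    if term.get("exact_synonym"):
        ET.SubElement(names, "shortLabel").text = term["exact_synonym"]
    # name -> cvType/names/fullName
    ET.SubElement(names, "fullName").text = term["name"]
    # synonym -> cvType/names/alias (one alias element per synonym)
    for synonym in term.get("synonym", []):
        ET.SubElement(names, "alias").text = synonym

    # id -> cvType/xref/primaryRef/id
    xref = ET.SubElement(cv, "xref")
    ET.SubElement(xref, "primaryRef", id=term["id"])
    return cv


if __name__ == "__main__":
    element = obo_term_to_cv_element("interactionDetectionMethod", SAMPLE_TERM)
    print(ET.tostring(element, encoding="unicode"))
```

The element name is passed in by the caller because cvType is a complex type shared by several elements, such as interactionDetectionMethod or interactionType.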
Feedback
Because we are following a leveled approach, we are interested in knowing what the community wishes to be included in the next level. If you have a use case not covered by the current schema or controlled vocabulary terms, please contact us at psi-mi@ebi.ac.uk.
Tools
- PSI-MI XML 2.5 Java Parser : to read and write interaction data from and to file.
- PSI-MITAB 2.5 Java Parser : to read and write interaction data in tab-delimited format.
- XMLMakerFlattener : to convert PSI MI XML format into tab-delimited ASCII format (flat files) and vice versa.
- The PSI MI controlled vocabularies can be browsed at the EBI OLS (Ontology Lookup Service)
- PSI XML Validator: Semantic validator for PSI MI files. It validates correct use of PSI MI ontologies in a data file, plus additional semantic consistency rules. The current beta version can be downloaded from https://sourceforge.net/projects/psidev/files/Schema%20Validator/
- MIF25_compact.xsl : conversion from the expanded to the compact form of the PSI 2.5 format.
- MIF25_expand.xsl : conversion from the compact to the expanded form.
- MIF25_view.xsl : conversion from xml into “draft” html.
Data submission
We strongly support and encourage data deposition supporting published journal articles in public databases of the IMEx consortium. Please see the IMEx deposition page for contact details and deposition options.
Sandra Orchard, orchard@ebi.ac.uk, 11-DEC-2015