ProForma v2 in DocProc [handling editor: Sylvie Ricard-Blum]
ProForma (Proteoform and Peptidoform Notation)
Protein and peptide sequences are usually represented as strings of amino acids written in the well-known one-letter code endorsed by the IUPAC. However, there is still no clear consensus on how to represent ‘proteoforms’ and ‘peptidoforms’, meaning all possible variations of a protein/peptide sequence, including protein modifications, both artefactual and post-translational (PTMs). There are multiple ways of encoding mass modifications, and extended discussion has taken place to reach a consensus. The community therefore requires a standard notation for proteoforms and peptidoforms that can be embedded in the many relevant PSI (and potentially other) file formats.
The PSI has developed a format called PEFF (PSI Extended FASTA Format, http://www.psidev.info/peff) that can be used to represent proteoforms. Additionally, the Consortium for Top Down Proteomics (CTDP) developed a notation format called ProForma (https://topdownproteomics.github.io/ProteoformNomenclatureStandard/), aiming to represent proteoforms.
This format specification, ProForma v2, represents the consensus reached by both groups. It aims to standardise the representation of proteoforms/peptidoforms in support of the main proteomics approaches, both bottom-up (focused on peptides/peptidoforms) and top-down (focused on proteins/proteoforms).
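To make the idea of a string notation concrete, the sketch below splits a simple peptidoform string into residues and their attached modification tags. It assumes only the most basic pattern, in which a modification is written in square brackets directly after the residue it modifies (for example `EM[Oxidation]EVEES[Phospho]PEK`). The function names `split_simple_proforma` and `bare_sequence` are hypothetical and are not part of any PSI library. Terminal modifications, ranges, ambiguity groups and the other features of the full specification are deliberately not handled.

```python
import re

# Assumption: one uppercase residue letter followed by zero or more
# bracketed modification tags, e.g. "M[Oxidation]" or "S[+79.9663]".
_RESIDUE = re.compile(r"([A-Z])((?:\[[^\[\]]*\])*)")
_TAG = re.compile(r"\[([^\[\]]*)\]")


def split_simple_proforma(text):
    """Return a list of (residue, [modification, ...]) pairs.

    Illustrative helper covering only a small subset of the notation;
    anything else raises ValueError instead of being silently misread.
    """
    pairs = []
    pos = 0
    while pos < len(text):
        match = _RESIDUE.match(text, pos)
        if match is None:
            raise ValueError(
                f"unsupported syntax at position {pos}: {text[pos:]!r}"
            )
        pairs.append((match.group(1), _TAG.findall(match.group(2))))
        pos = match.end()
    return pairs


def bare_sequence(text):
    """Return the unmodified one-letter amino acid sequence."""
    return "".join(residue for residue, _ in split_simple_proforma(text))


if __name__ == "__main__":
    for residue, mods in split_simple_proforma("EM[Oxidation]EVEES[Phospho]PEK"):
        print(residue, mods)
    print(bare_sequence("EM[+15.9949]EVEES[+79.9663]PEK"))
```

Because the helper rejects anything outside this subset, it is suitable only as an illustration; real tools should follow the full specification document.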
The specification document, ProForma (Proteoform and Peptidoform Notation), version 2.0, draft 12, was submitted to the PSI document process on October 24th, 2020. After passing a 30-day review by the PSI steering group with minor changes, a revised version of the proposal (draft 13) was submitted in December 2020 and went through a 60-day public comment and external review phase. Following reviewer comments, a revised version of the proposal (draft 14) was submitted on November 29th, 2021, and sent to reviewers for approval. Minor changes were made following additional comments from one reviewer, and the status of the revised document (draft 15) was changed from Draft to Final on February 3rd, 2022.
Status (updated 2022-03-13): Final (February 3rd, 2022)
- The final version of the ProForma 2.0 specification document is available here
- The specification document and associated files are available in the GitHub repository, at https://github.com/HUPO-PSI/ProForma/tree/master/SpecDocument
Other available documents
- Preprint: "Proteomics Standards Initiative’s ProForma 2.0: Unifying the Encoding of Proteoforms and Peptidoforms", arXiv:2109.11352 [q-bio.BM]. https://doi.org/10.48550/arXiv.2109.11352
- Journal article: "Proteomics Standards Initiative’s ProForma 2.0: Unifying the Encoding of Proteoforms and Peptidoforms", published in the Journal of Proteome Research. Publication Date: March 15, 2022. https://doi.org/10.1021/acs.jproteome.1c00771. PMID: 35290070.
Sylvie RICARD-BLUM - PSI Editor