The PSI Extended FASTA Format (PEFF) is a ratified PSI standard that provides a unified format for protein and nucleotide sequence databases used by sequence search engines and associated tools (spectral library search tools, sequence alignment software, data repositories, etc.). The format enables consistent extraction, display, and processing of information such as the sequence database entry identifier, description, and taxonomy across software platforms. It also allows the representation of structural annotations such as post-translational modifications, mutations, and other processing events. A PEFF file is a plain text file that extends the individual sequence entries of the FASTA format and adds a header of metadata describing the database(s) from which the sequences were obtained (e.g., name and version). Sequence database providers are encouraged to generate this format as part of their release policy or to provide appropriate converters that can be incorporated into processing tools.
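To make the layout described above concrete, the following is a minimal Python sketch of reading a PEFF-style file: a `#`-prefixed metadata header followed by FASTA-like entries whose description lines carry `\Key=Value` annotations. The sample database, its accessions, and the `parse_peff` function name are hypothetical and chosen only for illustration. The sketch assumes a single database block in the header and is not a substitute for the validators listed under Available Materials.

```python
import re

# Hypothetical two-entry PEFF text; names and values are illustrative only.
SAMPLE = """\
# PEFF 1.0
# //
# DbName=ExampleDB
# Prefix=ex
# DbVersion=2019-01
# SequenceType=AA
# //
>ex:P00001 \\PName=Example protein one \\NcbiTaxId=9606 \\Length=10 \\VariantSimple=(3|A)
MKTAYIAKQR
>ex:P00002 \\PName=Example protein two \\NcbiTaxId=9606 \\Length=8
MSDNELQK
"""

# A key starts with a backslash; its value runs until the next "\Key=" or end of line.
KEY_VALUE = re.compile(r"\\(\w+)=(.*?)(?=\s+\\\w+=|$)")


def parse_peff(text):
    """Return (header, entries) from PEFF-style text.

    Assumption: one database block in the header, so later keys would
    overwrite earlier ones if several blocks were present.
    """
    header = {}
    entries = []
    for line in text.splitlines():
        if line.startswith("#"):
            body = line[1:].strip()
            if "=" in body:  # skips the version line and the "//" separators
                key, value = body.split("=", 1)
                header[key] = value
        elif line.startswith(">"):
            identifier, _, rest = line[1:].partition(" ")
            annotations = {k: v.strip() for k, v in KEY_VALUE.findall(rest)}
            entries.append(
                {"id": identifier, "annotations": annotations, "sequence": ""}
            )
        elif line.strip() and entries:
            # Sequence lines may wrap, so append to the current entry.
            entries[-1]["sequence"] += line.strip()
    return header, entries


header, entries = parse_peff(SAMPLE)
print(header["DbName"], len(entries))
```

Annotation values are kept as raw strings here; interpreting structured values such as the `(position|residue)` items of `\VariantSimple` is left to the caller.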
Status
The specification has completed its journey through the PSI document process and has been ratified and released.
Available Materials
- Journal article describing PEFF: in PubMed, JPR, PubMedCentral, bioRxiv, preprint. Please cite PMID:31081335 if you use PEFF.
- Main specification document
- Main GitHub page with most relevant materials: https://github.com/HUPO-PSI/PEFF
- Current and earlier specification documents
- Online PEFF Validator – Upload a prospective PEFF file and see validation status
- Downloadable Perl PEFF Validator – Validate PEFF files locally with this Perl library
Current Implementations
Info Date | Product | Detail Link | Comment |
---|---|---|---|
2019-01-10 | neXtProt | Download | Exports all curated PTMs and nsSNPs into PEFF compliant with PEFF1.0_DRAFT28 |
2019-01-11 | UniProt | Proteins API variation services | Exports nsSNPs for requested UniProtKB entries |
2019-03-25 | Pyteomics 4.0 | peff class | . |
2019-01-10 | Proteomics::PEFF | Example Tutorial | Converts FASTA file into PEFF, or alters existing PEFF files, compliant with PEFF1.0_DRAFT31 |
2019-01-10 | Proteoformer | | Determines RIBO-seq derived proteoforms and can write its output in PEFF, compliant with PEFF1.0_DRAFT28 |
2022-11-24 | PrecisionProDB | | Python package for proteogenomics that can generate a customized protein database for peptide search in mass spectrometry |
Info Date | Product | Detail Link | Comment |
---|---|---|---|
2019-01-10 | Comet | PEFF Parameters | Searches an MS run using a PEFF file as a reference, can search for known nsSNPs and PTMs, compliant with PEFF1.0_DRAFT28 |
2019-06-19 | Protein Prospector | PEFF usage poster | Searching PEFF databases supported from v5.18.0 (9/2016). PEFF1.0_DRAFT28 supported from v5.24.0 (6/2019). Variable modifications can be restricted to sites specified by \ModRes* |
2019-03-25 | Pyteomics 4.0 | peff class | . |
2019-01-10 | ProteinPilot | | Searching PEFF databases supported in ProteinPilot V5.0 (released 2014) onward |
2019-01-10 | Online Validator | Upload and Validate | Accepts PEFF upload and validates that PEFF is compliant with PEFF1.0_DRAFT31 |
2019-01-10 | Proteomics::PEFF | Example Tutorial | Validates, reads, and writes PEFF, compliant with PEFF1.0_DRAFT31 |
2019-01-10 | phpMs | | Supports the use and viewing of PEFF files, compliant with PEFF1.0_DRAFT.. |
2019-01-10 | ProteoMapper | On-line version | Supports searching variation in PEFF files given a list of input peptides, compliant with PEFF1.0_DRAFT31 |
2019-01-10 | TPP | | Full support for viewing of PEFF reference proteome is planned for second quarter 2019. |
In the above tables, the Info Date column gives the date on which the information in that row was judged to be up to date (or was actively updated). The Product column gives the name of the resource supporting PEFF, with a hyperlink to it. The Detail Link column provides one or more hyperlinks to details of PEFF support in the product, if available.
TO DO Items:
- Promote additional implementations
- Assess what happens, or should happen, with duplicate keys (e.g., two \VariantSimple keys in the same record). Is that a validation error, or should the values simply be concatenated?
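The two candidate behaviours for duplicate keys can be compared with a short Python sketch. It does not reflect a decision by the specification; the function names and the example description line are hypothetical, and the key/value pattern is a simplification of the description-line syntax.

```python
import re
from collections import defaultdict

# A key starts with a backslash; its value runs until the next "\Key=" or end of line.
KEY_VALUE = re.compile(r"\\(\w+)=(.*?)(?=\s+\\\w+=|$)")


def collect_keys(description_line):
    """Map each key to the list of values it was given, in order of appearance."""
    values = defaultdict(list)
    for key, value in KEY_VALUE.findall(description_line):
        values[key].append(value.strip())
    return dict(values)


def duplicate_keys(description_line):
    """Option 1: report keys that occur more than once (treat as a validation error)."""
    collected = collect_keys(description_line)
    return sorted(k for k, v in collected.items() if len(v) > 1)


def concatenate(description_line):
    """Option 2: merge repeated keys by joining their values in order."""
    return {k: "".join(v) for k, v in collect_keys(description_line).items()}


# Hypothetical description line with \VariantSimple given twice.
line = r">ex:P00001 \PName=Example \VariantSimple=(3|A) \VariantSimple=(7|K)"

print(duplicate_keys(line))
print(concatenate(line)["VariantSimple"])
```

Under the first option a validator would flag `VariantSimple` for this line; under the second, a reader would see the single merged value `(3|A)(7|K)`.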