About
The SDRF-Proteomics format is a tab-delimited format that describes the sample characteristics and the relationships between samples and data files included in a dataset. The information in SDRF files is organized to follow the natural flow of a proteomics experiment.

SDRF-Proteomics structure.
The main requirements to be fulfilled for the SDRF-Proteomics format are:
- The SDRF file is a tab-delimited format where each row corresponds to a relationship between a Sample and a Data file.
- Each column MUST correspond to an attribute/property of the Sample or the Data file.
- Each value in each cell MUST be the property for a given Sample or Data file.
- The SDRF file MUST start with columns describing the properties of the sample (e.g. organism, disease, phenotype, etc.), followed by the properties of the data files that were generated from the analysis of those samples (e.g. label, fraction identifier, data file, etc.).
- The format MUST support unknown values/characteristics.
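The requirements above can be illustrated with a minimal Python sketch. It is not the official validator: the two-row table is invented for illustration, and the column names (`source name`, `characteristics[...]`, `comment[...]`) and the `not available` placeholder are assumptions about the naming convention rather than something stated on this page. The helper functions `read_sdrf`, `sample_columns_first` and `rows_are_rectangular` are hypothetical names.

```python
import csv
import io

# Hypothetical SDRF fragment: one sample analysed as two fractions.
# Column names and the "not available" placeholder are assumptions.
SDRF_TEXT = (
    "source name\tcharacteristics[organism]\tcharacteristics[disease]\t"
    "assay name\tcomment[label]\tcomment[fraction identifier]\tcomment[data file]\n"
    "sample 1\thomo sapiens\tnot available\trun 1\tlabel free sample\t1\tsample1_f1.raw\n"
    "sample 1\thomo sapiens\tnot available\trun 2\tlabel free sample\t2\tsample1_f2.raw\n"
)


def read_sdrf(text):
    """Parse tab-delimited text into a header list and a list of rows."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    rows = list(reader)
    return header, rows


def sample_columns_first(header):
    """True if no sample column appears after the first data file column."""
    seen_data_column = False
    for name in header:
        lowered = name.strip().lower()
        if lowered.startswith("comment["):
            seen_data_column = True
        elif lowered.startswith("characteristics[") and seen_data_column:
            return False
    return True


def rows_are_rectangular(header, rows):
    """True if every row has exactly one cell per column."""
    return all(len(row) == len(header) for row in rows)


header, rows = read_sdrf(SDRF_TEXT)

# Each row links one sample to one data file.
sample_idx = header.index("source name")
file_idx = header.index("comment[data file]")
pairs = [(row[sample_idx], row[file_idx]) for row in rows]
```

Here `pairs` holds one (sample, data file) tuple per row, which mirrors the rule that each row is a relationship between a Sample and a Data file. The ordering check only looks at column name prefixes, so it is a rough approximation of the "sample properties first" requirement.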
SDRF-Proteomics aims to capture the sample metadata and its relationship with the data files (e.g., raw files from mass spectrometers). It does not aim to capture the downstream analysis, including details of which samples were compared with which other samples, how samples are combined into study variables, or parameters for the downstream analysis such as FDR or p-value thresholds.
Status
(updated 2024-02-08)
SDRF-Proteomics was released as an official PSI specification on 24 May 2023. The specification can be found in the bigbio/proteomics-sample-metadata GitHub repository.
How to cite
Tools using SDRF
– SDRF official validator: GitHub (Python)
– SDRF Java validator: GitHub (Java)
– lesSDRF: Web annotator of SDRF files (App, GitHub)
– quantms: Proteomics workflow using SDRF as input (GitHub)
Related Publications
Perez-Riverol Y; European Bioinformatics Community for Mass Spectrometry. Toward a Sample Metadata Standard in Public Proteomics Repositories. J Proteome Res. 2020 Oct 2;19(10):3906-3909. doi: 10.1021/acs.jproteome.0c00376. Epub 2020 Sep 22. PMID: 32786688; PMCID: PMC7116434.
Claeys T, Van Den Bossche T, Perez-Riverol Y, Gevaert K, Vizcaíno JA, Martens L. lesSDRF is more: maximizing the value of proteomics data through streamlined metadata annotation. Nat Commun. 2023 Oct 24;14(1):6743. doi: 10.1038/s41467-023-42543-5. PMID: 37875519; PMCID: PMC10598006.