AI-readiness

PSI-AI: Advancing AI-readiness of public proteomics data

Mission statement
The PSI AI-readiness working group aims to bridge the gap between artificial intelligence and public proteomics data. We focus on two complementary goals. The first is to make public proteomics data suitable for AI applications. The second is to support and standardize the use of AI in workflows that generate new data. We work toward these goals through community standards and reproducible, accessible pipelines. Our aim is to integrate with, not duplicate, existing infrastructure and community efforts.

Consult the working group charter here

Goals

1. Improve Metadata Accessibility: We support the active and retroactive annotation of public proteomics datasets to enhance their discoverability and reusability in AI contexts. This includes efforts to enrich metadata using community pipelines, align annotations with existing standards, and provide a centralized and indexable location for these annotations.

2. Harmonize Raw Data Formats: We strive to convert diverse vendor-specific raw formats into open, standardized formats that support long-term usability and computational workflows. We focus on creating curated resources in mzML and mzPeak formats, anchored in existing repositories, and on making them available to the community from a central location for easy reuse.

3. Enable Low-Barrier Reprocessing: We promote the development and dissemination of lightweight, community-friendly workflows that support the transformation of raw proteomics data into machine learning-ready formats. 

4. Establish Benchmark Resources: We are building curated benchmark corpora that include diverse proteomics datasets processed in standardized ways. These resources are designed to support method development, comparison, and evaluation in AI applications.

5. Support Good Practices in Reuse: We aim to document and share best practices for the ethical and effective reuse of public datasets in AI workflows. This includes fostering discussions around reprocessing standards and promoting transparency and reproducibility.

6. Contribute to AI Standardization in Proteomics: We collaborate on the development of frameworks for benchmarking AI models and on the integration of existing PSI standards into AI pipelines. Community engagement through surveys, training events, and collaborative initiatives is a key priority.


Group structure (2025)

Role        Name
Chair       Tine Claeys, CompOmics, VIB Center for Medical Biotechnology
Co-chairs   Samuel Wein, University of Tübingen – OpenMS inc
            Ralf Gabriels, CompOmics, VIB Center for Medical Biotechnology
Secretary   (currently vacant)

Other active working group members

Getting involved

Fill in the following Google form for the kick-off meeting