The 25.4 Data Release includes data from the following studies:
Human data
A Simulated Data Set of Extreme Longevity
This study features a synthetic dataset of 19,859 participants, generated by Syntegra Inc. in 2022 using a transformer-based language model. The simulation was based on a real dataset of 4,203 individuals from the New England Centenarian Study (NECS) and includes demographic, clinical, physical, genetic, proteomic, and metabolomic information.
- This release provides synthetic genomic variant data, metabolomic data, proteomic data, and clinical data for 13,901 participants. This dataset represents 70% of the total synthetic dataset generated by Syntegra. Individuals with the most extreme ages (greater than 109 years) are also excluded from this dataset.
Human Serum Determinants of Aging
This study investigates human longevity using metabolomics and proteomics data collected through a case-cohort design. Longevity cases were defined as individuals who lived longer than 98% of their peers, based on U.S. life expectancy data. Participants came from four long-term studies of older adults: MrOS, SOF, Health ABC, and CHS. A random sample of participants was selected for comparison, including some who also reached exceptional ages. Data from the CHS study are hosted on dbGaP. Data from the MrOS, SOF, and Health ABC cohorts are hosted on the ELITE Portal.
- This release provides metabolomics data for the MrOS and SOF cohorts, generated from stored blood samples on three platforms (primary metabolism, complex lipids, and biogenic amines) by the UC Davis lab led by Dr. Oliver Fiehn.
- Health ABC cohort data and proteomics data will be available in future data releases.
Non-human data
This study indexes data from the PRIDE data repository, drawing on an investigation into how aging affects molecular regulation in mice. It employs a systems-level approach to examine changes across multiple lifespan-extending treatments.
- This release provides high-resolution proteomics data from mouse liver samples, generated using liquid chromatography and Orbitrap mass spectrometry after treatment with various longevity-promoting drugs.
This study indexes data from the PRIDE data repository, featuring a comparison of proteomics technologies to improve data quality and reliability in protein analysis.
- This release presents data demonstrating the enhanced performance of the new VIP-HESI ion source on the Bruker timsTOF mass spectrometer, showcasing greater sensitivity and reproducibility compared to traditional nanospray methods across various sample types.