Background:
Researchers’ ability to access diverse datasets and perform robust, reproducible analyses is currently stifled by the siloed nature of existing informatics infrastructure. As part of the National Cancer Institute (NCI) Cancer Research Data Commons (CRDC), the Proteomic Data Commons (PDC) aims to overcome this challenge by making it possible for any researcher to ask new and fundamental questions about cancer, and by providing much-needed tools to accelerate research and the development of personalized treatments for individual patients. Cancer researchers can now easily access multi-omics data (proteomic, genomic, imaging, etc.) from many sources across the CRDC’s virtual, expandable infrastructure, which lowers the barrier to entry for anyone who wants to get involved in integrative research.
Our Engagement:
- Provide a secure, accessible, cloud-hosted environment
- Scale to accommodate large volumes of data
- Supply compute power for data analysis
- Interoperate with other NCI analytical platforms
- Build an extensible data model and harmonization standards
Outcome:
The PDC is a next-generation proteomic data repository that aims to make biomedical datasets accessible and connected at an unprecedented scale, facilitating creative new ways to combine and analyze data and to ask questions that drive precision medicine. Hosted on the Amazon Web Services (AWS) cloud platform, the PDC provides access to highly curated and standardized biospecimen, clinical, and proteomic data through an intuitive interface for filtering, querying, searching, visualizing, and downloading the data and metadata. In addition, robust APIs let bioinformaticians access the data programmatically for analysis with cloud-native and cloud-agnostic applications alike. As a result, PDC users anywhere in the world can obtain the data they need quickly, easily, and securely.
Other Projects: