Case Studies for Overcoming Challenges in Using Big Data in Cancer
- PMID: 36625851
- PMCID: PMC10102839
- DOI: 10.1158/0008-5472.CAN-22-1277
Abstract
The analysis of big healthcare data has enormous potential as a tool for advancing oncology drug development and patient treatment, particularly in the context of precision medicine. However, there are challenges in organizing, sharing, integrating, and making these data readily accessible to the research community. This review presents five case studies illustrating various successful approaches to addressing such challenges. These efforts are CancerLinQ, the American Association for Cancer Research Project GENIE, Project Data Sphere, the National Cancer Institute Genomic Data Commons, and the Veterans Health Administration Clinical Data Initiative. Critical factors in the development of these systems include attention to the use of robust pipelines for data aggregation, common data models, data deidentification to enable multiple uses, integration of data collection into physician workflows, terminology standardization and attention to interoperability, extensive quality assurance and quality control activity, incorporation of multiple data types, and understanding how data resources can be best applied. By describing some of the emerging resources, we hope to inspire consideration of the secondary use of such data at the earliest possible step to ensure the proper sharing of data in order to generate insights that advance the understanding and the treatment of cancer.
©2023 The Authors; Published by the American Association for Cancer Research.