Global Health: AI Systems for Hyper-personalized Preventative Healthcare
By Angelina Yieu, GRC 2024 Global Essay Competition Top 10
Chronic diseases, which are shaped by genetic, physiological, environmental, and behavioral factors, cause 41 million deaths annually, or 74% of all deaths worldwide. They include cardiovascular diseases (CVDs), cancers, respiratory diseases, and diabetes. Nearly 80% of these are avoidable, yet 63% of deaths result from unhealthy lifestyle behaviors. In the US alone, $55 billion is lost every year to a lack of prevention efforts, as shown by the fact that fewer than 8% of Americans participate in checkups (Cuffari, 2024).
Given the variety of body types, it’s vital for individuals to understand their state of health and unique biological makeup. For example, medication effectiveness can vary with genomic variants: over 60 identified gene variants affect 190+ drugs (FDA, 2024). A study that sequenced the genes of 300 deceased participants in a precision medicine project found that 93% had at least one of 17 identified gene variants and that 80% had been prescribed medications influenced by these variants, which highlights the importance of personalized care (National Center for Advancing Translational Sciences, 2024).
Personalized care has the power to significantly enhance individuals’ quality of life, while increasing accessibility to affordable, higher-quality healthcare. However, achieving this requires complementary solutions. By combining artificial intelligence with a new, preventative care-specialized workforce, we can discover life-saving connections between various datasets.
Solution #1: Machine Learning Algorithms To Improve Preventative Practices
The solution begins with training algorithms on a large dataset of anonymized medical profiles to identify niche patterns among lifestyles, habits, environmental exposure, medical history, and biological composition. The trained models generate predictions and simulations of health outcomes, which specialists can use to develop risk assessments and hyper-personalized prevention plans. The AI then updates these plans as real-time data arrives, refining its recommendations when it detects high-risk habits that could interact with genetic factors.
Solution #2: Data Collection Using Epigenetic Data and Wearables for a Credit System
By using epigenetic data, we can analyze how age, environment, and behavior chemically alter gene function through mechanisms such as DNA methylation. Unlike static genetic data, epigenetic changes can turn genes “on” or “off” based on external lifestyle factors (Sonrai Analytics, 2024). Wearables like smartwatches can collect lifestyle data, including sleep patterns, physical activity, heart rate, and air quality. Daily questionnaires can capture stress, nutrition, and niche habits for clustering. Users could earn points for meeting personalized health milestones, like step goals (e.g., 50,000 steps weekly) or diet adherence, and exchange them for rewards such as insurance discounts and free lifestyle perks (e.g., movie tickets), creating a motivator system for healthy behaviors. Devices can additionally send reminders for checkups and warn users about risks tied to specific habits, for example, "Continuing this habit for one week could increase your risk of [outcome] by [percentage] and impact [health/lifestyle aspect]." Digital medical records will be available to users, giving them a detailed understanding of, and control over, their records.
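The credit system and the warning message described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Milestone` class, the function names, and the point values are hypothetical and do not come from any existing system, and the percentage in a warning would have to be supplied by the predictive model rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass
class Milestone:
    name: str      # key used in the weekly progress report
    target: float  # personalized weekly goal (e.g., 50,000 steps)
    points: int    # hypothetical points awarded when the goal is met


def weekly_points(progress: dict, milestones: list) -> int:
    """Sum the points of every milestone whose weekly target was reached."""
    return sum(m.points for m in milestones if progress.get(m.name, 0) >= m.target)


def risk_warning(outcome: str, percentage: float, aspect: str) -> str:
    """Fill in the warning template; all three values come from the caller."""
    return (
        "Continuing this habit for one week could increase your risk of "
        f"{outcome} by {percentage:.0f}% and impact {aspect}."
    )


# Illustrative point values only; a real scheme would be set by the program.
milestones = [Milestone("steps", 50_000, 100), Milestone("diet_days", 5, 50)]
print(weekly_points({"steps": 52_300, "diet_days": 4}, milestones))  # 100
```

Keeping the milestone targets as data rather than code is what would let each user's goals be personalized without changing the scoring logic.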
Solution #3: Increase the Healthcare Workforce, Lower Labor Costs from Hours, and Promote Better Work Balance Through Shortened, Specialized Medical Education
To optimize the system’s potential, universities can introduce a Preventative Healthcare and Genetics medical program: a 5-year bachelor's degree covering risk prediction, diagnosis, and personalized health plans through genomic, disease, and public health analysis. Practical training includes case study analysis to support and regulate AI predictions, with additional units on behavioral psychology, data science, and biotechnology. Graduates could then either enter real-world training in developing prevention plans, with 1–2 years of shadowing, or pursue a master’s degree for advanced roles. Currently, becoming a preventative medicine physician requires approximately 9 years of education and 2–3 years of residency, totaling 12–14 years and over $400,000 in costs (Career Basecamp, 2024). This exacerbates the healthcare labor shortage, especially in preventative healthcare.
To address the projected deficit of 10 million healthcare workers by 2030 and the skyrocketing labor costs of recruitment, shortened education encourages more people to enter the field and supports a better work balance. Specialists' stronger focus on prevention, diagnosis, and check-up regulation supports qualified doctors by reducing the time they spend on administrative and chronic diagnostic workloads, allowing them to focus more on quality treatment. Higher studies in preventative care could also raise its importance among citizens and governments, potentially leading to subsidies and more funding for research. Specialization could also reduce diagnostic errors, which result in 800,000 deaths or cases of permanent disability in the US annually due to omissions or lack of patient history (McPhillips, 2023).
Genomes & Processing Datasets
The initial data collection for foundational, specialist-supported risk diagnoses should include genomic data (DNA and RNA sequencing) and epigenetic data, both obtained through saliva or blood. DNA should be examined for genetic variants, gene mutations, inherited conditions, and disease risk factors (e.g., the BRCA1 gene for breast cancer). Epigenetic data can be analyzed using high-throughput methylation profiling technologies like chips and sequencing tools (Henriques et al., 2020). Reassessing the data every 1–2 years helps confirm that current prevention strategies remain effective and refines new ones.
When processing datasets, non-random missing values should be flagged with binary indicators so that the missingness patterns, which can be essential for correlation discovery, are preserved (Jakobsen et al., 2017). A k-means clustering algorithm can then group data points by similarity before further processing through XGBoost, a gradient tree boosting model that supports classification, regression, and ranking tasks. XGBoost builds trees sequentially, training each new tree on the residual errors of the trees before it, and reports which genes or features matter most for prediction. This gives clinicians fast, interpretable results, making the approach well suited to analyzing large medical datasets and generating potential outcomes for individuals based on various lifestyle pathways (Henriques et al., 2020).
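The first two preprocessing steps can be illustrated with a short, dependency-free Python sketch. It assumes missing values arrive as `None`, uses a plain Lloyd's-algorithm k-means seeded with the first `k` rows (a simplification, not a recommended initialization), and runs on made-up, unitless numbers. The function names are hypothetical, and the sketch stops before the XGBoost step.

```python
import math


def add_missing_indicators(rows):
    """Fill each missing value (None) with 0.0 and append one 0/1 flag per column."""
    encoded = []
    for row in rows:
        values = [0.0 if v is None else float(v) for v in row]
        flags = [1.0 if v is None else 0.0 for v in row]
        encoded.append(values + flags)
    return encoded


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic rows: the second feature is missing only for the high-valued group,
# so the missingness itself carries information.
rows = [[1.0, 1.0], [9.0, None], [1.2, 0.8], [8.8, None], [0.9, 1.1]]
encoded = add_missing_indicators(rows)
labels, _ = kmeans(encoded, k=2)
print(labels)  # [0, 1, 0, 1, 0]
```

The resulting cluster label could be appended to each row as an extra feature before training a gradient-boosted model on the combined data.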
Figure 1
Sample Case: Hypertension
Hypertension, known as a “silent killer,” is one of many asymptomatic CVD conditions that go unnoticed. It affects 1 billion people globally and is a leading cause of mortality. By analyzing epigenetic data, including DNA methylation changes, along with BMI, age, family history, and environmental factors, hyper-personalized preventative plans (e.g., dietary plans) can be developed.
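To show what "training on residual errors" means for a case like this, the sketch below implements least-squares gradient boosting with one-split decision stumps from scratch. It is a teaching toy, not XGBoost itself, and every name in it is hypothetical. The six rows (a BMI-like number and a family-history flag) are invented solely to exercise the code and carry no clinical meaning.

```python
def fit_stump(X, residuals):
    """Find the single feature/threshold split that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]  # (feature index, threshold, left value, right value)


def boost(X, y, rounds=20, learning_rate=0.5):
    """Start from the mean label, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        preds = [
            p + learning_rate * (lm if row[j] <= t else rm) for row, p in zip(X, preds)
        ]
    return base, learning_rate, stumps


def predict(model, row):
    """Risk score for one row: base value plus every stump's scaled correction."""
    base, lr, stumps = model
    return base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)


def split_counts(model, n_features):
    """Crude feature importance: how often each feature was chosen for a split."""
    counts = [0] * n_features
    for j, _, _, _ in model[2]:
        counts[j] += 1
    return counts


# Invented toy rows: [BMI-like value, family-history flag]; label 1 = condition present.
X = [[22, 0], [24, 0], [31, 1], [29, 1], [35, 0], [27, 1]]
y = [0, 0, 1, 1, 1, 0]
model = boost(X, y)
print(round(predict(model, [30, 1]), 3), split_counts(model, 2))
```

The split counts are a very rough stand-in for the feature-importance scores a library like XGBoost reports, and the output is an uncalibrated score rather than a probability.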
Implementation Strategy:
The AI predictive system should be streamlined first, supported by genome testing and consultation sessions delivered in collaboration with hospitals or clinics in medium- to high-income regions with high chronic disease rates. Community outreach, such as school talks and point-of-care services, should also be used to promote awareness of preventative care and its benefits. Then, as the newly trained preventative specialists enter the field through internships or jobs, the program can expand globally to medium- and low-income countries, offering affordable sessions and preventative plans through mobile hubs and telemedicine. By improving the sustainability and accessibility of preventative care, this new system could smooth hospital workflow, reduce healthcare costs tied to working hours, alleviate workforce shortages, enhance quality of life, and improve health outcomes for millions.
Bibliography
World Health Organization (WHO). 2023. “Noncommunicable Diseases.” World Health Organisation. September 16, 2023. https://www.who.int/news-room/fact-sheets/detail/noncommunicable-diseases.
“Study Suggests Most People May Have Gene Variants That Impact Drugs’ Effectiveness | National Center for Advancing Translational Sciences.” Nih.gov, 2024, ncats.nih.gov/news-events/news/gene-variants-that-impact-drugs-effectiveness.
“How to Become a Preventive Medicine Physician?” Career Basecamp, 2024, www.career-basecamp.com/careers/preventive-medicine-physician/education.
Torkington, Simon. “Here Are 3 Ways the World Can Improve Healthcare for All.” World Economic Forum, 16 Aug. 2024, www.weforum.org/stories/2024/08/3-ways-the-world-can-improve-healthcare-for-all/.
McPhillips, Deidre. “Diagnostic Errors Linked to Nearly 800,000 Deaths or Cases of Permanent Disability in US Each Year, Study Estimates.” CNN, 19 July 2023, www.cnn.com/2023/07/19/health/diagnosis-error-study/index.html.
Conlon, Eoghan. “What Is XGBoost? - and How Can It Be Used in Precision Medicine?” Sonrai Analytics, 12 Feb. 2024, sonraianalytics.com/what-is-xgboost/.
Zhao, Huanhuan, et al. “Predicting the Risk of Hypertension Based on Several Easy-To-Collect Risk Factors: A Machine Learning Method.” Frontiers in Public Health, vol. 9, 24 Sept. 2021, https://doi.org/10.3389/fpubh.2021.619429.
Cuffari, Benedette. “Proactive Health: The Shift towards Preventative Healthcare.” News-Medical, July 31, 2024. https://www.news-medical.net/health/Proactive-Health-The-Shift-Towards-Preventative-Healthcare.aspx.
Deloitte. “Addressing Cost Affordability.” 2024 Global Healthcare Outlook, 2024. https://www2.deloitte.com/content/dam/Deloitte/global/Documents/addressing-cost-affordability.pdf.
Henriques, João, Filipe Caldeira, Tiago Cruz, and Paulo Simões. 2020. "Combining K-Means and XGBoost Models for Anomaly Detection Using Log Datasets" Electronics 9, no. 7: 1164. https://doi.org/10.3390/electronics9071164
Loftus, Tyler J., Benjamin Shickel, Jeremy A. Balch, Patrick J. Tighe, Kenneth L. Abbott, Brian Fazzone, Erik M. Anderson, et al. “Phenotype Clustering in Health Care: A Narrative Review for Clinicians.” Frontiers in Artificial Intelligence, vol. 5, 12 Aug. 2022, https://www.frontiersin.org/journals/artificialintelligence/articles/10.3389/frai.2022.842306/full.
CDC. “Epigenetics, Health, and Disease.” Genomics and Your Health, 2 Dec. 2024, www.cdc.gov/genomics-and-health/epigenetics/index.html.
American Hospital Association. “The Financial Stability of America’s Hospitals and Health Systems Is at Risk as the Costs of Caring Continue to Rise.” Www.aha.org, American Hospital Association, May 2024, www.aha.org/costsofcaring.