Unsupervised Machine Learning Helps Discover Patterns of Racial Disparities in Breast Cancer Patients


African American (AA) women tend to experience far worse breast cancer outcomes than White (W) women. While the overall incidence of breast cancer is lower in AA women, the mortality rate is significantly higher. This disparity is believed to be due to a combination of genetic and epigenetic differences in the tumor and its microenvironment that contribute to more aggressive disease phenotypes, as well as socioeconomic factors, which together yield poorer prognoses for AA women. Beyond the differences in mortality rates due to biologic factors, there are also differences in treatment experiences and in adverse events (AEs) associated with breast cancer therapies that contribute to decreased survival. Previous studies of adverse events are limited: they suffer from small sample sizes and a lack of systematic approaches, and as a result have failed to detect significant differences. Additionally, the impact of comorbidities, prior therapies, and other patient-specific variables on race-based differences in treatment-related adverse events has not been adequately studied.

To this end, a new study published in the journal Biomedicines applied temporal association rule (TAR) mining to uncover race-based patterns in the association of specific AEs with breast cancer treatments. The study was led by Nabil Adam, co-founder & CEO of Phalcon, LLC and Professor Emeritus at Rutgers University, and Robert Wieder, Professor of Medicine at the New Jersey Medical School and the Cancer Institute of New Jersey, Rutgers University. The investigators used the Surveillance, Epidemiology, and End Results (SEER)–Medicare dataset, a comprehensive source of longitudinal data that links cancer incidence records from the National Cancer Institute’s SEER program with Medicare claims data. These data contain detailed information on cancer diagnoses, treatments, and outcomes for patients aged 65 and older. The authors included women diagnosed with stage I-IV breast cancer and no history of other malignancies, following National Cancer Institute clinical trials standards, to ensure a study population that is representative of the general Medicare patient population for older adults.

To uncover the associations between treatments and adverse events, the investigators applied TAR mining using the FPGrowth algorithm, which allowed them to analyze the temporal progression of treatments and the resulting adverse events. The FPGrowth algorithm can handle large and complex datasets: it compresses the data into a frequent-pattern tree and mines frequent patterns from it without the need for candidate generation. The authors categorized treatments into 46 comprehensive mechanistic categories, including chemotherapy, biotherapy, and hormone therapy drugs, and consolidated adverse events from ICD-9 codes into 18 categories, which facilitated a detailed analysis of the temporal associations between treatments and adverse events. Their analysis showed significant race-based differences in the associations between specific treatments and adverse events. The administration of chemotherapy, biotherapy, and immunotherapy drugs was associated with different adverse events in AA patients than in W patients.
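To make the idea of a temporal association rule concrete, the following is a minimal Python sketch, not the authors' pipeline. It swaps in plain brute-force counting in place of FPGrowth, and everything in it is an assumption made for illustration: the toy records, the group labels `G1` and `G2`, the 30-day window, the thresholds, and the function name `mine_temporal_rules`. A rule "treatment → adverse event" is counted for a patient when the adverse event occurs within the window after the treatment. Support is the share of patients in a group for whom the rule holds, and confidence is the share of treated patients in that group for whom it holds.

```python
from collections import defaultdict

# Fabricated toy records, for illustration only:
# (patient_id, group, day, kind, item)
EVENTS = [
    ("p1", "G1", 0, "treatment", "taxane"),
    ("p1", "G1", 10, "ae", "neutropenia"),
    ("p2", "G1", 0, "treatment", "taxane"),
    ("p2", "G1", 40, "ae", "neutropenia"),  # outside a 30-day window
    ("p3", "G1", 0, "treatment", "taxane"),
    ("p3", "G1", 5, "ae", "nausea"),
    ("p4", "G2", 0, "treatment", "taxane"),
    ("p4", "G2", 7, "ae", "nausea"),
    ("p5", "G2", 0, "treatment", "her2_antibody"),
    ("p5", "G2", 14, "ae", "nausea"),
]


def mine_temporal_rules(events, window_days=30, min_support=0.2,
                        min_confidence=0.3):
    """Return {group: [(treatment, ae, support, confidence), ...]}.

    A rule holds for a patient if the adverse event occurs 0..window_days
    days after the treatment. Brute-force counting, not FPGrowth.
    """
    # Collect each patient's treatments and adverse events with their days.
    patients = defaultdict(lambda: {"treatment": [], "ae": []})
    group_of = {}
    for pid, group, day, kind, item in events:
        patients[pid][kind].append((day, item))
        group_of[pid] = group

    n_patients = defaultdict(int)  # group -> number of patients
    treated = defaultdict(set)     # (group, treatment) -> patient ids
    followed = defaultdict(set)    # (group, treatment, ae) -> patient ids
    for pid, rec in patients.items():
        g = group_of[pid]
        n_patients[g] += 1
        for t_day, trt in rec["treatment"]:
            treated[(g, trt)].add(pid)
            for a_day, ae in rec["ae"]:
                if 0 <= a_day - t_day <= window_days:
                    followed[(g, trt, ae)].add(pid)

    # Keep only rules that pass both thresholds, reported per group.
    rules = defaultdict(list)
    for (g, trt, ae), pids in sorted(followed.items()):
        support = len(pids) / n_patients[g]
        confidence = len(pids) / len(treated[(g, trt)])
        if support >= min_support and confidence >= min_confidence:
            rules[g].append((trt, ae, round(support, 3), round(confidence, 3)))
    return dict(rules)


if __name__ == "__main__":
    for group, group_rules in mine_temporal_rules(EVENTS).items():
        for trt, ae, sup, conf in group_rules:
            print(f"{group}: {trt} -> {ae} (support={sup}, confidence={conf})")
```

Mining each group separately and then comparing the resulting rule sets is one simple way to surface group-specific patterns. At the scale of the SEER-Medicare data, an FPGrowth implementation would replace the nested loops used here.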

Professors Adam and Wieder found that the venue of care played a crucial role in the type and frequency of adverse events. The authors demonstrated that specific treatment categories, such as Her2 antibodies, bisphosphonates, and pyrimidine analogs, were associated with different adverse events in AA and W patients in different treatment venues. For example, Her2 antibodies were more likely to be associated with anemia and neutropenia in AA patients in institutional settings, while in W patients they were linked to nausea and respiratory symptoms. In addition, AA patients treated in institutional outpatient settings had higher incidences of severe adverse events, namely pulmonary embolism and severe neutropenia, than those treated in private practice settings. In contrast, W patients showed a more uniform distribution of adverse events across care venues, indicating that the healthcare setting affected AA patients more profoundly.

The researchers stratified the data by cancer stage into early-stage (I-III) and late-stage (IV) categories. Early-stage AA patients experienced higher rates of adverse events such as severe neutropenia and thrombophilia when treated with taxanes and anthracyclines than their W counterparts. Among late-stage patients, the disparities were even more pronounced, with AA patients having much higher rates of adverse events, including severe anemia and respiratory complications.

Furthermore, to validate the TAR mining approach, the authors compared the predicted treatment/adverse event associations with actual clinical data. They found a high degree of overlap between the predicted and actual treatment/AE associations, which confirmed the accuracy and relevance of the mined rules. For instance, the predicted associations for taxanes and Her2 antibodies matched the actual observed adverse events (nausea, neutropenia, and respiratory symptoms).

In conclusion, the research work of Professors Nabil Adam and Robert Wieder uncovered temporally relevant patterns of treatment-related adverse events that were previously difficult to detect. Their use of TAR mining identified specific treatment-adverse event associations that vary by race, stage of disease, and venue of care. These findings will be of high value to clinicians, who can use the authors’ data to better stratify patients based on their risk of severe adverse events. For example, the knowledge that AA patients are more likely to experience severe neutropenia with certain chemotherapies allows for closer proactive monitoring and management of these patients. Moreover, oncologists can use the findings to better communicate with patients about the potential risks associated with their treatment plans, which can help patients be more informed about their care.

Additionally, the study provides evidence that can be used to advocate for changes in clinical practice guidelines and healthcare policies to address racial disparities in breast cancer treatment. Such changes could include recommendations for more intensive monitoring of AA patients or adjustments to standard treatment protocols based on patient demographics, helping to ensure that high-risk populations receive the support and intervention they need to manage adverse events.

About the author

Dr. Nabil Adam, Co-founder & CEO, Phalcon, LLC and Professor Emeritus, Rutgers University. He has extensive experience in healthcare administrative databases and other data repositories, including the NCI Surveillance, Epidemiology and End Results (SEER) program, the SEER-Medicare database, the Medicaid Analytic eXtract Dataset, and the Nationwide Inpatient Sample of the Healthcare Cost and Utilization Project. He led a technology team that designed and deployed an innovative industrial-strength knowledge management product, the Universal Integrator™, which targeted the pharmaceutical and healthcare industry and specialized in integrating, synthesizing, and analyzing knowledge across distributed heterogeneous information sources. In 2008, a major data provider acquired the product suite. Dr. Adam led a team that was among the top 25 innovators, out of over 300 applicants, to advance to stage 1 of the Centers for Medicare & Medicaid Services (CMS) 2019 “CMS Artificial Intelligence Health Outcomes Challenge.” According to the 2020 Stanford University report, Dr. Adam ranked among the top 2% of scholars worldwide for impact in his field (AI and image processing). His research has been supported by over $23 million in grants/contracts from several federal and state agencies as well as private organizations.

About the author

Dr. Robert Wieder, Professor of Medicine. He is a Medical Oncologist with 29 years of experience in practice and clinical trials and a noted investigator. Dr. Wieder trained at the NIH and Memorial Sloan Kettering in gene therapy and cancer signaling. As faculty at New Jersey Medical School, he conducted basic and translational investigations in breast cancer dormancy and the roles of retinoids and deltanoids in cancer therapy, and was the principal investigator of a Minority-Based Community Clinical Oncology Program. Dr. Wieder served on the NCI Breast Cancer Steering Committee.

The two investigators have been collaborating on predicting outcomes of underserved patients with breast cancer and have received support for using deep learning to predict adverse events and outcomes from cancer therapy.


Adam N, Wieder R. Temporal Association Rule Mining: Race-Based Patterns of Treatment-Adverse Events in Breast Cancer Patients Using SEER-Medicare Dataset. Biomedicines. 2024 May 29;12(6):1213. doi: 10.3390/biomedicines12061213.
