Introduction
There were approximately 2.1 million new cases of female breast cancer globally in 2018, accounting for 25% of cancer cases in women [
1]. Long-term survival has improved for women over the past 30 years as advances in cancer therapy have resulted in reduced cancer-specific mortality. Consequently, mortality from other causes has become more important [
2], with cardiovascular disease (CVD) being the leading cause of death in older women who survive breast cancer [
3]. This is partly due to the effects of cytotoxic chemotherapies and radiotherapy, which are associated with an increase in cardiovascular morbidity and mortality [
4‐
7]. In this paper, we focus on coronary artery disease (CAD), the most common type of CVD. Particularly for long-term survivors at higher CAD risk due to risk factors unrelated to their cancer and cancer therapy, adverse effects of therapy are likely to accumulate and thus become relatively more important. It seems likely that risk factors associated with CAD in the general population will also be associated with CAD in cancer survivors, but empirical evidence is needed, particularly in those treated with chemotherapy or radiotherapy. Multiple lifestyle and environmental risk factors have well-established CAD associations including smoking, body mass index (BMI), total cholesterol, type 1 and type 2 diabetes, and hypertension [
8].
Inherited genetic variation is also known to affect risk: genome-wide association studies (GWAS) have identified many common genetic variants associated with CAD, and polygenic risk scores (PRS) have been shown to provide useful CAD risk discrimination [
9‐
11]. Polygenic risk scores are an aggregation of genomic variant information and GWAS-derived weights reflecting magnitude of association for a condition of interest [
12]. The motivation behind using a PRS is based on the common variant-common disease hypothesis, where much of the genetic risk for common adult-onset diseases can be attributed to the cumulative effect of many common variants with small effect sizes rather than rare variants with large effect sizes [
13]. Within research assessing the clinical utility of polygenic risk scores, most of the evidence appears to come from studies of CAD. While there is not yet consensus on the clinical utility of CAD PRS [
14‐
17], CAD PRS is potentially poised to add accuracy to clinical risk predictions, define populations who would most benefit from statin prescriptions, and estimate lifetime risk trajectories [
18]. Whether existing CAD PRS are as predictive in non-European populations remains an open question, although some research has sought to validate existing PRS in a cohort of South Asian participants [
19]. There are currently no studies quantifying the performance of CAD PRS for risk prediction in breast cancer survivors or examining whether polygenic risk scores interact with oncotherapy for breast cancer. Polygenic risk scores in combination with other risk factors may be useful in identifying women with breast cancer in whom the adverse effects of treatment may outweigh the benefits. The aim of this study was to evaluate the association between a published coronary artery disease polygenic risk score [10] and incident CAD outcomes in a cohort of women with breast cancer.
Methods
Study cohort
The Studies in Epidemiology and Research in Cancer Heredity (SEARCH) cohort is a population-based prospective study based in the Eastern Region of England, which was served by the East Anglian Cancer registry until 2002 and the Eastern Region Cancer Intelligence Unit from 2002 to 2016. Recruitment of patients was conducted from June 1996 to December 2016. Incident breast cancer cases were all cases diagnosed under the age of 70 years from July 1996 to December 2016. Patients completed a self-administered questionnaire upon recruitment, which included questions about personal information, reproductive history, and other medical history. Tumor characteristics were obtained from the national cancer registry. Follow-up was ascertained through death registration with the most recent update provided by Public Health England on May 31st, 2020. This provides the causes of death recorded on parts 1 and 2 of the death certificate. The SEARCH dataset was restricted to female breast cancer cases who had complete genotype information (n = 12413) for this study. The final analytic sample contained 8946 participants after removing those of non-European ancestries (n = 15) and those who experienced an event before diagnosis (n = 3452).
Linkage of the SEARCH cohort to Hospital Episode Statistics (HES) data was used to identify incident CAD events. HES data comprise a record for each finished consultant episode (FCE), which is a period of care for a patient under a single consultant at a single hospital [
20]. Diagnoses coded for each FCE include all diagnoses noted in the clinical record. Variables of interest included the time (years) between diagnosis and hospital admission and the ICD-10 diagnosis code. The recorded episode time, admission time, or operation time elapsed since diagnosis with breast cancer in HES was considered the time of the event. For individuals with multiple records in which CAD was one of the clinical diagnoses, the earliest time to event was used as the analytical time to event. Prevalent disease at baseline was defined as an event occurring before diagnosis (encoded as negative time), and individuals with prevalent disease were excluded from the analysis.
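The event-time rule described above can be sketched in a few lines of Python. This is an illustrative reconstruction rather than the code used in the study: the record layout, the function names, and the ICD-10 prefixes are assumptions made for the example (the actual code list is given in Additional file 1: Table S1).

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical ICD-10 prefixes, for illustration only; the study's actual
# code list is in its Additional file (Table S1).
CAD_PREFIXES = ("I20", "I21", "I22", "I23")

Record = Tuple[str, float, str]  # (patient_id, years_since_diagnosis, icd10_code)


def first_cad_times(records: Iterable[Record]) -> Dict[str, float]:
    """Return the earliest CAD-coded time (years since diagnosis) per patient."""
    earliest: Dict[str, float] = {}
    for pid, years, code in records:
        if not code.startswith(CAD_PREFIXES):
            continue  # not a CAD diagnosis under the assumed code list
        if pid not in earliest or years < earliest[pid]:
            earliest[pid] = years
    return earliest


def split_prevalent(earliest: Dict[str, float]) -> Tuple[Dict[str, float], Set[str]]:
    """Separate incident events (time >= 0) from prevalent disease (time < 0).

    A patient whose earliest CAD record precedes diagnosis is treated as
    prevalent and dropped, even if later CAD records exist.
    """
    prevalent = {pid for pid, t in earliest.items() if t < 0}
    incident = {pid: t for pid, t in earliest.items() if t >= 0}
    return incident, prevalent


toy_records = [
    ("a", 2.5, "I21.0"),
    ("a", 1.0, "I20.0"),   # earlier CAD record for the same patient
    ("b", -3.0, "I21.9"),  # CAD before breast cancer diagnosis
    ("b", 4.0, "I21.9"),
    ("c", 1.2, "J18.9"),   # non-CAD diagnosis, ignored
]
incident, prevalent = split_prevalent(first_cad_times(toy_records))
print(incident, prevalent)
```

In this toy input, patient "a" contributes an event at 1.0 years, patient "b" is excluded as prevalent, and patient "c" has no CAD event.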
Genotype data
A total of 12413 individuals from the SEARCH cohort were genotyped in two batches: batch I was genotyped on the Illumina Infinium iCOGS array (n = 8404) and batch II on the Illumina Infinium OncoArray (n = 4009). Both chips provide genome-wide coverage of common variants with 211115 SNPs on the iCOGS array [
21] and 533631 SNPs on the OncoArray [
22]. Genotyping QC was performed as previously described [21, 22]. Genotypes were then phased using SHAPEIT and imputed into the 1000 Genomes Project reference panel (version 3) using IMPUTE version 2 for iCOGS and OncoArray.
Calculating PRS and quality control
The polygenic risk score (PRS) used in this study was derived by Inouye et al. and is called metaGRS (henceforth referred to as PRS), which consists of approximately 1745180 variants (a detailed description of its derivation can be found in their Additional file) [
10]. The set of SNPs and their corresponding weights for PRS were taken from the Polygenic Score Catalogue, which is an open database of published polygenic risk scores [
23]. The PRS was calculated as a weighted sum of the effect alleles carried, using the imputed allele dosages weighted by the published SNP effect sizes (log relative risk). Scores for each individual were generated using PLINK 2.0 software [
24]. SNPs with imputation quality scores of less than 0.3 and ambiguous strand SNPs (A/T and G/C pairs) were excluded. Multi-allelic SNPs with only two common alleles were treated as bi-allelic. All scores were standardized to zero-mean and unit variance.
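As an illustration of the scoring steps described above, the following Python sketch applies the two exclusion rules (imputation quality below 0.3, strand-ambiguous A/T and G/C pairs), sums dosage times weight, and standardizes the result. The study itself used PLINK 2.0; the data layout, field names, and function names here are assumptions made for the example, and the toy inputs are not study data.

```python
import math
from typing import Dict, List, Sequence

# Allele pairs whose strand cannot be resolved from the alleles alone.
AMBIGUOUS_PAIRS = {frozenset("AT"), frozenset("CG")}


def is_strand_ambiguous(allele1: str, allele2: str) -> bool:
    return frozenset((allele1.upper(), allele2.upper())) in AMBIGUOUS_PAIRS


def keep_variant(variant: Dict, min_info: float = 0.3) -> bool:
    """Apply the two exclusion rules: low imputation quality, ambiguous strand."""
    return variant["info"] >= min_info and not is_strand_ambiguous(
        variant["effect_allele"], variant["other_allele"]
    )


def raw_scores(dosages: Sequence[Sequence[float]], variants: List[Dict]) -> List[float]:
    """Weighted sum of effect-allele dosages (0 to 2) over retained variants.

    dosages[i][j] is the imputed dosage of variant j in individual i.
    """
    kept = [j for j, v in enumerate(variants) if keep_variant(v)]
    return [sum(row[j] * variants[j]["weight"] for j in kept) for row in dosages]


def standardize(scores: Sequence[float]) -> List[float]:
    """Rescale scores to zero mean and unit (sample) variance."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))
    return [(s - mean) / sd for s in scores]


toy_variants = [
    {"effect_allele": "A", "other_allele": "G", "info": 0.9, "weight": 0.1},
    {"effect_allele": "A", "other_allele": "T", "info": 0.9, "weight": 0.5},   # ambiguous, dropped
    {"effect_allele": "C", "other_allele": "T", "info": 0.2, "weight": 0.3},   # low quality, dropped
    {"effect_allele": "G", "other_allele": "T", "info": 0.5, "weight": -0.2},
]
toy_dosages = [[2, 2, 2, 0], [0, 1, 1, 1], [1, 0, 0, 2]]
print(standardize(raw_scores(toy_dosages, toy_variants)))
```

Standardizing within the analytic sample is what allows hazard ratios to be reported per 1 SD of PRS.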
CAD events
Incident coronary artery disease events were defined as a composite endpoint of unstable angina, myocardial infarction, or death due to complications following myocardial infarction according to the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10) (Additional file 1: Table S1). This composite endpoint was chosen to maximize the number of incident cases, and no differential effects were observed between predictor variables and different definitions of incident CAD events (Additional file 1: Table S2).
Statistical analyses
All statistical analysis was performed using R 4.0.0 [
25]. We investigated the association between PRS and the composite primary endpoint of the first incident coronary event using cause-specific Cox proportional hazards regression. We identified the presence of competing risks of non-CAD death (Additional file
1: Figure S1) and thus performed Cox regression treating competing events as censored [
26]. In the same vein, cumulative incidence curves are presented instead of Kaplan–Meier curves because Kaplan–Meier curves are known to give upward-biased incidence estimates in the presence of competing risks [
26]. Time zero was the date of diagnosis, with patients entering the at-risk cohort at the date of study enrolment (left truncation). Follow-up ended at the first occurrence of a CAD event; participants were right censored at death from a cause other than coronary artery disease or at last follow-up. Schoenfeld residuals for the variables used in modelling were assessed against time for any significant departure from the proportional hazards assumption using the “cox.zph” function in the survival package [
27]. A Wald test was performed to assess whether failure events were independent of left truncation [
28]. Regression models were sequentially adjusted, first using only continuous PRS as the main exposure variable adjusted for age at diagnosis (years, continuous), genotype assay (OncoArray, iCOGS) and eight genetic principal components (PCs), and then including sequential adjustments for conventional risk factors: BMI (kg/m², continuous), smoking status (never, past, current); sociodemographic variables: drinking status (past, current), education level (below GCSE, GCSE, A-level, graduate), index of multiple deprivation (IMD) (continuous); medical variables: age at menarche (years, continuous), thyroid disease (binary), parity (ordinal), hormone replacement therapy (binary); and oncotherapy variables: chemotherapy (binary), radiotherapy (binary), and hormone therapy (binary). Note that we did not have baseline data on blood pressure, cholesterol, lipid-lowering medications, diabetes, or family history.
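To make the competing-risks point concrete, the sketch below computes a nonparametric cumulative incidence function (an Aalen-Johansen-type estimator) alongside the naive one-minus-Kaplan–Meier estimate that treats competing deaths as censored, allowing for delayed entry. It is a plain-Python illustration under stated assumptions, not the R code used in the study; the status coding (0 = censored, 1 = CAD, 2 = non-CAD death), the function name, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def cumulative_incidence(
    entry: Sequence[float],
    exit_time: Sequence[float],
    status: Sequence[int],
    cause: int = 1,
) -> List[Tuple[float, float, float]]:
    """Return (time, CIF for `cause`, naive 1 - KM) at each observed event time.

    status: 0 = censored, 1 = event of interest, 2 = competing event.
    A subject is at risk at time t if entry < t <= exit_time (left truncation).
    Assumes entry < exit_time for every subject.
    """
    rows = list(zip(entry, exit_time, status))
    event_times = sorted({x for _, x, s in rows if s != 0})
    overall_surv = 1.0  # probability of no event of any cause before t
    km_naive = 1.0      # Kaplan-Meier treating competing events as censored
    cif = 0.0
    out = []
    for t in event_times:
        at_risk = sum(1 for e, x, _ in rows if e < t <= x)
        if at_risk == 0:
            continue
        d_cause = sum(1 for e, x, s in rows if x == t and e < t and s == cause)
        d_all = sum(1 for e, x, s in rows if x == t and e < t and s != 0)
        cif += overall_surv * d_cause / at_risk
        km_naive *= 1 - d_cause / at_risk
        overall_surv *= 1 - d_all / at_risk
        out.append((t, cif, 1 - km_naive))
    return out


# Toy data: four subjects entering at time 0.
curve = cumulative_incidence(
    entry=[0, 0, 0, 0], exit_time=[1, 2, 3, 4], status=[1, 2, 1, 0]
)
for t, cif, naive in curve:
    print(t, round(cif, 3), round(naive, 3))
```

In the toy data the naive estimate ends higher than the cumulative incidence, because subjects who die of other causes are treated as if they could still go on to have a CAD event.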
The models were fit to the same subsample of cohort participants with increasingly more complete covariates. This allowed a more consistent comparison of the impact of adjustments and reduced the potential for selection bias that could arise from restricting the analysis at the outset to participants with the most complete information on adjustment covariates.
We additionally assessed possible variation of the association of PRS with CAD according to smoking status and BMI level based on interaction tests. We also assessed the incremental improvement in CAD risk prediction from the addition of PRS to models including combinations of age, BMI, smoking, and other baseline covariates.
We calculated the net reclassification improvement (NRI) and integrated discrimination improvement (IDI) using the nricens package to explore the potential clinical utility of PRS in women with breast cancer. More details about these calculations can be found in the Additional file. All confidence intervals are shown at the 95% level. All p values are 2-tailed.
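For readers unfamiliar with the net reclassification improvement, the following sketch shows the categorical form of the calculation on toy data. It deliberately ignores censoring, which the method used in the study accounts for, and the risk cutoffs (5% and 10%), function names, and inputs are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def risk_category(risk: float, cutoffs: Sequence[float]) -> int:
    """Index of the risk band; a risk equal to a cutoff falls in the higher band."""
    return bisect_right(cutoffs, risk)


def categorical_nri(
    risk_old: Sequence[float],
    risk_new: Sequence[float],
    event: Sequence[int],
    cutoffs: Sequence[float],
) -> Tuple[float, float, float]:
    """Return (NRI among events, NRI among non-events, total NRI).

    Events should move up in risk category; non-events should move down.
    Simplification: every subject has a known binary outcome (no censoring).
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, ev in zip(risk_old, risk_new, event):
        move = risk_category(new, cutoffs) - risk_category(old, cutoffs)
        if ev:
            n_e += 1
            up_e += move > 0
            down_e += move < 0
        else:
            n_n += 1
            up_n += move > 0
            down_n += move < 0
    nri_events = (up_e - down_e) / n_e
    nri_nonevents = (down_n - up_n) / n_n
    return nri_events, nri_nonevents, nri_events + nri_nonevents


# Toy predicted 10-year risks from a baseline model and a model with PRS added.
old = [0.04, 0.06, 0.12, 0.03, 0.07, 0.11]
new = [0.06, 0.06, 0.13, 0.02, 0.04, 0.12]
had_event = [1, 1, 1, 0, 0, 0]
print(categorical_nri(old, new, had_event, cutoffs=[0.05, 0.10]))
```

A positive component for events means the added predictor moved more cases into higher risk bands than lower ones; for non-events the sign convention is reversed.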
Discussion
Based on a large cohort of British women with breast cancer, we have provided evidence that a CAD polygenic risk score developed for the general population can be generalized to breast cancer patients, with an estimated 33% higher CAD risk per 1 SD higher PRS (HR = 1.33, 95% CI 1.20, 1.47), independent of established cardiovascular risk factors (age, smoking, BMI), oncotherapies and other variables associated with cardiovascular risk (education level). Our results support previous evidence that the association of PRS and CAD risk may operate through molecular pathways that do not overlap with those of traditional risk factors such as smoking and BMI. This is consistent with the original PRS analysis, which found only a modest attenuation for PRS when adjusting for BMI, smoking status, diabetes, hypertension, family history of heart disease, and cholesterol levels (HR: 1.58 per SD; 95% CI 1.55–1.61 unadjusted; HR: 1.48 per SD; 95% CI 1.45–1.51 adjusted) [10]. Several other studies also found only modest attenuation of CAD polygenic risk scores when adjusting for variables such as lipid treatment at baseline, cholesterol, and systolic blood pressure [9, 19, 29].
However, we note that the interaction between being a past smoker and the PRS was of borderline statistical significance. Since PRS is known to be correlated with certain conventional CAD risk factors [
10,
30], it is plausible that some fraction of incident CAD risk explained by PRS may be dependent on smoking status. While this paper is not expressly predictive in nature, we note that the addition of PRS may not have provided additional risk discrimination on top of BMI and smoking because by middle age, the genes that compose this risk score may have already exerted their influence, and thus the PRS would not be expected to add discriminatory ability.
Nonetheless, the PRS may help stratify risk in breast cancer survivors. For instance, we found an over twofold HR for CAD in a comparison of individuals in the top versus bottom one-fifth of the risk score distribution. Furthermore, when considering 10-year risk of incident CAD following breast cancer diagnosis, we found that 5.6% of lower risk participants who did not have a recorded CAD event were reclassified to a higher risk group with the addition of PRS to the baseline model (Additional file 1: Table S5). While the change in discrimination is small, this may result in meaningful risk reclassification when weighing the harms and benefits of chemotherapy in clinical decision-making. Further work is required to evaluate whether such reclassification would justify the additional cost of genotyping.
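As a rough consistency check on the quintile comparison, the per-SD hazard ratio can be converted into an approximate top-versus-bottom-fifth contrast. The sketch below assumes a normally distributed PRS and a log-linear effect across the score range, and compares the two groups at their mean scores; it is a back-of-the-envelope approximation, not the model-based estimate reported in the study, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def mean_of_top_fraction(p: float) -> float:
    """Mean of a standard normal variable above its (1 - p) quantile: pdf(z) / p."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - p)
    return nd.pdf(z) / p


def implied_hr_top_vs_bottom(hr_per_sd: float, p: float = 0.2) -> float:
    """Approximate HR comparing the mean of the top and bottom fractions.

    Assumes a standard normal score and a constant log-HR per SD.
    By symmetry, the gap between the two group means is twice the top mean.
    """
    gap_sd = 2 * mean_of_top_fraction(p)
    return math.exp(math.log(hr_per_sd) * gap_sd)


print(round(2 * mean_of_top_fraction(0.2), 2))    # gap between fifths, in SD units
print(round(implied_hr_top_vs_bottom(1.33), 2))   # implied HR, top vs bottom fifth
```

Under these assumptions the top and bottom fifths are about 2.8 SD apart, so an HR of 1.33 per SD implies a contrast of roughly 2.2, in line with the over twofold HR reported.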
We acknowledge the limitation that treatment (chemotherapy, radiotherapy, and anti-hormone therapy) was coded as dichotomous variables (whether or not a patient received each treatment). The loss of information about other treatment aspects (e.g. dose, duration, type) may have introduced measurement error that affected the associations reported in our paper. Furthermore, the apparent interaction between PRS and chemotherapy is likely explained by selection bias, whereby healthier patients are more likely to undergo chemotherapy. More granular data will be required to further assess these associations.
The role that PRS may play in breast cancer clinical care is currently unclear, but fundamentally, PRS may be used to help estimate the lifetime risk of cardiovascular disease in a breast cancer survivor. This may have two clinically useful benefits: (1) facilitating earlier detection of cardiovascular risk in breast cancer survivors, so that cardiovascular risk factors can be managed sooner and future cardiovascular risk reduced, and (2) aiding treatment decision-making when considering the cardiotoxic effects of treatment regimens.
This is especially important in breast cancer patients who face the unique challenge of needing to maximize gains from cancer treatment while also minimizing its cardiotoxic effects. More women are surviving breast cancer with an increase in 5-year survival for early stage breast cancer from 79% in 1990 to 88% in 2012 [
31] (there were an estimated 3.4 million breast cancer survivors in the US in 2015 [
32]), so cardiovascular mortality may become an increasingly important concern. Bradshaw et al. showed that there is nearly a twofold increase in the incidence of CVD for long-term breast cancer survivors around 7 years after diagnosis [
4]. Several large randomized trials have provided evidence of the association between chemotherapy, radiotherapy, hormone therapy and increased risk of cardiovascular events [
33,
34]. For instance, Darby et al. showed that the rate of CAD was proportional to the average dose of ionizing radiation during radiotherapy for breast cancer, with increases in rate continuing as long as 20 years post-exposure [
35]. This is particularly important for women diagnosed at a relatively young age who begin treatment and may then have increased risk of CVD mortality. This suggests that breast cancer survivors may benefit from a PRS assessment and should be closely monitored for development of cardiovascular risk factors following diagnosis and subsequent treatment. In our study, cumulative incidence curves of incident CAD events did not appear to be substantially different when stratified by oncotherapy status (Additional file 1: Figure S3), which suggests that more granular treatment data, such as dosage, frequency, or duration, are needed to better assess the interplay between drug cardiotoxicity and genetic cardiovascular susceptibility. PRS may help clinicians and their patients make decisions about whether the benefits of adjuvant chemotherapy and other oncotherapies outweigh the risks.
Limitations
There are some limitations in interpreting the current findings. The association between PRS and incident CAD could not be adjusted for important risk factors such as diabetes, hyperlipidaemia, family history of cardiovascular disease, and hypertension because these data were not collected. It is worth noting, however, that the PRS used in this study has been shown in other cohorts to provide additional predictive benefit over standard cardiovascular risk prediction algorithms such as the Framingham risk score, which include such metabolic risk factors [
9]. Treatment data were limited to whether the patient had received chemotherapy, radiotherapy, and anti-hormone therapy. Data on specific drugs or doses received were not available. Genotype data were available predominantly for participants of white European ancestry, which suggests the need for studies in people of other ancestries to maximize generalizability. Furthermore, the observational nature of these data limits any inference that might be drawn relating to the association between therapy and outcome.
Acknowledgements
We thank all the study participants who contributed to this study and all the researchers, clinicians and technical and administrative staff who have made this work possible. This work was supported by core funding from the UK Medical Research Council (MR/L003120/1), British Heart Foundation (RG/13/13/30194; RG/18/13/33946) and NIHR Cambridge Biomedical Research Centre (BRC-1215-20014) [*]. This work was also supported by Health Data Research UK, which is funded by the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation and Wellcome. *The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.