A nomogram for predicting the HER2 status in female patients with breast cancer in China: a nationwide, multicenter, 10-year epidemiological study

Background The concordance rate of human epidermal growth factor receptor 2 (HER2) status between core needle biopsy (CNB) and subsequent excisional biopsies of the same tumor varies from 81 to 96%, which may cause inappropriate neoadjuvant therapy that impair the potential benefit from HER2 targeted therapy for patients. This study aimed to establish a nomogram to predict the HER2 status pre-operatively as an auxiliary diagnosis to CNB assessment. Methods Among 4211 breast cancer patients cataloged in the Nation-wide Multicenter 10-year Retrospective Clinical Epidemiological Study of Breast Cancer in China, 2291 patients with complete relevant information were included in this study, which were further randomized 3:1 and divided into a training set and a validation set. The nomogram was established based on independent predictors of HER2 positivity recognized by logistic regression analysis and further validated internally and externally. Results The multivariate logistic regression analysis showed that T-stage, N-stage, estrogen receptor (ER) status, progesterone receptor (PR) status were independent predictors for HER2 status. The nomogram was thereby constructed by those independent predictors as well as histology type. The areas under the receiver operating characteristic curve (AUC) of the training set and the validation set were 0.636 and 0.681, respectively. The calibration plots demonstrated good fitness of the nomogram for HER2 status prediction. With the optimal cutoff value, the nomogram yielded 80.0% sensitivity, 43.1% specificity in the training set and 81.1% sensitivity, 49.8% specificity in the validation set. Conclusions The present nomogram can provide valuable information on HER2 status and combined with standard CNB assessment, clinicians could make more appropriate decision on neoadjuvant therapy of breast cancer. Electronic supplementary material The online version of this article (10.1186/s13000-019-0806-4) contains supplementary material, which is available to authorized users.


Background
Breast cancer has been reported to be the most common malignant tumor and the leading cause of cancer death among women worldwide [1]. The utilization of neoadjuvant therapy (preoperative systemic therapy) provide the opportunity to monitor response during treatment, and help to increase the curative intervention as well as the breast conservation rate by reducing tumor burden [2]. The roles of CNB have been well established as an important preoperative diagnostic method for breast lesions. CNB is less invasive than excision biopsy and provide more information than fine needle aspiration (FNA). In the case of neoadjuvant therapy of breast cancer, CNB is the gold standard for pathological diagnosis and molecular subtype assessment.
Emerging data show that neoadjuvant chemotherapy combined with HER2-targeted therapy yields a higher pathologic complete response (pCR) rate and favorable disease-free and overall survival when compared with neoadjuvant chemotherapy alone in women with HER2-positive breast cancers [3][4][5][6][7][8]. The increasing number of HER2 positive breast cancer patients being treated in the neoadjuvant setting gives rise to the need to accurately assess HER2 status on the CNB material that often is the only tissue available before treatment in these patients. However, due to intratumoral heterogeneity, the concordance rate of HER2 status between CNB and subsequent excisional biopsies of the same tumor varies from 81 to 96% as reported by different institutions [9][10][11][12].
In order to improve the accuracy of HER2 assessment in CNB specimens so that patients could get appropriate neoadjuvant therapy, we intend to identify possible predictors based on a nation-wide multicenter data spans 10 years in China and construct a nomogram for predicting HER2 status, which will increase the accuracy of HER2 assessment pre-operatively. In combination with CNB and nomogram prediction, it may provide more accurate information on whether the HER2 targeted therapy is needed in the neoadjuvant setting.

Study design and data collection
Data was obtained from the hospital-based, multicenter, 10-year (1999-2008), retrospective study of randomly selected pathology confirmed primary female breast cancer cases via medical chart review which was approved by the ethics committee of the Cancer Fig. 1 Flow diagram of the study design. A total of 2291 patients with complete relevant information were enrolled in this study and were randomized 3:1 and divided into a training set (N = 1718) and a validation set (N = 573) Foundation of China. In order to obtain the population, China was stratified into 7 geographic regions according to the traditional administrative district definition (north, northeast, northwest, middle, east, south and southwest); these regions extend over the majority of the country and represent different levels of breast cancer burden (Additional file 1: Figure S1) [13]. Of these patients, 4211 patients with a median age of 48 years (range: 21-86 years) were enrolled in this study. The age distribution of these patients conformed to 1999-2008 population-based breast cancer incidence data retrieved from the National Central Cancer Registry database (Additional file 1: Figure S2) [14]. One university hospital with good standard quality from each region was selected to provide the required study cases. Hospital records were reviewed by local clerks within each hospital according to the designated protocol. Further methodological details including patients selection, pathologic diagnostic criteria, data collection and quality control can be found in our previously published papers [13,15].
Among overall population of 4211 patients, 2291 patients with complete relevant information (age, BMI, location of lesions, local infiltration, T-stage, N-stage, ER, PR, HER2 and histologic type) were included in this study, which were further randomized 3:1 and divided into a training set (N = 1718) and a validation set (N = 573).
In this study, IHC method was used to determine the HER2 status, scores of 0 and 1+ be regarded as HER2-negative and that HER2 scores of 3+ be considered as HER2-positive. HER2 score of 2+ is regarded as HER2-borderline and HER2-borderline cases were excluded in our cohort due to low utilization rate of FISH assay during 1999-2008 in China.

Nomogram construction and validation
To develop a well-calibrated nomogram for predicting the probability of HER2 positivity, univariate as well as multivariate logistic regression analyses were performed to screen the predictors for HER2 positivity. Independent predictors (P < 0.05 in the multivariate logistic regression analysis) as well as clinical significant predictors were included in the nomogram construction. The Hosmer and Lemeshow test was applied to assess goodness of fit of the model, and P > 0.05 indicated a good fit. The odds ratios (ORs) and 95% confidence intervals (CIs) were also calculated.
The nomogram was validated internally in the training set and externally in the validation set. A receiver-operating characteristic (ROC) curve was used to evaluate the effectiveness of the nomogram. Area under the curve (AUC) was calculated. The AUC ranged from 0 to 1, with 1 indicating perfect concordance, 0.5 indicating no better than chance, and 0 indicating discordance. Statistical differences between different AUCs were investigated by the DeLong method. The calibration plot with bootstrapping was used to illustrate the association between the actual probability and the predicted probability. All reported P values are two-sided. The statistical analysis was carried out using SPSS (version 19.0; SPSS Company, Chicago, IL) and R software version 3.4 (http://www.r-project.org).

Clinicopathological features
There were 4211 breast cancer patients cataloged in the Nation-wide Multicenter 10-year Retrospective Clinical Epidemiological Study of Breast Cancer in China. After excluding 1920 patients with incomplete relevant information, 2291 eligible patients were included in the study which were randomized 3:1 and divided into a training set (N = 1718) and a validation set (N = 573) (Fig. 1). The clinicopathological features were similar between the training set and the validation set (Table 1).

Predictors for HER2 positive breast Cancer
We used logistic regression as univariate and multivariate analysis to evaluate the risk factors for HER2 positive breast cancers in the training set. T-stage, N-stage, ER and PR were risk factors for HER2 positive breast cancer in the univariate analysis, and these variables were further confirmed as independent risk factors in multivariate analysis (

Construction and validation of the nomogram
Based on the results of the multivariate analysis, a nomogram was constructed with independent predictors for HER2 status including T-stage, N-stage, ER status and PR status as well as histology (Fig. 2). A total points is the sum of points for each variable (top plotting scale) and the probability of HER2 positivity is the corresponding number of the total points in the nomogram (bottom plotting scale). The P-value of the Hosmer and Lemeshow test was 0.055, indicting a good statistical fit. The ROC curve analysis was performed to validate the nomogram internally in the training set (Fig. 3a) and externally in the validation set (Fig. 3b). In the training set, the AUC was 0.636 (95% CI: 0.607-0.665). In the validation set, the AUC was 0.681 (95% CI: 0.631-0.731). There's no statistical difference between two AUCs (P = 1). The calibration plot (Fig. 4) indicated that the nomogram was well calibrated.

The application of the nomogram
To display the application of the nomogram, we took ten patients who were tested for HER2 status in tissues from both the core needle biopsy and the excision as examples (Table 5). Patients 1-4 have concordant HER2 status in CNB specimens, excisional specimens and nomogram prediction. Patients 5 and 6 were HER2 Fig. 2 A nomogram to predict the probability of HER2 positive breast cancer positive in the CNB specimens, but HER2 negative in the excisional specimens which were concordant with our prediction done by the nomogram. Patient 7 was HER2 negative in the CNB specimen, but HER2 positive in the excisional specimen which was concordant with our prediction done by the nomogram. Patients 8-10 were HER2-borderline in the CNB specimens, by using the nomogram, prediction on HER2 status were consistent with the result in excisional specimens.

Discussion
In our current research, we first randomized the population containing 2291 patients with completed relevant information into a training set and a validation set. Based on the multivariate logistic regression analysis, we identified independent variables for predicting the HER2 status. Patients who were ER positive, PR positive were less likely to be HER2 positive than those who were ER negative, PR negative. Patients who were of T1 stage were more likely to be HER2 positive than those who were of T2, T3 and T4 stage. Patients with N2, N3 stage were more likely to be HER2 positive than those with N1 stage. Histology type did not show statistically significance in the multivariate logistic regression analysis. Next, we constructed a nomogram on the basis of these predictors as well as clinical significant predictors (histology). The AUCs in the training (internal validation) and validation sets (external validation) were 0.636 and 0.681, respectively.
Tissue diagnosis of breast cancer by CNB before proceeding to any kind of treatment is the gold standard of modern medical practice according to clinical practice guidelines. The accuracy of histologic identification by CNB is superior to the cytological diagnosis by fine needle aspiration and CNB specimens could provide more tissue for further immunohistochemical (IHC) assessment if neoadjuvant therapy is indicated. Accordingly, physicians can discuss the treatment plan with the patient preoperatively and give her the chance of one-step operation or perform the neoadjuvant treatment based on the assessment of ER, PR, HER2 and Ki67 status in the CNB samples. In cases of pCR, this will be the only source for the evaluation of breast cancer molecular subtype. It can also determine the molecular subtype more accurately than the IHC exam of an excised tumor with partial response to neoadjuvant chemotherapy.
Data from clinical trials have shown that patients with HER2 positive breast carcinomas have significantly better responses (more frequently obtaining pathologic complete response and greater percent disease-free survival) when treated with HER2 targeted therapy simultaneously with neoadjuvant chemotherapy than with neoadjuvant chemotherapy alone [17]. With the increasing use of neoadjuvant therapies, clinicians require accurate information on HER2 status at the time of CNB as false negative HER2 result might impair the potential benefit from HER2 targeted therapy for patients. Concordance rate of HER2 status between CNB and subsequent excisional biopsies of the same tumor varies from 81 to 96% as reported by different institutions [9][10][11][12]. The disconcordance between CNB and excisional biopsy were mainly due to intratumoral heterogeneity. Studies showed that to improve the accuracy of HER2 status as well as other important information from CNB, a minimum of four cores is required [18]. And when comparing 14-, 16-, 18-gauge needles, the accuracy rose with needles of increasing size [19]. These results suggested that diagnostic accuracy of CNB increased with the increase of harvested specimens. However, improving the accuracy of HER2 status in CNB specimens by increasing the cores and needle size is invasive and may cause more complications after CNB, so we developed noninvasive method -a nomogram to predict the HER2 status in the entire tumor prior to the surgery. If the HER2 status from CNB specimens is different from the nomogram predicted, then additional CNB may be needed to verify the exact status of HER2, in order to avoid inappropriate neoadjuvant therapy caused by false negative assessment of HER2.
In our study, we analyzed the common clinicopathological features including age, BMI, location of tumor, T-stage, local infiltration, N-stage, histology type, ER expression and PR expression in HER2-positive and -negative cohort, and found that ER negative, PR negative, T1 stage and N2/N3 stage were independent risk factors for HER2 positive breast cancer. Traina et al. demonstrated that HER2 overexpression was significantly correlated with negative hormone receptor (HR) status, positive nodal status and G3 tumor grade based on data from 1355 Italian breast cancer patients [20]. Study from Morocco which included 1508 patients found that overexpression of HER2 was associated with high tumor grade, vascular space invasion and ER negativity significantly [21]. Another study from a single center in China suggested that ER status, PR status and tumor grade were significantly associated with HER2 status [22]. Different studies from different institution indicated the same finding that HER2 overexpression is correlated with HR negativity, while the relationship between HER2 expression and other clinicopathologic parameters varies, such as tumor dimension, nodal involvement and tumor grading.
The nomogram, a simple graphical prediction tool, allows oncologists to assess the predictive risk of individuals [23]. And it's been considered as an important component of modern medical decision making [24]. Another advantage of the nomogram is that it is noninvasive. So far, this is the first nomogram constructed to predict HER2 status based on nation-wide multicenter data in breast cancer. The nomogram we constructed  here is not aimed to replace the molecular test of HER2, but to prevent the false negative of HER2 due to CNB sample limitation. According to the Youden's method [16], the optimal cutoff value of the training and validation set in this study were 0.212 and 0.204, at which the sensitivity were 80 and 81.1% while the specificity were only 43.1 and 49.8%, respectively. As we can't have both high sensitivity and high specificity, higher sensitivity is what we more needed considering that the purpose of our nomogram is to prevent false negativity. By using the optimal cutoff value and the nomogram, we predicted HER2 status correctly in several patients of different TNM stage and HR status. Since there're many rural areas in China that do not have sufficient resources and medical insurance coverage, our prediction model has practical value. For patients who cannot afford the FISH assay and trastuzumab therapy, our model will be helpful for predicting their HER2 status. If a patient with IHC-determined HER2-borderline disease were predicted to be HER2-positive and that patient could not afford trastuzumab, a stronger chemotherapy regimen, e.g., dose-dense AC-T, could be considered as an alternative to TC regimen. One major limitation of our study was that we couldn't analyze other important pathological parameters such as tumor grade and Ki67 due to incomplete information. The database consisted of breast cancer patients diagnosed during 1999 to 2008 when the pathologic diagnosis of breast cancer in China was rapidly developing and the parameters may vary in different institution at that time, so the tumor grade, Ki67 and other parameters which are proved of prognostic importance were not included in this retrospective study. Further effort is required to improving the database by adding new cases with complete information, so that we may find more risk factors for HER2 positive breast cancer and the nomogram may be adjusted to be more accurate, hence, patients could obtain more benefit from this nomogram by given the right therapy regimens.

Conclusions
The disconcordance of HER2 status between CNB and excisional biopsies of the same tumor may cause inappropriate neoadjuvant therapy in women with HER-2 positive breast cancer. Based on the nation-wide multicenter data spans 10 years in China, we found that ER negative, PR negative, T1 stage and N2/N3 stage were independent risk factors for HER2 positive breast cancer. We further establish the nomogram for HER2 status prediction which is validated both internally and externally. The nomogram could be a valuable tool for improving the accuracy of HER2 assessment pre-operatively. By combining CNB and the nomogram, clinicians could get more information of one individual patient and provide more suitable treatment accordingly.

Additional file
Additional file 1: Figure S1. Geographic distribution of sites included in the study 1 . Figure S2