
Visual and digital assessment of Ki-67 in breast cancer tissue - a comparison of methods

Abstract

Background

In breast cancer (BC) Ki-67 cut-off levels, counting methods and inter- and intraobserver variation are still unresolved. To reduce inter-laboratory differences, it has been proposed that cut-off levels for Ki-67 should be determined based on the in-house median of 500 counted tumour cell nuclei. Digital image analysis (DIA) has been proposed as a means to standardize assessment of Ki-67 staining in tumour tissue. In this study we compared digital and visual assessment (VA) of Ki-67 protein expression levels in full-face sections from a consecutive series of BCs. The aim was to identify the number of tumour cells necessary to count in order to reflect the growth potential of a given tumour in both methods, as measured by tumour grade, mitotic count and patient outcome.

Methods

A series of whole sections from 248 invasive carcinomas of no special type were immunohistochemically stained for Ki-67 and then assessed by VA and DIA. Five 100-cell increments were counted in hot spot areas using both VA and DIA. The median numbers of Ki-67 positive tumour cells were used to calculate cut-off levels for Low, Intermediate and High Ki-67 protein expression in both methods.

Results

We found that the percentage of Ki-67 positive tumour cells was higher in DIA compared to VA (medians after 500 tumour cells counted were 22.3% for VA and 30% for DIA). While the median Ki-67% values remained largely unchanged across the 100-cell increments for VA, median values were highest in the first 100-200 cells counted using DIA. We also found that the DIA100 High group identified the largest proportion of histopathological grade 3 tumours (70/101, 69.3%).

Conclusions

We show that assessment of Ki-67 in breast tumours using DIA identifies a greater proportion of cases with high Ki-67 levels compared to VA of the same tumours. Furthermore, we show that diagnostic cut-off levels should be calibrated appropriately on the introduction of new methodology.

Introduction

Sustained proliferative signalling is one of the hallmarks of cancer, as proposed by Hanahan and Weinberg in 2011 [1]. The nuclear antigen detected by the Ki-67 antibody is a marker of the growth fraction of a tumour. It is expressed in the G1, S, G2 and M phases of the cell cycle, but not in the resting phase, G0. While expression levels are low in G1 and S, they peak during G2 and M [2]. In breast cancer (BC), immunohistochemical (IHC) staining of the Ki-67 antigen is commonly used in the assessment of the proliferative activity of the tumour. It can provide information on prognosis and predict response to treatment in the adjuvant and neoadjuvant settings [3,4,5,6]. High Ki-67 score is associated with poor prognosis [7] but also a good response to chemotherapy [8, 9].

In molecular subtyping of BC, Ki-67 can be used to distinguish between Luminal A-like (Ki-67 low) and HER2 negative Luminal B-like (Ki-67 high) BC subtypes [10, 11]. While Luminal A patients generally have a good prognosis and may qualify for endocrine treatment only, Luminal B patients have a poorer prognosis and will often be given chemotherapy in addition. Thus, differentiation between these two subtypes has important therapeutic value [8, 10, 12].

Although the clinical validity of the Ki-67 Proliferation Index is accepted in BC, its clinical utility is still regarded as limited and there is a lack of consensus on the appropriate number of cells to count and cut-off levels for prognostication and treatment [13]. Furthermore, inter- and intra-observer agreement in the assessment of Ki-67 is poor [14,15,16,17,18,19].

Ki-67 staining is often heterogeneous within a tumour [20, 21]. In the assessment of Ki-67 IHC, only positively stained nuclei and mitotic figures should be scored, regardless of staining intensity, and between 500 and 1000 tumour cells should be counted in hotspot areas [22, 23]. According to the International Ki67 in Breast Cancer Working Group, Ki-67 levels between 5% and 30% are subject to considerable interobserver and interlaboratory variability. They suggest that only very low (< 5%) or very high (≥ 30%) levels should be considered clinically actionable [13, 24]. To reduce inter-laboratory variation, the 14th St. Gallen International Breast Cancer Conference in 2015 proposed that each laboratory should use its in-house median value to determine cut-off values [17].

Several studies have suggested the use of automated digital image analysis (DIA) to improve reproducibility in the assessment of Ki-67. With the introduction of DIA, it should be possible to redefine interpretation algorithms for biomarker assessment for both established clinical and novel biomarkers in BC, and address the issue of inter- and intraobserver variation in the interpretation of these biomarkers [15, 18, 19, 25,26,27,28,29].

In this study we compared visual assessment (VA) and DIA of tissue sections stained for Ki-67 in a consecutive series of BCs. The aim was to identify the number of tumour cells necessary to count in each method to reflect the growth potential of a given tumour, as measured by tumour grade, mitotic count and patient outcome.

Materials and methods

Study population

The study comprises 250 BCs from a larger series of BC patients. The background population from which this series arises comprises 25,727 women born between 1886 and 1928 in Nord-Trøndelag County in Norway, who were followed for BC occurrence from 1961 to 2008. In total, 1379 cases of BC were diagnosed during follow-up, and 909 of these tumours were classified into six molecular subtypes using IHC and chromogenic in situ hybridization (CISH) as surrogates for gene expression analysis [30]. After diagnosis, all patients were followed until death from BC, or death from other causes or until December 31st, 2015 [30, 31].

In the present study, we included 250 consecutive cases of invasive carcinoma of no special type [32]. Two cases were excluded due to unsatisfactory staining, leaving 248 cases for analysis (Fig. 1).

Fig. 1

Flowchart showing an overview of the cases included in this study

Immunohistochemistry

Full-face sections 4 μm thick, mounted on SuperFrost glass slides, were retrieved from storage (-20 °C). Paraffin was removed using TissueClear and sections were rehydrated with ethanol and water. Slides were heated at 60 °C for two hours and pretreated in a PT Link Pre-Treatment Module for Tissue Specimens (Dako Denmark A/S, 2600 Glostrup, DK) with a buffer (Low pH Target Retrieval Solution K8005) at 97 °C for 20 min. The Ki-67 antibody was applied (Clone MIB1, 35 mg/L, 1:100, Dako Denmark A/S, Glostrup, Denmark) in a DakoCytomation Autostainer Plus (Dako), with 40 min incubation time. Dako REAL™EnVision™ Detection System with Peroxidase/DAB+, Rabbit/Mouse (K5007), was used for visualization.

Digital image analysis

The IHC-stained slides were scanned at 40x magnification with a resolution of 0.23 μm/pixel using a Hamamatsu NanoZoomer S360 Digital Slide scanner C13220-01 (Inter Instruments AS) at the Department of Pathology, St. Olav’s Hospital, Trondheim University Hospital, Norway. The digital images were analysed for Ki-67 protein expression using the open-source DIA software QuPath v. 0.1.2 [27].

Training of the classifier

A separate series of 19 representative cases from the main cohort was used as a training set to train a two-class object classifier in QuPath after watershed nucleus detection [27]. The tumour area was delineated manually in the QuPath software. Cell nuclei (training objects) were selected and defined as either epithelial tumour cell nuclei or other (non-tumour cell nuclei or tumour stroma cell nuclei) in the whole slide images (WSI).

In the training set, stains were digitally separated using the colour deconvolution method and the automated “Estimate stain vectors” function in QuPath [27]. Watershed cell nucleus detection was performed and optimized visually using the following settings: Optical density (OD) sum; requested pixel size 0.4 μm; background radius 8.0 μm; median filter radius 1.5 μm; sigma 1.5 μm; min/max area 10/350 µm²; threshold 0.02; maximum background intensity 3.0; and cell expansion 5 μm. Smoothing of object features (25, 50 and 100 μm) was applied. The threshold value for Ki-67 positivity (nucleus DAB OD mean) was assessed and adjusted manually to best correspond to the visual perception of Ki-67 positivity in VA. The threshold was finally set to a nucleus DAB OD mean of 0.15 for all slides.
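As an illustration only, the positivity rule described above can be sketched in Python. The function name and the example values are hypothetical and not part of the study, and treating a value exactly equal to the threshold as negative is an assumption rather than documented QuPath behaviour.

```python
def ki67_percent(dab_od_means, threshold=0.15):
    """Percentage of nuclei whose mean nuclear DAB optical density exceeds the threshold.

    dab_od_means: one value per detected tumour cell nucleus.
    threshold: 0.15 is the value reported in the text; the strict '>' is an assumption.
    """
    if not dab_od_means:
        raise ValueError("at least one nucleus is required")
    positive = sum(1 for od in dab_od_means if od > threshold)
    return 100.0 * positive / len(dab_od_means)


# Hypothetical measurements for four nuclei, two of them above 0.15.
example_percent = ki67_percent([0.05, 0.10, 0.20, 0.40])
```

With these invented values, two of the four nuclei exceed the threshold, so the function returns 50.0.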

A two-class Random Trees object classifier for detected cell nuclei (tumour cell nuclei vs. non-tumour cell nuclei) was trained using the default settings [27]. Training continued until visually acceptable classification was achieved using a 67% equally spaced train/test split, resulting in approximately 85% accuracy. This was obtained using 7514 training objects and 135 object features from the 19 annotated images in the training set. The classifier was saved and applied to the watershed nucleus detections within the manually annotated tumour areas of all 248 cases in this study.

All nuclei in the tumour were detected by running positive cell nucleus detection provided by QuPath, and then sub-classified into epithelial tumour cell nuclei and other intra-tumoural nuclei by the trained classifier. Due to the heterogeneity of BC tissue, additional annotations were subsequently added to the classifier for most of the digital images until visually acceptable discrimination between epithelial tumour cell nuclei and all other nuclei was achieved for each WSI. Examples of annotation of training objects are shown in Fig. 2.

Fig. 2

A Overview image from QuPath showing cell nucleus detection and classification. B Arrows indicate elongated stromal nucleus and lymphocyte (green); Ki-67 positive tumour cell nucleus (red) and Ki-67 negative tumour cell nucleus (blue)

Digital Ki-67 hotspot identification

The tumour area in each of the 248 full-face sections was delineated manually by an experienced breast pathologist, and this manual delineation was then used to guide digital delineation of the tumour in the WSIs in QuPath. Ki-67 positive tumour hotspot areas were identified semi-automatically: measurement heat maps were generated in QuPath by visualizing the smoothed nucleus DAB OD mean (50 μm smoothing). The heat maps were manually adjusted for each WSI to identify and annotate the area with the highest density of Ki-67 positive tumour cell nuclei (Fig. 3A-D). Areas with obvious artefacts resulting in false hotspots were manually excluded.
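The idea behind a smoothed measurement map can be sketched as follows. This is a simplified illustration, not the QuPath implementation: it assumes Gaussian weighting of neighbouring nuclei with the 50 μm value interpreted as a full width at half maximum, and the function names and coordinates are hypothetical.

```python
import math


def smoothed_values(nuclei, fwhm_um=50.0):
    """Gaussian-weighted average of DAB OD mean around each nucleus.

    nuclei: list of (x_um, y_um, dab_od_mean) tuples.
    fwhm_um: smoothing width; reading it as a FWHM is an assumption.
    """
    sigma = fwhm_um / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    result = []
    for xi, yi, _ in nuclei:
        weight_sum = 0.0
        value_sum = 0.0
        for xj, yj, vj in nuclei:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            w = math.exp(-d2 / (2.0 * sigma ** 2))
            weight_sum += w
            value_sum += w * vj
        result.append(value_sum / weight_sum)
    return result


def hotspot_centre(nuclei, fwhm_um=50.0):
    """Return the (x, y) position of the nucleus with the highest smoothed value."""
    values = smoothed_values(nuclei, fwhm_um)
    best = max(range(len(nuclei)), key=values.__getitem__)
    return (nuclei[best][0], nuclei[best][1])


# Hypothetical nuclei: a cluster of three strongly stained nuclei near the origin,
# and an isolated strongly stained nucleus next to a negative one far away.
nuclei = [(0, 0, 0.6), (5, 0, 0.5), (0, 5, 0.7), (500, 500, 0.05), (505, 500, 0.9)]
centre = hotspot_centre(nuclei)
```

In this invented example the single most intense nucleus lies outside the cluster, but smoothing places the hotspot inside the cluster. This is why a smoothed map highlights areas of dense positivity rather than individual dark objects, and also why artefacts with high OD can create false hotspots.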

Fig. 3

A HES stained WSI; B IHC stained WSI (Ki-67) with manually delineated tumour area (red); C Cell detection within the tumour area, with cells classified as tumour (blue/red) or non-tumour (green); D Measurement heat map showing Ki-67 hotspots in red

Scoring and reporting

Visual assessment

Visual assessment of Ki-67 proliferation rate was done using a brightfield microscope (Nikon Eclipse 80i) at 40x magnification. A total of 500 tumour cell nuclei (5 × 100) were counted in visually selected hotspot areas in each case, starting with the group of 100 cell nuclei which appeared to have the highest proportion of Ki-67 positive cells. The number of positive-staining tumour cell nuclei was recorded separately for each 100-cell increment counted.
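The relationship between the per-increment counts and the cumulative scores (for example VA100 to VA500) can be sketched in Python. The function name and the counts in the example are hypothetical and serve only to illustrate the arithmetic.

```python
def cumulative_ki67(positive_per_increment, cells_per_increment=100):
    """Cumulative Ki-67% after each increment of counted tumour cell nuclei.

    positive_per_increment: number of positive nuclei in each increment,
    in the order counted (highest-proliferation group first).
    """
    results = []
    positives = 0
    for i, p in enumerate(positive_per_increment, start=1):
        if not 0 <= p <= cells_per_increment:
            raise ValueError("positive count must lie between 0 and the increment size")
        positives += p
        results.append(100.0 * positives / (i * cells_per_increment))
    return results


# Hypothetical case in which positivity falls with each further 100 cells counted.
scores = cumulative_ki67([40, 35, 30, 25, 20])
```

For these invented counts the cumulative scores are 40.0, 37.5, 35.0, 32.5 and 30.0%, showing how a score based on the first 100 cells can exceed the score based on all 500.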

Digital image analysis

All cases were assessed for Ki-67 expression using the QuPath software. Once the Ki-67 tumour hotspot was identified using the measurement heat map, five areas each containing 100 tumour cell nuclei were manually delineated using the QuPath “brush tool”. Counting started in the group of 100 nuclei that, within the identified hotspot, appeared to have the highest density of positive-staining nuclei according to the heat map and continued in decreasing order of density until five sets of 100 nuclei were counted (Fig. 4).

Fig. 4

A and B Hotspot identification and delineation images from QuPath. Areas of 100 tumour cell nuclei ordered from the area with the highest proportion of Ki-67 positive tumour cell nuclei (1) to the lowest (5)

Cut-off levels for Ki-67 Low/Intermediate/High positivity

We determined cut-off levels based on the median Ki-67 values for each method according to the St. Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2015 [17]. Ki-67 Low was defined as more than 10 percentage points below the median, and Ki-67 High as more than 10 percentage points above the median. Values falling between these two cut-offs were classified as Intermediate. The median values of Ki-67 positivity using VA and DIA were calculated for 100 cells (VA100, DIA100); 200 cells (VA200, DIA200); 300 cells (VA300, DIA300); 400 cells (VA400, DIA400); and 500 cells (VA500, DIA500) (Fig. 5). In the statistical analyses, only the results for VA/DIA100 and VA/DIA500 were used.
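The median-based rule can be written out as a short Python sketch. The function names are hypothetical; the medians used in the example (22.3% for VA500 and 30.0% for DIA500) are those reported in this study, and cut-offs are rounded to one decimal to match the reporting precision.

```python
def median_cutoffs(median_percent, margin=10.0):
    """Return (low_cutoff, high_cutoff) as the median minus/plus the margin in percentage points."""
    return (round(median_percent - margin, 1), round(median_percent + margin, 1))


def classify_ki67(value_percent, median_percent, margin=10.0):
    """Classify a Ki-67 score as Low, Intermediate or High relative to the method's median."""
    low, high = median_cutoffs(median_percent, margin)
    if value_percent < low:
        return "Low"
    if value_percent > high:
        return "High"
    return "Intermediate"


va_cutoffs = median_cutoffs(22.3)    # cut-offs derived from the VA500 median
dia_cutoffs = median_cutoffs(30.0)   # cut-offs derived from the DIA500 median

# The same hypothetical score of 35% falls in different categories
# depending on which method's median is used.
by_va = classify_ki67(35.0, 22.3)
by_dia = classify_ki67(35.0, 30.0)
```

A hypothetical score of 35% is High under the VA-derived cut-offs but Intermediate under the DIA-derived cut-offs, which illustrates why cut-offs need to be tied to the method that produced the score.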

Fig. 5

Median Ki-67 at each cumulative 100-cell increment for visual assessment (VA) and digital image analysis (DIA)

Statistical analyses

Tumour characteristics were compared using Pearson’s Chi squared test across categories of VA and DIA (Low, Intermediate and High as described above) for 100 and 500 nuclei counted. Bland-Altman plots were used to evaluate the agreement between VA500 as the reference measurement, and DIA100 and DIA500, by estimating the difference between the methods in relation to the mean. Cumulative incidence of death from BC was calculated for VA100, VA500, DIA100 and DIA500, treating death from other causes as competing events. Gray’s test was used to test the equality of cumulative incidence curves. Cox proportional hazard analyses were used to estimate hazard ratios (HR) of BC death, with censoring at death from other causes. Harrell’s C-test was used to compare the predictive ability of VA100, VA500, DIA100 and DIA500. All analyses were performed using Stata v. 16.0 (StataCorp LP, College Station, Texas, USA).
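To illustrate what treating death from other causes as a competing event means, a nonparametric cumulative incidence estimate can be sketched in Python. This is a minimal sketch of the general technique, not the Stata routine used in the study; the function name, the cause coding and the example data are hypothetical.

```python
def cumulative_incidence(times, causes, cause_of_interest=1):
    """Nonparametric cumulative incidence for one cause in the presence of competing events.

    times: follow-up time for each patient.
    causes: 0 = censored, 1 = death from BC, 2 = death from other causes (hypothetical coding).
    Returns a list of (time, cumulative incidence) at each time with any event.
    """
    event_free = 1.0  # probability of being free of any event just before t
    cif = 0.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)
        d_interest = sum(1 for ti, ci in zip(times, causes)
                         if ti == t and ci == cause_of_interest)
        d_any = sum(1 for ti, ci in zip(times, causes) if ti == t and ci != 0)
        if d_any == 0:
            continue  # only censoring at this time
        cif += event_free * d_interest / at_risk
        event_free *= 1.0 - d_any / at_risk
        curve.append((t, cif))
    return curve


# Four hypothetical patients: BC death, other death, BC death, censored.
curve = cumulative_incidence([1, 2, 3, 4], [1, 2, 1, 0])
```

In this invented example the estimate reaches 0.5 after the second death from BC. A patient who dies of another cause is removed from the risk set without contributing to the BC incidence, rather than being treated as someone who could still die of BC later.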

Results

Patient and tumour characteristics are presented in Table 1. Of the 248 patients evaluated in this study, 108 had died of BC and 124 had died of other causes by the end of follow-up. There were 16 (6.5%) histopathological grade 1, 131 (52.8%) grade 2, and 101 (40.7%) grade 3 tumours.

Table 1 Patient and tumour characteristics according to Ki67 visual assessment (VA) and digital image analysis (DIA) of full face tissue sections

Cut-off levels for Low/Intermediate/High Ki-67 positivity

Cut-off levels for Ki-67 positivity were calculated for both VA and DIA according to the median Ki-67 values after 500 tumour cell nuclei were counted (VA500, DIA500). The median Ki-67 level was 22.3% for VA500 and 30.0% for DIA500, as shown in Fig. 5. Thus, for the present study, cut-off levels for VA were set at < 12.3% (Ki-67 Low), 12.3–32.3% (Ki-67 Intermediate) and > 32.3% (Ki-67 High). For DIA, cut-off levels were set at < 20.0% (Ki-67 Low), 20.0–40.0% (Ki-67 Intermediate) and > 40.0% (Ki-67 High).

In VA, there was no clear difference between the median values of the five cumulative 100-cell increments (VA100-VA500) (range 22.3-23.2%). Using DIA, the median value for both DIA100 and DIA200 was 34.0%, falling to 30.0% at DIA500. Cumulative median values for all 100-cell increments (both VA and DIA) are shown in Fig. 5.

Visual assessment

Using the VA median-derived cut-off levels, 48 cases (19.4%) were classified as Ki-67 Low at VA100, falling to 44 (17.7%) at VA500. Twelve cases were upgraded from Ki-67 Low at VA100 to Ki-67 Intermediate at VA500. None were upgraded from Low to High. Similarly, a total of 123 cases (49.6%) were classified as Intermediate at VA100, rising to 132 (53.2%) at VA500. Eight of these cases were downgraded from Intermediate at VA100 to Low at VA500, and eight were upgraded to High. A total of 77 cases (31.0%) were classified as High at VA100, falling to 72 (29.0%) at VA500. Thirteen cases were downgraded from High at VA100 to Intermediate at VA500, and none were downgraded to Low (Fig. 6).
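The VA reclassification counts reported above can be checked with a few lines of Python. The function name is hypothetical; the numbers are taken directly from the paragraph.

```python
def counts_after_reclassification(counts_at_100, moves):
    """Apply category moves (source, destination) -> number of cases to the starting counts."""
    counts = dict(counts_at_100)
    for (source, destination), n in moves.items():
        counts[source] -= n
        counts[destination] += n
    return counts


va100 = {"Low": 48, "Intermediate": 123, "High": 77}
va_moves = {
    ("Low", "Intermediate"): 12,   # upgraded
    ("Intermediate", "Low"): 8,    # downgraded
    ("Intermediate", "High"): 8,   # upgraded
    ("High", "Intermediate"): 13,  # downgraded
}
va500 = counts_after_reclassification(va100, va_moves)
```

Applying the reported moves to the VA100 counts gives 44 Low, 132 Intermediate and 72 High, which matches the VA500 counts stated in the text.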

Fig. 6

Number of cases in each Ki-67 category (Low, Intermediate and High) for each 100 cell-increment in A Visual assessment (VA) and B digital image analysis (DIA)

Digital image analysis

Using the DIA median-derived cut-off levels, 44 cases (17.7%) were classified as Low at DIA100, rising to 75 (30.2%) at DIA500. Thus, with increasing number of cells counted a further 31 cases (12.5%) were classified as Low. None were upgraded from Low to Intermediate at DIA500. One hundred and four cases (41.9%) were classified as Intermediate at DIA100, falling to 94 cases (37.9%) at DIA500. Thirty cases were downgraded from Intermediate at DIA100 to Low at DIA500. None were upgraded to High. One hundred cases (40.3%) were classified as High at DIA100, falling to 79 (31.9%) at DIA500. Twenty-six cases were downgraded from High at DIA100 to Intermediate at DIA500. None were downgraded from High to Low (Fig. 6).

The numbers of cases classified as Low were similar in VA100 (48 cases), VA500 (44 cases) and DIA100 (44 cases) but increased at DIA500 (75 cases). The number of cases classified as High was greatest at DIA100 (100 cases), falling to levels comparable with VA100 (77 cases) and VA500 (72 cases) at DIA500 (79 cases) (Table 1; Fig. 6).

Ki-67 and histopathological grade

Grade 1

Among the 16 Grade 1 tumours, six (37.5%) were classified as Ki-67 Low at VA500. Five cases were classified as Low at DIA100, rising to nine (56.3%) at DIA500 (Table 1).

Grade 2

Of the 131 Grade 2 tumours, 13 (9.9%) were classified as High at VA500. Using DIA, 30 (22.9%) were High at DIA100 falling to 21 (16%) at DIA500. A higher number of Grade 2 tumours were classified as Intermediate in VA compared to DIA (Table 1).

Grade 3

Of the 101 Grade 3 tumours, 59 (58.4%) were classified as High at VA500. Using DIA, 70 (69.3%) were High at DIA100, falling to 58 (57.4%) at DIA500. The number of Grade 3 tumours classified as Low was greatest at DIA500 (12 of 101, 11.9%) (Table 1).

Ki-67 and mitotic count

There was a clear association (p < 0.001) between high mitotic count (> 14.5 mitoses/10 HPF) and Ki-67 High across all counting modalities. The highest number of cases were observed at DIA100 where 51 of 62 (82.3%) cases with high mitotic count were classified as Ki-67 High (Table 1).

Ki-67 and prognosis

There was no clear association between Ki-67 cell counts and risk of death. By the end of follow-up, 108 (43.5%) patients had died of BC.

For VA100 High, the cumulative risk of death from BC during the first five years after diagnosis was 32.5% (95% CI 23.3–44.2), and 46.8% (95% CI 36.4–58.5) 10 years after diagnosis.

For VA500 High, the corresponding risks were 37.5% (95% CI 27.5–49.7) and 48.6% (95% CI 37.8–60.7), respectively. Using VA500 Low as the reference, the rate of death from BC was unchanged for VA500 Intermediate but was higher for VA500 High (HR 1.94, 95% CI 1.1–3.4) (Table 2; Fig. 7A).

Table 2 Risk of death from breast cancer according to Ki-67 level and counting procedures, expressed as cumulative incidence and hazard ratios of death from breast cancer

Fig. 7

Cumulative incidence of death from breast cancer. A Visual assessment (VA), Gray’s test for VA100 (Low, Intermediate and High) P = 0.1062; Gray’s test for VA500 (Low, Intermediate and High) P = 0.0500. B Digital image analysis (DIA), Gray’s test for DIA100 (Low, Intermediate and High) P = 0.3335; Gray’s test for DIA500 (Low, Intermediate and High) P = 0.0796

For DIA100 High, the cumulative risk of death from BC during the first five years after diagnosis was 31.0% (95% CI 22.9–41.1) and after 10 years 44.0% (95% CI 34.9–54.3).

For DIA500 High, the risk was 32.9% (95% CI 23.7–44.4) within the first five years, and 44.3% (95% CI 34.2–55.9) within the first 10 years.

Using DIA100 Low as the reference, the rate of death from BC was unchanged for DIA100 Intermediate but was higher for DIA100 High (HR 1.80 (95% CI 1.02–3.19), Table 2; Fig. 7B).

Comparison of methods

The Bland-Altman plots show that both DIA100 and DIA500 were clearly correlated with VA500. However, the mean values for Ki-67 using DIA (100 and 500) were on average higher than those for VA500, and the differences between DIA and VA500 increased with increasing mean values (Fig. 8). Harrell’s C test showed no clear difference in predictive ability between the VA and DIA methods. A Cox model including grade and DIA100 had a concordance of 61%, meaning that it correctly ordered survival times in 61% of comparable patient pairs, compared to 60% for models combining grade and any one of the other three methods (VA100, VA500 and DIA500).
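How a concordance statistic of this kind is obtained can be sketched in Python. This is a simplified version of Harrell’s C for right-censored data, not the Stata routine used in the study: pairs with equal follow-up times are ignored, tied risk scores count as half, and the function name and example data are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Fraction of comparable patient pairs in which the higher-risk patient dies first.

    times: follow-up times; events: 1 if death from BC was observed, 0 if censored;
    risk_scores: model output where a higher value means a higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has the shorter time and an observed event.
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: four patients, the third is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = harrell_c(times, events, [0.9, 0.7, 0.5, 0.1])
uninformative = harrell_c(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 1.0 indicates perfect ordering and 0.5 indicates ordering no better than chance, so values of 60 to 61% represent modest discrimination.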

Fig. 8

Bland-Altman plots illustrating the agreement of the VA and DIA methods. A The difference between DIA100 and VA500 compared to the mean of DIA100 and VA500. B The difference between DIA500 and VA500 compared to the mean of VA500 and DIA500
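The quantities plotted in a Bland-Altman analysis can be sketched in Python. The function name and the paired scores are hypothetical, and the 1.96 standard deviation limits of agreement are the conventional choice rather than something stated in this study.

```python
from statistics import mean, stdev


def bland_altman(reference, comparison):
    """Return pair means, pairwise differences, mean difference (bias) and limits of agreement.

    reference: scores from the reference method (for example VA500).
    comparison: scores from the method being compared (for example DIA500).
    """
    if len(reference) != len(comparison) or len(reference) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    pair_means = [(r + c) / 2.0 for r, c in zip(reference, comparison)]
    differences = [c - r for r, c in zip(reference, comparison)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # conventional limits, an assumption here
    return pair_means, differences, bias, (bias - spread, bias + spread)


# Hypothetical paired Ki-67 scores (%) in which the second method reads 20% higher.
va = [10.0, 20.0, 30.0, 40.0]
dia = [12.0, 24.0, 36.0, 48.0]
pair_means, differences, bias, limits = bland_altman(va, dia)
```

In this invented example the differences (2, 4, 6 and 8 percentage points) grow with the pair means (11, 22, 33 and 44%), which is the same pattern as a difference between methods that increases with increasing mean values.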

Discussion

In this study we compared Ki-67 protein expression in IHC-stained BC tissue sections assessed by DIA using the QuPath platform, and by VA according to current recommended guidelines [22, 23]. We found that the median Ki-67 level was higher using DIA compared to VA. We show that while the proportion of Ki-67 positive tumour cells did not change substantially with increasing number of cells counted using VA, the number of cells counted did impact the result when using DIA. Furthermore, the highest proportion of patients with Ki-67 High tumours was found when 100-200 cells were counted using DIA. All counting methods predicted a poor prognosis for patients with the highest Ki-67 levels, but with little difference between the methods.

Gerdes proposed in 1984 that, with the help of the monoclonal antibody Ki-67, we now had a simple means of estimating the growth fraction of a given subset of human cells. This would be of particular interest in tumour diagnostics since the proportion of proliferating cells in given neoplasms would be of prognostic value and could contribute to the determination of treatment strategies [2]. Ki-67 is now used as a prognostic marker and may also be used as a predictive marker of response to chemotherapy [7,8,9]. There has been considerable debate regarding counting methods and cut-off levels for both prognostication and determination of treatment [10, 16, 33,34,35,36,37].

At the St. Gallen conference in 2015, it was proposed that the in-house median value at each laboratory should be used to determine cut-off values to offset interlaboratory differences [17]. More recently, the 17th St. Gallen International Breast Cancer Conference proposed that Ki67 should be used to determine treatment in estrogen receptor-positive, HER2-negative T1-2N0-1 BC in accordance with the International Ki67 in Breast Cancer Working Group. The determination of cut-off levels is still challenging, as reflected by these latest recommendations where only clearly low or clearly high levels of Ki67 protein expression are considered to have clinical utility [13, 24]. Romero and co-workers suggested in 2014 a stepwise counting strategy without fixed denominators, especially to target heterogeneous tumours with some highly proliferative hotspots [29], and the International Ki67 in Breast Cancer Working Group has proposed a standardized visual scoring method using a scoring app available online [13]. Thus, the need for a standardized approach in the IHC assessment of Ki-67 in BC has been recognized.

In this study, we found clear differences in the median levels of Ki-67 positivity between VA and DIA (VA500 22.3% and DIA500 30%), reflecting the respective methods’ ability to identify hotspot areas in the tissue section. This is in agreement with previous studies [38,39,40,41]. Still, others have reported no real differences between the two methods [38, 41,42,43,44]. In the present study, the DIA threshold for Ki-67 positivity, and thus the ability to digitally detect positive Ki-67 staining, was set close to the pathologist’s threshold for positive staining before commencement of classifier training and digital assessment. The difference between the median values in VA and DIA suggests that cut-off levels need to be calibrated according to the method employed. The Bland-Altman plot [45, 46] shows that the methods perform quite similarly but that DIA in general reported higher levels of Ki-67 positivity compared to VA. Introduction of DIA for the assessment of Ki-67 in our hands would thus require recalibration of cut-off levels in order to correspond to established clinically actionable Ki-67 levels. This underlines the importance of understanding the consequences the introduction of a new method may have on patient treatment. However, Harrell’s C test [47] and risk-of-death analyses did not show any clear difference between methods in their ability to predict survival.

Recent studies have suggested that downgrading of Ki-67 levels in some tumours may occur in VA when more than 200-300 cells are counted [29, 48]. However, in the present study we found that there was little difference in the percentage of Ki-67 positive cells in each of the five 100-cell increments across cut-off levels using VA. This would imply that it may not be necessary to count more than 200-300 cells in VA. On the other hand, there was a clear fall in the number of Ki-67 High cases and a corresponding rise in the number of cases classified as Low with increasing cell counts using DIA. Thus, using DIA, the highest proportion of Ki-67 positive cell nuclei is achieved by counting 100-200 cells in digitally identified hotspots. This appears to be in agreement with Romero et al. [29]. In our hands, a significantly higher number of grade 3 tumours was found in DIA100 High compared to VA100 High, VA500 High and DIA500 High (p < 0.0001). Thus, we show that declining Ki-67 levels are more likely to occur using DIA compared to VA. A greater number of deaths from BC was seen at DIA100 Ki-67 High compared to DIA500 Ki-67 High (54 vs. 42 cases; 50.0% vs. 38.9%). In comparison, for VA, the difference in the number of deaths from BC between the VA100 Ki-67 High group and the VA500 Ki-67 High group was negligible (43 vs. 42 cases; 39.8% vs. 38.9%).

The cases included in our study were diagnosed with BC over a timespan extending from 1961 to 2008, and pre-analytical conditions may have varied. Ki-67 IHC is robust in formalin-fixed, paraffin-embedded tissue [49, 50] and antigenicity is well preserved, though staining intensity is prone to be reduced with increasing storage time [51,52,53]. In the present study, staining intensity was not assessed. The International Ki67 in Breast Cancer Working Group has expressed concern about Ki-67 assessment of tissue stored in paraffin blocks for more than five years, because of the degradation of the epitope in paraffin blocks. The exact mechanisms of the Ki-67 epitope degradation are not yet fully explored and there is still concern about the precision of the assessment. They recommend that the internationally standardized laboratory guidelines (ASCO and CAP) for HER2 and hormone receptors should also be applied to Ki-67 IHC [13]. Variation in tissue processing, staining reagents, laboratory protocols and digitization procedures may all contribute to variability in the interpretation of IHC in both conventional VA and DIA. Standardization of the preanalytical and analytical phases of tissue processing would greatly contribute to the creation of a more robust classifier for the digital analysis, although BC’s inherent heterogeneity would still remain a challenge [21, 54, 55]. In the present study, we included only invasive carcinoma of no special type. The classifier would require further development to reliably identify tumour cell nuclei morphologies such as those typical of lobular carcinoma. We found that some tissue slides were not suitable for DIA due to artefacts such as tissue folds, damaged tissue, or inadequate staining.

Studies comparing the QuPath platform with other digital analysis platforms have shown good reproducibility and functionality [38, 56]. One study comparing DIA using QuPath with VA showed that QuPath gave stronger prognostic stratification than the manual method [57]. The QuPath software was developed to improve the efficiency, objectivity, and reproducibility of digital histopathology, as well as biomarker analysis using digital images [27]. In the present study a greater number of cases were classified as either Low or High using QuPath DIA compared to conventional VA. Using the Ventana Virtuoso platform, Kwon et al. reported high concordance between VA and DIA, and higher accuracy using DIA in the High Ki-67 group (≥ 20%) compared to the Low Ki-67 group (≤ 10%). They also found that DIA is more useful in borderline cases between cut-off levels, citing observer variation as a greater challenge in these cases [55].

The initial regions of interest on the WSIs were manually delineated using the brush tool in QuPath. This approach was time-consuming, and automatic tissue detection or WSI annotation would be preferable. The first 100-cell increment counted by DIA was visually selected within the area of the tumour with the highest expression of Ki-67 in the heat map. To identify these hotspots, we created measurement maps for nucleus DAB OD mean with 50 μm smoothing. In this process we were aware that tissue folds, ink debris and abundant lymphocytes could result in higher OD in non-relevant areas. Thus, the measurement map method for detecting hotspots may not be suitable in sections with too many such irregularities and artefacts. We noted that membranous staining presented a greater challenge to the QuPath software than to experienced pathologists. A pathologist will ignore non-relevant staining, while the software will detect anything with colour unless the classifier is trained to ignore it.

In the present study, the QuPath-based DIA method entailed a considerable amount of manual adjustment, thus rendering it time-consuming and impractical for implementation in a clinical setting. Robertson et al. published a paper in 2020 that suggested that a digital global scoring of Ki-67 was a practical and clinically valid approach [58]. The International Ki67 in Breast Cancer Working Group discuss several methods including global score and hot spot score in addition to their own online scoring app giving a weighted global score based on the assessment of 100 cells in each of four areas in the tumour section (negligible, low, medium, or high). To the best of our knowledge, the latter has not achieved widespread acceptance. They point out that none of the current scoring systems achieved high analytical validity [13]. Global scoring was not evaluated in the present study. We chose to follow the guidelines for visual assessment of Ki-67 in BC currently in use in Norway, counting 500 cells in the area of the tumour with highest proliferation as assessed under the light microscope [23]. We used the same approach in the digital assessment. We acknowledge that this method may have drawbacks but in comparing the two methods our main finding remains that recalibration of cut-off levels is essential when introducing new methodology in the assessment of tissue biomarkers [23].

The number of cases in this study was limited and thus survival analyses should be interpreted with caution. Our results need to be validated in larger series of cases from other sources. However, the study clearly illustrates that new methodology in biomarker assessment requires recalibration of established cut-off levels.

Conclusions

In this study we show that assessment of Ki-67 in breast tumours using DIA identifies a greater proportion of cases with high Ki-67 levels compared to VA of the same tumours. Using VA, we found that the results do not change substantially with increasing number of cells counted. However, we propose that, using DIA, it may be sufficient to count 100-200 cells in a digitally selected hotspot area to identify the greatest number of cases with Ki-67 High tumours. Associations with survival should be interpreted with caution due to the limited number of cases and variation of pre-analytical conditions of the tissue samples in this study. Finally, our findings underline the need for recalibration of established cut-off levels on the introduction of new methodology.

Availability of data and materials

The datasets generated and/or analysed during the current study are not publicly available due to issues of sensitivity and limitations determined in the conditions for approval by the Regional Ethics Committee. However, they can be made available from the corresponding author on reasonable request.

Abbreviations

BC: Breast cancer

IHC: Immunohistochemistry/immunohistochemical

DIA: Digital image analysis

VA: Visual assessment

OD: Optical density

WSI: Whole slide images

CI: Confidence interval

References

  1. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144(5):646–74.

  2. Gerdes J, Lemke H, Baisch H, Wacker HH, Schwab U, Stein H. Cell cycle analysis of a cell proliferation-associated human nuclear antigen defined by the monoclonal antibody Ki-67. J Immunol. 1984;133(4):1710–5.

  3. Leung SCY, Nielsen TO, Zabaglo L, Arun I, Badve SS, Bane AL, et al. Analytical validation of a standardized scoring protocol for Ki67: phase 3 of an international multicenter collaboration. NPJ Breast Cancer. 2016;2:16014.

  4. Urruticoechea A, Smith IE, Dowsett M. Proliferation marker Ki-67 in early breast cancer. J Clin Oncol. 2005;23(28):7212–20.

  5. Schwab U, Stein H, Gerdes J, Lemke H, Kirchner H, Schaadt M, et al. Production of a monoclonal antibody specific for Hodgkin and Sternberg-Reed cells of Hodgkin’s disease and a subset of normal lymphoid cells. Nature. 1982;299(5878):65–7.

  6. Viale G, Giobbie-Hurder A, Regan MM, Coates AS, Mastropasqua MG, Dell’Orto P, et al. Prognostic and Predictive Value of Centrally Reviewed Ki-67 Labeling Index in Postmenopausal Women With Endocrine-Responsive Breast Cancer: Results From Breast International Group Trial 1–98 Comparing Adjuvant Tamoxifen With Letrozole. J Clin Oncol. 2008;26(34):5569–75.

  7. de Azambuja E, Cardoso F, de Castro G, Jr., Colozza M, Mano MS, Durbecq V, et al. Ki-67 as prognostic marker in early breast cancer: a meta-analysis of published studies involving 12,155 patients. Br J Cancer. 2007;96(10):1504–13.

  8. Criscitiello C, Disalvatore D, De Laurentiis M, Gelao L, Fumagalli L, Locatelli M, et al. High Ki-67 score is indicative of a greater benefit from adjuvant chemotherapy when added to endocrine therapy in luminal B HER2 negative and node-positive breast cancer. Breast. 2014;23(1):69–75.

  9. Kim KI, Lee KH, Kim TR, Chun YS, Lee TH, Park HK. Ki-67 as a predictor of response to neoadjuvant chemotherapy in breast cancer patients. J Breast Cancer. 2014;17(1):40–6.

  10. Goldhirsch A, Wood WC, Coates AS, Gelber RD, Thurlimann B, Senn HJ, et al. Strategies for subtypes–dealing with the diversity of breast cancer: highlights of the St. Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2011. Ann Oncol. 2011;22(8):1736–47.

  11. Aleskandarany MA, Green AR, Rakha EA, Mohammed RA, Elsheikh SE, Powe DG, et al. Growth fraction as a predictor of response to chemotherapy in node-negative breast cancer. Int J Cancer. 2010;126(7):1761–9.

  12. Prat A, Cheang MC, Martin M, Parker JS, Carrasco E, Caballero R, et al. Prognostic significance of progesterone receptor-positive tumor cells within immunohistochemically defined luminal A breast cancer. J Clin Oncol. 2013;31(2):203–9.

  13. Nielsen TO, Leung SCY, Rimm DL, Dodson A, Acs B, Badve S, et al. Assessment of Ki67 in Breast Cancer: Updated Recommendations From the International Ki67 in Breast Cancer Working Group. J Natl Cancer Inst. 2021;113(7):808–19.

  14. Varga Z, Diebold J, Dommann-Scherrer C, Frick H, Kaup D, Noske A, et al. How reliable is Ki-67 immunohistochemistry in grade 2 breast carcinomas? A QA study of the Swiss Working Group of Breast- and Gynecopathologists. PLoS One. 2012;7(5):e37379.

  15. Laenkholm AV, Grabau D, Moller Talman ML, Balslev E, Bak Jylling AM, Tabor TP, et al. An inter-observer Ki67 reproducibility study applying two different assessment methods: on behalf of the Danish Scientific Committee of Pathology, Danish breast cancer cooperative group (DBCG). Acta Oncol. 2018;57(1):83–9.

  16. Gallardo A, Garcia-Valdecasas B, Murata P, Teran R, Lopez L, Barnadas A, et al. Inverse relationship between Ki67 and survival in early luminal breast cancer: confirmation in a multivariate analysis. Breast Cancer Res Treat. 2018;167(1):31–7.

  17. Coates AS, Winer EP, Goldhirsch A, Gelber RD, Gnant M, Piccart-Gebhart M, et al. Tailoring therapies–improving the management of early breast cancer: St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2015. Ann Oncol. 2015;26(8):1533–46.

  18. Focke CM, Burger H, van Diest PJ, Finsterbusch K, Glaser D, Korsching E, et al. Interlaboratory variability of Ki67 staining in breast cancer. Eur J Cancer. 2017;84:219–27.

  19. Mengel M, Von Wasielewski R, Wiese B, Rüdiger T, Müller-Hermelink HK, Kreipe H. Inter-laboratory and inter-observer reproducibility of immunohistochemical assessment of the Ki-67 labelling index in a large multi-centre trial. J Pathol. 2002;198(3):292–9.

  20. Greer LT, Rosman M, Mylander WC, Hooke J, Kovatich A, Sawyer K, et al. Does Breast Tumor Heterogeneity Necessitate Further Immunohistochemical Staining on Surgical Specimens? J Am Coll Surg. 2013;216(2):239–51.

  21. Stalhammar G, Robertson S, Wedlund L, Lippert M, Rantalainen M, Bergh J, et al. Digital image analysis of Ki67 in hot spots is superior to both manual Ki67 and mitotic counts in breast cancer. Histopathology. 2018;72(6):974–89.

  22. Dowsett M, Nielsen TO, A’Hern R, Bartlett J, Coombes RC, Cuzick J, et al. Assessment of Ki67 in breast cancer: recommendations from the International Ki67 in Breast Cancer working group. J Natl Cancer Inst. 2011;103(22):1656–64.

  23. Helsedirektoratet NBCGN. Nasjonalt handlingsprogram med retningslinjer for diagnostikk, behandling og oppfølging av pasienter med brystkreft; page 38 and page 113. Helsedirektoratet, avdeling spesialisthelsetjenester; 2020 [updated 08/2020. IS-2945]. Available from: https://www.helsedirektoratet.no/retningslinjer/brystkreft-handlingsprogram.

  24. Reinert T, de Souza ABA, Sartori GP, Obst FM, Barrios CH. Highlights of the 17th St Gallen International Breast Cancer Conference 2021: customising local and systemic therapies. Ecancermedicalscience. 2021;15:1236.

  25. Gudlaugsson E, Klos J, Skaland I, Janssen EA, Smaaland R, Feng W, et al. Prognostic comparison of the proliferation markers (mitotic activity index, phosphohistone H3, Ki67), steroid receptors, HER2, high molecular weight cytokeratins and classical prognostic factors in T1-2N0M0 breast cancer. Pol J Pathol. 2013;64(1):1–8.

  26. Volynskaya Z, Mete O, Pakbaz S, Al-Ghamdi D, Asa S. Ki67 quantitative interpretation: Insights using image analysis. J Pathol Inform. 2019;10(1):8.

  27. Bankhead P, Loughrey MB, Fernandez JA, Dombrowski Y, McArt DG, Dunne PD, et al. QuPath: Open source software for digital pathology image analysis. Sci Rep. 2017;7(1):16878.

  28. Polley MY, Leung SC, McShane LM, Gao D, Hugh JC, Mastropasqua MG, et al. An international Ki67 reproducibility study. J Natl Cancer Inst. 2013;105(24):1897–906.

  29. Romero Q, Bendahl PO, Ferno M, Grabau D, Borgquist S. A novel model for Ki67 assessment in breast cancer. Diagn Pathol. 2014;9:118.

  30. Engstrom MJ, Opdahl S, Hagen AI, Romundstad PR, Akslen LA, Haugen OA, et al. Molecular subtypes, histopathological grade and survival in a historic cohort of breast cancer patients. Breast Cancer Res Treat. 2013;140(3):463–73.

  31. Valla M, Vatten LJ, Engstrøm MJ, Haugen OA, Akslen LA, Bjørngaard JH, et al. Molecular Subtypes of Breast Cancer: Long-term Incidence Trends and Prognostic Differences. Cancer Epidemiol Biomarkers Prev. 2016;25(12):1625–34.

  32. International Agency for Research on Cancer (IARC). WHO Classification of Tumours of the Breast. 4th ed. Lyon: IARC Publications; 2012.

  33. Alco G, Bozdogan A, Selamoglu D, Pilanci KN, Tuzlali S, Ordu C, et al. Clinical and histopathological factors associated with Ki-67 expression in breast cancer patients. Oncol Lett. 2015;9(3):1046–54.

  34. Untch M, Gerber B, Harbeck N, Jackisch C, Marschner N, Möbus V, et al. 13th St. Gallen International Breast Cancer Conference 2013: primary therapy of early breast cancer evidence, controversies, consensus - opinion of a German team of experts (Zurich 2013). Breast Care (Basel). 2013;8(3):221–9.

  35. Senn HJ. St. Gallen consensus 2013: optimizing and personalizing primary curative therapy of breast cancer worldwide. Breast Care (Basel). 2013;8(2):101.

  36. Gnant M, Thomssen C, Harbeck N. St. Gallen/Vienna 2015: A Brief Summary of the Consensus Discussion. Breast Care. 2015;10(2):124–30.

  37. Coates AS, Winer EP, Goldhirsch A, Gelber RD, Gnant M, Piccart-Gebhart M, et al. Tailoring therapies–improving the management of early breast cancer: St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2015. Ann Oncol. 2015;26(8):1533–46.

  38. Acs B, Pelekanou V, Bai Y, Martinez-Morilla S, Toki M, Leung SCY, et al. Ki67 reproducibility using digital image analysis: an inter-platform and inter-operator study. Lab Invest. 2019;99(1):107–17.

  39. Zhong FF, Bi R, Yu BH, Yang F, Yang WT, Shui RH. A Comparison of Visual Assessment and Automated Digital Image Analysis of Ki67 Labeling Index in Breast Cancer. PLoS One. 2016;11(2):11.

  40. Lea D, Gudlaugsson EG, Skaland I, Lillesand M, Soreide K, Soreide JA. Digital Image Analysis of the Proliferation Markers Ki67 and Phosphohistone H3 in Gastroenteropancreatic Neuroendocrine Neoplasms: Accuracy of Grading Compared With Routine Manual Hot Spot Evaluation of the Ki67 Index. Appl Immunohistochem Mol Morphol. 2021;29(7):499–505.

  41. Koopman T, Buikema HJ, Hollema H, de Bock GH, van der Vegt B. Digital image analysis of Ki67 proliferation index in breast cancer using virtual dual staining on whole tissue sections: clinical validation and inter-platform agreement. Breast Cancer Res Treat. 2018;169(1):33–42.

  42. Laurinavicius A, Plancoulaine B, Laurinaviciene A, Herlin P, Meskauskas R, Baltrusaityte I, et al. A methodology to ensure and improve accuracy of Ki67 labelling index estimation by automated digital image analysis in breast cancer tissue. Breast Cancer Res. 2014;16(2):R35.

  43. Bankhead P, Fernandez JA, McArt DG, Boyle DP, Li G, Loughrey MB, et al. Integrated tumor identification and automated scoring minimizes pathologist involvement and provides new insights to key biomarkers in breast cancer. Lab Invest. 2018;98(1):15–26.

  44. Egeland NG, Jonsdottir K, Lauridsen KL, Skaland I, Hjorth CF, Gudlaugsson EG, et al. Digital Image Analysis of Ki-67 Stained Tissue Microarrays and Recurrence in Tamoxifen-Treated Breast Cancer Patients. Clin Epidemiol. 2020;12:771–81.

  45. Altman DG, Bland JM. Measurement in Medicine: The Analysis of Method Comparison Studies. J R Stat Soc Ser D (The Statistician). 1983;32(3):307–17.

  46. Giavarina D. Understanding Bland Altman analysis. Biochem Med (Zagreb). 2015;25(2):141–51.

  47. Harrell FE Jr, Califf RM, Pryor DB, Lee KL, Rosati RA. Evaluating the yield of medical tests. JAMA. 1982;247(18):2543–6.

  48. Romero Q, Bendahl P-O, Klintman M, Loman N, Ingvar C, Rydén L, et al. Ki67 proliferation in core biopsies versus surgical samples - a model for neo-adjuvant breast cancer studies. BMC Cancer. 2011;11(1):341.

  49. Benini E, Rao S, Daidone MG, Pilotti S, Silvestrini R. Immunoreactivity to MIB-1 in breast cancer: methodological assessment and comparison with other proliferation indices. Cell Prolif. 1997;30(3–4):107–15.

  50. Arber DA. Effect of prolonged formalin fixation on the immunohistochemical reactivity of breast markers. Appl Immunohistochem Mol Morphol. 2002;10(2):183–6.

  51. Camp RL, Charette LA, Rimm DL. Validation of tissue microarray technology in breast carcinoma. Lab Invest. 2000;80(12):1943–9.

  52. Cattoretti G, Becker MH, Key G, Duchrow M, Schluter C, Galle J, et al. Monoclonal antibodies against recombinant parts of the Ki-67 antigen (MIB 1 and MIB 3) detect proliferating cells in microwave-processed formalin-fixed paraffin sections. J Pathol. 1992;168(4):357–63.

  53. DiVito KA, Charette LA, Rimm DL, Camp RL. Long-term preservation of antigenicity on tissue microarrays. Lab Invest. 2004;84(8):1071–8.

  54. Roulot A, Héquet D, Guinebretière JM, Vincent-Salomon A, Lerebours F, Dubot C, et al. Tumoral heterogeneity of breast cancer. Ann Biol Clin (Paris). 2016;74(6):653–60.

  55. Kwon AY, Park HY, Hyeon J, Nam SJ, Kim SW, Lee JE, et al. Practical approaches to automated digital image analysis of Ki-67 labeling index in 997 breast carcinomas and causes of discordance with visual assessment. PLoS One. 2019;14(2):e0212309.

  56. Ribeiro GP, Endringer DC, De Andrade TU, Lenz D. Comparison between two programs for image analysis, machine learning and subsequent classification. Tissue Cell. 2019;58:12–6.

  57. Loughrey MB, Bankhead P, Coleman HG, Hagan RS, Craig S, McCorry AMB, et al. Validation of the systematic scoring of immunohistochemically stained tumour tissue microarrays using QuPath digital image analysis. Histopathology. 2018;73(2):327–38.

  58. Robertson S, Acs B, Lippert M, Hartman J. Prognostic potential of automated Ki67 evaluation in breast cancer: different hot spot definitions versus true global score. Breast Cancer Res Treat. 2020;183(1):161–75.

Acknowledgements

The authors thank the Department of Pathology, St. Olav's Hospital, Trondheim University Hospital for making the diagnostic archives available for the study and for digitizing histopathological slides, and the Cancer Registry of Norway for supplying the patient data.

Funding

The present study received financial support from the Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway. Previous data that are also included in this study received financial support from the Liaison Committee between the Central Norway Regional Health Authority and the Norwegian University of Science and Technology and from The Research Council of Norway.

Author information

Contributions

Conceptualisation: AMB, AHS. Methodology: AHS, HSP, AMB. Formal analysis: AHS, SO, MV, AMB. Original draft preparation: AHS, AMB. Manuscript review and editing: AMB, AHS, MV, SO, HSP. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Anette H. Skjervold.

Ethics declarations

Ethics approval and consent to participate

Approval of this study was granted by the Regional Committee for Medical and Health Research Ethics, Central Norway (REK 836-09). The approval includes dispensation from the general requirement of patient consent.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Skjervold, A.H., Pettersen, H.S., Valla, M. et al. Visual and digital assessment of Ki-67 in breast cancer tissue - a comparison of methods. Diagn Pathol 17, 45 (2022). https://doi.org/10.1186/s13000-022-01225-4

  • DOI: https://doi.org/10.1186/s13000-022-01225-4

Keywords