Inter-observer variability between general pathologists and a specialist in breast pathology in the diagnosis of lobular neoplasia, columnar cell lesions, atypical ductal hyperplasia and ductal carcinoma in situ of the breast

Background This study aimed to assess inter-observer variability between the original diagnostic reports and later review by a specialist in breast pathology considering lobular neoplasias (LN), columnar cell lesions (CCL), atypical ductal hyperplasia (ADH), and ductal carcinoma in situ (DCIS) of the breast. Methods A retrospective, observational, cross-sectional study was conducted. A total of 610 breast specimens that had been formally sent for consultation and/or second opinions to the Breast Pathology Laboratory of Federal University of Minas Gerais were analysed between January 2005 and December 2010. The inter-observer variability between the original report and later review was compared regarding the diagnoses of LN, CCL, ADH, and DCIS. Statistical analyses were conducted using the Kappa index. Results Weak correlations were observed for the diagnoses of columnar cell change (CCC; Kappa = 0.38), columnar cell hyperplasia (CCH; Kappa = 0.32), while a moderate agreement (Kappa = 0.47) was observed for the diagnoses of flat epithelial atypia (FEA). Good agreement was observed in the diagnoses of atypical lobular hyperplasia (ALH; Kappa = 0.62) and lobular carcinoma in situ (LCIS; Kappa = 0.66). However, poor agreement was observed for the diagnoses of pleomorphic LCIS (Kappa = 0.22). Moderate agreement was observed for the diagnoses of ADH (Kappa = 0.44), low-grade DCIS (Kappa = 0.47), intermediate-grade DCIS (Kappa = 0.45), and DCIS with microinvasion (Kappa = 0.56). Good agreement was observed between the diagnoses of high-grade DCIS (Kappa = 0.68). Conclusions According to our data, the best diagnostic agreements were observed for high-grade DCIS, ALH, and LCIS. CCL without atypia and pleomorphic LCIS had the worst agreement indices. Virtual Slides The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1640072350119725.


Background
Despite advances in the understanding of the molecular biology of breast cancer progression and new molecular markers, the histopathological analysis remains the most widely used diagnostic method of precursor and intraductal proliferative lesions of the breast [1].
Currently, increasing number of breast lesions are discovered during the pre-clinical phase due to the more widespread use of mammography screening and the incorporation of new imaging technologies for the diagnosis of breast cancer. There has also been an increase in the diagnosis of intraductal proliferative and precursor breast lesions, which exhibit uncertain behaviour. These include lobular neoplasia (LN), columnar cell lesions (CCL), atypical ductal hyperplasia (ADH), and ductal carcinoma in situ (DCIS). The differential histologic diagnosis between some of these lesions can be difficult and presents challenges to pathologists; especially those not specialized in breast pathology [2,3].
Reproducibility studies are useful when evaluating the applicability of histological criteria for the classification of breast lesions and when determining the level of agreement amongst pathologists regarding morphological diagnoses. Studies conducted by our group have revealed significant inter-observer variability between the diagnoses made by general pathologists and those made by breast pathology experts in the diagnosis for DCIS and ADH; this discrepancy could have significant therapeutic implications [4,5]. Although there have been various studies on the diagnostic agreement considering DCIS, few studies have analysed the diagnostic agreement considering LN and CCL [6,7].
Our study aimed to assess the frequency of detection rate of precursor lesions and intraductal proliferative lesions, primarily CCL and LN, in breast biopsies sent for consultation as well as the inter-observer variability in the diagnoses made during the original report and a later review by a specialist consultant in breast pathology.

Methods
A retrospective, observational, cross-sectional study was conducted. Files from the Breast Pathology Laboratory at the School of Medicine of Federal University of Minas Gerais (UFMG), Brazil, were reviewed between January 2005 and December 2010, and 673 cases of breast lesions were identified as having been formally sent for consultation or second opinion. The analysed data were obtained from the original pathologist reports and from the consulting report conducted by a single pathologist (HG) with an expertise on breast pathology. A total of 63 cases were excluded from the analysis; these cases did not have the original reports for comparison or they had insufficient and/or damaged material that prevented the review.
Data were collected through the use of a structured form, and the following items were analysed in both the original report and the review: type of specimen, specialty of the referring physician, and presence of intraductal proliferative lesions (columnar cell lesions [CCL], ADH, and DCIS) and the LN (atypical lobular hyperplasia [ALH], lobular carcinoma in situ [LCIS], and pleomorphic LCIS) associated or not with invasive carcinoma.
The histological classification of LN originally reported by Page et al. and adopted by the 2012 World Health Organization (WHO) Classification of Breast Tumours was used [1]. Lobular neoplasia refers to the entire spectrum of atypical epithelial lesions originating in the terminal-duct lobular unit (TDLU) and characterized by a proliferation of generally small, non-cohesive cells. ALH is defined as a filling or expansion of less than 50% of the acini in one or more lobular units by proliferating small, uniform cells. LCIS was defined as a filling and distension of greater than 50% of the acini of a lobular unity and a loss of the residual intracellular lumen [8,9]. The criteria used to diagnose pleomorphic LCIS were those originally described by Eusebi et al., which included the same architectural configuration as LCIS but with increased nuclear pleomorphism, larger nucleoli, with or without comedo necrosis [10].
CCLs are a group of lesions of the terminal ductallobular units that are characterized by variably enlarged dilated acini lined by columnar epithelial cells without cytological or architectural atipia [1]. Flat epithelial atypia (FEA), the term adopted by the WHO Classification of Breast Tumours, since 2003, refers to a neoplastic alteration of the TDLUs characterized by replacement of the native epithelial cells by one to several layers of a single epithelial cell type showing low-grade (monomorphic) cytological atypia [1,11]. These lesions differ from those with sufficient architectural and cytological findings for a differential diagnosis of ADH or DCIS. In the present study, we used the diagnostic criteria proposed by Schnitt and Vincent-Salomon, who previously suggested the classification of the FEA group into two groups, CCC and columnar cell hyperplasia (CCH) with atypia according to the number of layers of proliferating epithelial cells (Table 1) [12,13].
ADH was defined as a proliferation of regularly distributed monomorphic cells to form regular, uniform, and circular secondary lumens. These lesions are small, and the cells have 2 partial ducts or "spaces" involved, and are less than 2 mm in size. DCIS is characterised as an epithelial proliferation of atypical cells with 2 complete "spaces" or ducts involved or are more than 2 mm in overall size. The criteria of Elston and Ellis were used to diagnose DCIS with microinvasion [14], which are recognised by predominant DCIS as well as the infiltration of neoplastic cells beyond the basal membrane of the unspecialised or extralobular connective tissue up to 1 mm in size. The histological grades (low, moderate, and high) of DCIS were determined after considering the grade of nuclear atypia as well as the presence and extension of necrosis according to the criteria of Lagios [15]. Currently, the Breast Pathology Laboratory uses the new WHO (2012) Classification for Breast Tumours as a diagnostic reference [1].
To tabulate the data, cases with more than one type of breast lesion were classified according to the lesion with the greatest risk or potential to develop into a carcinoma. For LN, the risk classification was as follows: pleomorphic LCIS > LCIS > ALH; for CCL: CCH atypia > CCC atypia > CCH > CCC; and for DCIS: microinvasive DCIS > highgrade DCIS > intermediate-grade DCIS > low-grade DCIS. Cases of DCIS for which the grade had not been evaluated during the original report or the review were classified as unspecified (US).
The SPSS program (version 17.0; SPSS Inc., Chicago, IL, USA) was used to analyse the inter-observer variability between the original diagnosis and the histopathological review conducted by the consultant pathologist, using the Kappa index. This index was interpreted according to the following values proposed by Landis and Koch [16]: < 0.20 (bad); 0.21-0.40 (poor); 0.41-0.60 (moderate); 0.61-0.80 (good); and 0.81-1.00 (excellent). The significance level (p) was defined as 0.05. This study was approved by the Research Ethic's Committee of the UFMG.

Results
A total of 610 cases of breast lesions that had been referred for second opinion and satisfied the inclusion criteria were analysed; of these, 56.9% were breast specimens from segmental mastectomies or quadrantectomies, 29.8% from core-biopsies, 5.9% from mastectomies, and 7.4% were other specimen types. The referring practitioner was specialised in breast surgery in 60% of the cases, in oncology in 5.2% of the cases, and in pathology in 4.1% of the cases. Patients' ages ranged from 13-94 years with a mean age of 53.6 years (±13.3 years).
CCL were present in 14.4% (88/610) of the original reports and 25.1% (153/610) of the reviews. Of 88 cases from the original reports, CCC were present in 64.8%, CCH in 12.5%, CCC with atypia in 8.0%, and CCH with atypia in 14.8% of the cases. Of the 153 cases from the reviews, CCC were present in 74.5%, CCH in 9.8%, CCC with atypia in 6.5%, and CCH with atypia in 9.2% of the cases (Table 3; Figure 2). There were weak diagnostic agreements between the original report and later review for CCC (Kappa index = 0.38; [p < 0,05]), CCH (Kappa index = 0.32). The agreement was moderate (Kappa index = 0.47; [p < 0,05]) between the diagnoses of FEA (CCC with atypia and CCH with atypia).
ADH was present in 12.1% (74/610) of the original reports and 8.4% (51/610) of the later reviews (Table 4; Figure 2B and Figure 3). There was a moderate agreement between the original reports and later reviews regarding the diagnosis of ADH (Kappa index = 0.44; [p < 0,05]). Of the 74 cases of ADH present in the original reports, in 41.9% (31/74) the reviewer confirmed the diagnosis of ADH. In 58.1% (43/74) cases the ADH was over-diagnosed, these, 58.1% (25/43) the reviewer downgrade diagnosis for usual ductal hyperplasia, in 14.0% (6/43) the diagnosis was increased to DCIS, and 27.8% (12/43) could not evaluate this information only by the reports.
The DCIS frequencies in the original reports and later reviews were 37   Apical decapitation Often present, not usually prominent.
Often present, might be exaggerated.
Often present, might be exaggerated.
Often present, might be exaggerated.

Intraluminal secretions
Might be present but are not usually prominent.
Might be present and prominent.
Might be present and prominent.
Might be present and prominent.

Calcifications
Might be present Usually present, might be psammomatous.
Usually present, might be psammomatous.
Usually present, might be psammomatous.

Discussion
In this study, we analysed the LN, CCL, ADH and DCIS diagnostic agreements and reproducibility between general pathologists and a specialist pathologist with training and experience in breast pathology in cases received in consultation for a second opinion. The importance of LN morphological classification is attributed to the Page group and collaborators, who correlated lobular lesion extension with patient evolution. The risks of developing invasive carcinomas were calculated as 4-5% for ALH and 8-11% for LCIS [8,9]. Eusebi et al. described the pleomorphic variant of LCIS; this variant features the same architectural arrangement as LCIS but exhibits marked nuclear pleomorphism and abundant cytoplasm, more evident nucleoli, and possible areas of comedo necrosis and microcalcifications either with or without apocrine characteristics [10]. Given the morphological characteristics in association with the fact that the imunohistochemical profile of pleomorphic LCIS is more likely oestrogen receptor-negative and HER-2-positive along with a higher Ki-67 proliferation index that classic LCIS, these lesions have been correlated with a more aggressive biological behavior than that of classic LCIS; however, epidemiological studies to prove this assumption are lacking [1,10]. Recently, molecular analyses of synchronous LCIS and both classic ILC type and the pleomorphic variant of invasive lobular carcinoma demonstrated similarities in the genomic profiles [17]. LCIS is considered both a risk factor and a non-obligate precursor lesion for subsequent invasive carcinomas in either breast, of either ductal or lobular type, but only a minority of women actually develop invasive breast cancer after a long-term follow up [1].    In our study, a good diagnostic agreement was observed between the reports from generalist pathologists and a breast pathology specialist regarding the diagnoses of ALH (Kappa index = 0.62) and LCIS (Kappa index = 0.66). Our data are similar to those of Fitzgibbons, who analysed the responses of 2,952 pathologists to clinical cases from The College of American Pathologists Performance Improvement Program in Surgical Pathology [6]. That study analysed the agreement (%) regarding the diagnosis of ALH; 58% of the pathologists correctly diagnosed ALH, whereas 17% diagnosed LCIS. When LN (ALH and LCIS) cases were assessed together, the agreement rate was 74%. However, there were other conflicting diagnoses for these cases, including ADH (14%), DCIS (1.4%), and usual ductal hyperplasia (10%) [6]. In fact, the differential diagnosis of LN and intraductal proliferative lesions can be difficult, especially when concerning classic LN versus low-grade solid DCIS, and pleomorphic LCIS versus high-grade DCIS [2,18]. Despite the low number of pleomorphic LCIS cases in our study, the interobserver agreement was poor (Kappa index = 0.22). The immunophenotypic criteria of E-cadherin, β-catenin, and p120-catenin expression in combination with the careful identification of cytological and architectural alterations are useful tools in the morphological classification of these lesions [18,19].
The correct diagnosis of LN cases will affect the treatment options and counseling. Patients with LN are at risk of developing invasive ipsilateral and contralateral breast carcinomas. For this reason, most patients diagnosed with LN are clinically monitored, and tamoxifen might be administered as a prophylactic therapy against the development of invasive carcinomas [20]. In very specific cases in which there are other associated risk factors, a bilateral prophylactic mastectomy might be offered [21]. The management of LN after core-biopsy diagnoses remains controversial. Complete excision of the lesion is recommended in patients who have been diagnosed with various forms of LCIS; however, the current evidence does not support the routine excision of conventional LCIS diagnosed via core-biopsy in cases with a clinical-radiological correlation and in which suspected area on the image has been properly sampled [1,2].
CCL and FEA are a group of breast lesions whose diagnostic criteria were defined only in recent years [12]. With the widespread use of mammography screening, CCLs have often been identified in breast biopsies and are present in as many as half of the biopsies performed for microcalcifications detected via mammography. Recent studies have shown that flat epithelial atypia shares genetic similarities with ADH, low-grade DCIS, and tubular carcinomas, suggesting that these lesions act as precursors of invasive, low-grade carcinomas [22].
Despite advances in genetic studies of these lesions, few studies have assessed the diagnostic reproducibility of CCL amongst pathologists. In our study, we observed weak diagnostic agreements regarding CCC and CCH between generalist pathologists and a breast pathology specialist. When we assessed FEA, the agreement was moderate with better agreement for lesions with more pronounced atypia. Our data differed from those of O'Malley et al. [7], who observed excellent agreement (Kappa index = 0.83) regarding the diagnoses of CCL without atypia and FEA. The agreement was better when detecting the absence of FEA (92.8%) than when confirming its presence (90.4%). However, in contrast to our study, the agreement was assessed in selected cases and amongst pathologists experienced in breast pathology using images of the cases with pre-established diagnostic criteria [7]. Haupt et al. analysed the diagnostic agreement regarding previously selected CCL cases between residents and fellows both before and after conducting a tutorial on the diagnostic criteria of CCL. Before conducting the tutorial, the diagnostic agreement of FEA was weak (Kappa index = 0.39); after the training, there was a statistically significant increase in the ability to recognise FEA (Kappa index = 0.60) [23]. A similar study was conducted by Tan et al., who analysed the diagnostic agreement of CCL amongst pathologists from the same department after a tutorial; the agreement obtained varied from weak to moderate (Kappa index range = 0.44-0.71) between the group of pathologists and the pathologist tutor [24].
The clinical importance of a correct diagnosis of FEA is that these lesions often coexist alongside LN, ADH, lowgrade DCIS, and low-grade invasive carcinomas such as tubular carcinomas [22,25]. In a recent meta-analysis, Verschuur-Maes et al. analysed 24 studies that reported the presence of carcinoma in situ after diagnosing CCL in needle biopsies. DCIS underestimation rates of 1.5%, 9%, and 20% were observed in cases of CCL without atypia, CCL with atypia, and CCL associated with ADH, respectively [26]. However, given the limitations of the studies and the large variation in the time of follow-up after the initial biopsy, the WHO Classification of Breast Tumours (2012) notes that it remains uncertain whether the indication of surgical excision is necessary after a diagnosis of FEA via core needle biopsy. Radiological and pathological correlations are recommended to determine the subsequent procedure [1]. Moreover, epidemiological studies such as that conducted by Boulos et al. have revealed that CCL are associated with only a relative 1.55-fold risk of developing invasive carcinomas in subsequent years; however, the risk associated with these lesions is not entirely independent of the risk associated with other concomitant proliferative lesions [27].
ADH and DCIS comprise 10% and 15-20%, respectively, of all breast lesions detected using mammography screening. The relative risk of developing invasive carcinoma ranges from 4 to 5-fold among patients with a diagnosis of ADH and from 8-to 10-fold among patients with DCIS [1,28]. However, the histopathological diagnoses of these intraductal proliferative lesions may be difficult, and the inter-and intra-observer agreements between the pathologists differ [5,29,30]. This diagnostic inconsistency could be primarily a result of the morphological criteria used to classify and appropriately select the diagnostic fields [30,31]. The correct diagnosis and the differentiation between ADH and DCIS have repercussions for the treatment of these lesions. When diagnosed via core biopsy, ADH lesions must be completely removed to search for DCIS in the excision specimen to avoid a missed detection of an invasive component. In cases in which the diagnosis of ADH was upheld after an extended biopsy, no further treatment is necessary. However, given the greater risk of progressing to an invasive carcinoma, DCIS has been treated by complete excision of the lesion with free margins, with complementary radiotherapy in cases for which breast-conserving surgery has been performed, and the use of tamoxifen as a prophylaxis against local recurrence [32].
In our study, the diagnostic agreements between the original reports and later reviews were moderate for ADH analysed the level of inter-observer agreement in the diagnoses of ADH and DCIS amongst 23 pathologists who used pre-defined diagnostic criteria. In this study, the Kappa indices were considered poor (0.35) for ADH and good (0.78) for DCIS. However, when DCIS cases were stratified by histological grades, significant variations were observed in the inter-observer diagnoses, as the Kappa indices were 0.51 for low-grade DCIS, 0.19 for intermediategrade DCIS, and 0.41 for high-grade DCIS [31]. However, in our study, we obtained a better agreement for the diagnosis of high-grade DCIS (Kappa index = 0.68). Similar findings have been described by other authors that conducted studies on the inter-observer variability with regard to the nuclear grade of DCIS. Sneige et al. evaluated the inter-observer variability among six pathologists who assessed 125 cases of DCIS and observed a better diagnostic agreement for high-grade DCIS [33,34]. The nuclear grade of DCIS is an important factor when determining the therapeutic approach because high nuclear-grade lesions are associated with a poor prognosis and are often associated with local recurrence and/or progression into invasive lesions, a greater chance of metastasis, and greater required care during local surgical procedures [4,28].
Various strategies have been used in an attempt to improve the diagnoses of ADH and DCIS, including a review of the diagnostic criteria and continuing education programs [33]. Recently, Jain et al. revealed that the agreement between nine pathologists regarding the diagnosis of ADH was poor (Kappa index = 0.34) and that an immunohistochemical analysis of cytokeratin (5, 14, 7 and 18) and p63 protein expression significantly improved the level of agreement among pathologists (Kappa index = 0.5) [30]. The influence of automated methods of interpretation [35] and the use of telepathology and virtual slides has been evaluated to improve the accuracy of diagnosis and as a tool for education, quality control, and second opinion in pathology [36,37].
As our study was retrospective, it includes some limitations. Proliferative lesions may or may not be associated with invasive carcinomas, a factor that might cause the generalist pathologist to underreport a case when faced with a more aggressive diagnosis. Although our data were based on pathological reports, upon assessing the aims of this study, we believe that this methodology is similar to that used in clinical practice.
Our study was the first to assess the diagnostic agreement regarding CCL and LN in cases that were sent for consultation according to reports by generalist pathologists. Interestingly, in our series, a total of 60% of cases were sent for consultation by breast surgeons and 5.2% of cases were sent by oncologists. Only 4.1% of cases were sent for consultation by pathologists. We believe that formal requests for second opinions regarding precursor and borderline breast lesions should be encouraged, especially amongst general pathologists, with the aim of reducing errors in diagnosis and thereby assuring appropriate therapeutic conduct and guaranteeing patient safety [4,5].

Conclusions
Our findings show a low degree of inter-observer diagnostic agreement between generalist pathologists and a specialist in breast pathology with regard to CCL without atypia and pleomorphic LCIS, moderate agreement for FEA, ADH, and low-grade, intermediate, and microinvasive DCIS, and good agreement for high-grade DCIS, ALH, and LCIS. We believe that the use of standardised diagnostic criteria and specific training in breast pathology might improve the reproducibility of these diagnoses, thereby improving the reliability of the pathological reports in the definition of the best therapeutic approach for each patient.