Evaluation of bone marrow aspirates in patients with acute myeloid leukemia at day 14 of induction therapy
Diagnostic Pathology volume 10, Article number: 122 (2015)
Early assessment of response to chemotherapy in acute myeloid leukemia may be performed by examining bone marrow aspirate (BMA) or biopsy (BMB); a hypocellular bone marrow sample indicates adequate anti-leukemic activity. We sought to evaluate the quantitative and qualitative assessment of BMA performed on day 14 (D14) of chemotherapy, to verify the inter-observer agreement, to compare the results of BMA and BMB, and to evaluate the impact of D14 blast clearance on the overall survival (OS).
A total of 107 patients who received standard induction chemotherapy and had bone marrow samples were included. BMA evaluation was performed by two observers using two methods: quantitative assessment and a qualitative (Likert) scale. ROC curves were obtained correlating the BMA quantification of blasts and the qualitative scale, by both observers, with BMB result as gold-standard.
There was a significant agreement between the two observers in both the qualitative and quantitative assessments (Kw = 0.737, p < 0.001, and rs = 0.798, p < 0.001; ICC = 0.836, p < 0.001, respectively). The areas under the curve (AUC) were 0.924 and 0.946 for observer 1 and 0.867 and 0.870 for observer 2 for assessments of the percentage of blasts and qualitative scale, respectively. The best cutoff for blast percentage in BMA was 6 % and 7 % for observers 1 and 2, respectively. A similar analysis for the qualitative scale showed the best cutoff as “probably infiltrated”. Patients who attained higher grades of cytoreduction on D14 had better OS.
Evaluation of D14 BMA using both methods had a significant agreement with BMB and between observers, identifying a population of patients with poor outcome.
The outcome of patients with acute myeloid leukemia (AML) has improved substantially over the past decades, thanks to the development of more aggressive therapies and better supportive care. However, a substantial proportion of patients still do not obtain complete remission (CR), and others eventually relapse after achieving CR [1–3]. In an attempt to stratify subgroups with different survival rates, several prognostic factors have been identified, including age, gender, baseline white blood cell count, lactic dehydrogenase serum level, immunophenotype, karyotypic abnormalities and genetic profiles [4–7].
In addition to baseline variables, early assessment of response to chemotherapy may help to define prognosis. Previous studies have shown an association between the lack of early blast clearance and failure to obtain CR after a first cycle of induction [8, 9]. This early assessment of treatment response is usually performed between the 14th (D14) and 17th day of the first cycle of induction chemotherapy, by analyzing the cellular content of the bone marrow aspirate (BMA) and/or biopsy (BMB). A hypocellular bone marrow sample suggests adequate anti-leukemic activity [8, 10]. However, its interpretation may be inaccurate because of different levels of expertise among pathologists and hematologists, and great variability in BMA and BMB sample quality. Furthermore, a BMA blast count above which poor response to chemotherapy is predicted has not been clearly defined, with values ranging from 5 % to 40 % [8–19]. By contrast, the BMB provides a better assessment of marrow cellularity, but the results are available only a few days after the BMA, delaying the decision to administer a second course of induction chemotherapy for non-responders.
Given these uncertainties, we sought to evaluate the quantitative and qualitative assessment of D14 BMA, to verify the inter-observer agreement, and to compare the results of BMA and BMB. We also assessed the impact of D14 blast clearance on the overall survival (OS).
Study population and treatment
All patients diagnosed with AML at University Hospital Clementino Fraga Filho, Universidade Federal do Rio de Janeiro (UFRJ), Brazil, from January 1979 to December 2008 were retrospectively evaluated. Entry criteria for this study included: a diagnosis of AML other than acute promyelocytic leukemia, no previous treatment at another institution, receipt of standard induction chemotherapy (cytarabine + anthracycline), and performance of BMA on D14 of induction chemotherapy. The study was approved by the local ethics committee (Hospital Clementino Fraga Filho/Universidade Federal do Rio de Janeiro, CAAE n°. 0094.0.197.000-09) and was conducted in accordance with the principles of the Declaration of Helsinki. Informed consent was not obtained because the study was retrospective and did not affect the healthcare of the included individuals. Moreover, confidentiality was preserved.
The diagnosis of AML was based on available procedures at the time, including BMA and BMB, and cytogenetic and immunophenotype analyses. Cases were classified according to the French-American-British (FAB) criteria. The treatment regimens changed over time (Table 1).
Bone marrow aspirate and biopsy
Routine assessments of BMA and BMB were performed on D14 of remission induction. Aspirate smears were prepared at the bedside and stained with Wright-Giemsa stain, and biopsy samples were fixed in 10 % buffered formalin and stained with hematoxylin and eosin. Patients with persistent disease according to D14 assessment received a second cycle of induction as early as possible [2, 13]. All glass slides were kept in storage units in the hospital archives.
We reviewed all available slides from BMA performed at diagnosis and on D14. The analysis was performed by two independent observers (board-certified hematologists), blinded to patient identification and outcome. The evaluation included confirmation of the initial diagnosis of AML and identification of D14 residual leukemia in a quantitative (percentage) and qualitative (scale) manner. Quantitative evaluation was performed by counting the percentage of blasts in 200 nucleated marrow cells. The qualitative assessment was determined by stratification on a Likert scale of five categories: definitely infiltrated, probably infiltrated, doubtful, probably free and definitely free.
The results of D14 BMB were obtained by reviewing patients’ medical records and registries from the Pathology Service of the hospital. The reports were categorized as aplastic (leukemia free) or infiltrated.
The qualitative assessment of blasts was first treated as an ordinal categorical variable and later grouped into two categories and treated as a dichotomous categorical variable. Agreement between the two observers was assessed using the kappa coefficient (Cohen’s kappa) and the quadratic weighted kappa coefficient (Kw). The kappa coefficient may range from −1 (complete disagreement) to +1 (complete agreement), and the agreement is usually classified as poor (below 0), mild (0 to 0.2), low (0.21 to 0.4), moderate (0.41 to 0.6), substantial (0.61 to 0.8) and almost perfect (0.81 to 1.00). Further evaluation of the marginal homogeneity of proportions was performed with the McNemar test for dichotomous categorical variables and the modified McNemar test for ordinal categorical variables. In both tests, a significant p value (<0.05) indicates excessive variation between observers.
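The agreement statistics described above can be illustrated with a minimal sketch. It assumes a square table of counts in which rows are observer 1's category and columns are observer 2's; the function name `weighted_kappa` and the example table are hypothetical and are not taken from the study data.

```python
from typing import List


def weighted_kappa(table: List[List[int]], weighting: str = "quadratic") -> float:
    """Cohen's kappa for a k x k agreement table (k >= 2).

    table[i][j] = cases rated category i by observer 1 and j by observer 2.
    weighting: "none" (unweighted kappa), "linear" or "quadratic".
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i: int, j: int) -> float:
        # Weight of a disagreement between ordered categories i and j.
        if weighting == "none":
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weighting == "linear" else d * d

    observed = sum(disagreement(i, j) * table[i][j]
                   for i in range(k) for j in range(k)) / n
    expected = sum(disagreement(i, j) * row_tot[i] * col_tot[j]
                   for i in range(k) for j in range(k)) / (n * n)
    return 1.0 - observed / expected


# Hypothetical 5-category table (definitely free ... definitely infiltrated).
example = [
    [20, 4, 1, 0, 0],
    [3, 15, 3, 1, 0],
    [1, 2, 6, 2, 0],
    [0, 1, 2, 10, 3],
    [0, 0, 1, 4, 28],
]
print(round(weighted_kappa(example, "quadratic"), 3))
```

With quadratic weights, a disagreement of several categories is penalized more heavily than a disagreement between adjacent categories, which suits an ordinal scale such as the one used here.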
The quantitative assessment of blasts was treated as a discrete variable with a non-normal distribution; comparisons between observers were performed with Spearman’s correlation coefficient (rs). Measurements between observers were also compared using the intraclass correlation coefficient (ICC) and the Bland and Altman method.
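As a rough illustration of two of these measures, the sketch below computes a Bland and Altman bias with limits of agreement and a Spearman coefficient with average ranks for ties. The function names and the blast percentages are hypothetical, and the confidence interval uses a normal approximation (1.96), which is an assumption rather than the software setting used in the study.

```python
import math
import statistics


def bland_altman(obs1, obs2):
    """Bias, 95 % limits of agreement and approximate 95 % CI of the bias."""
    diffs = [a - b for a, b in zip(obs1, obs2)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)        # sample SD of the paired differences
    se = sd / math.sqrt(len(diffs))     # standard error of the mean difference
    return {
        "bias": bias,
        "limits_of_agreement": (bias - 1.96 * sd, bias + 1.96 * sd),
        "bias_ci95": (bias - 1.96 * se, bias + 1.96 * se),
    }


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = statistics.mean(rx), statistics.mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical D14 blast percentages from two observers (not study data).
observer1 = [4, 10, 2, 30, 6]
observer2 = [6, 14, 2, 40, 8]
result = bland_altman(observer1, observer2)
rho = spearman(observer1, observer2)
print(result["bias"], rho)
```

The two measures answer different questions: the Spearman coefficient only checks that both observers rank the samples in a similar order, whereas the Bland and Altman bias shows whether one observer systematically counts more blasts than the other.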
The D14 BMA evaluation was compared with the BMB (considered as “gold standard”) using receiver operating characteristic (ROC) curves to assess the best cut-off point in terms of sensitivity, specificity and accuracy. The areas under the ROC curves (AUC) were compared using the method of DeLong. OS was defined as the time from diagnosis to death from any cause or last follow-up. Survival curves were estimated with the Kaplan-Meier method and differences were compared with the log-rank test. Multivariate analysis for OS was conducted using a Cox model and hazard ratios (HR) were obtained for each observer. All tests were 2-sided, and p values <0.05 were considered statistically significant. Statistical analyses were performed using SPSS 11.0 (SPSS Inc., 1989–2001), MedCalc 11.3 and MH Program 1.2142.
Of 295 patients with AML identified in the hospital records, 119 fulfilled entry criteria. Among these 119 patients who had a BMA on D14, we could recover 107 sets of BMA smears, containing samples from diagnosis and the D14 assessment. The median age was 38 years (range 12–77), 12 % were >60 years old and 58 % were male. In addition, we were able to compare D14 BMA and BMB in 82 patients.
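The ROC step can be sketched as follows. The text does not state which criterion selected the best cut-off, so the use of the Youden index (sensitivity + specificity − 1) here is an assumption, as are the function names and the toy data; a sample is called positive when its blast percentage is at or above the threshold.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case scores above a negative one.

    labels: 1 = infiltrated on the reference test, 0 = leukemia free.
    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores, labels):
    """Threshold maximizing the Youden index; returns (cutoff, J, sens, spec)."""
    pos_total = sum(labels)
    neg_total = len(labels) - pos_total
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens = tp / pos_total
        spec = tn / neg_total
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (t, j, sens, spec)
    return best


# Hypothetical blast percentages and biopsy results (not study data).
blasts = [0, 1, 2, 3, 4, 5, 8, 12, 30, 60]
infiltrated = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
auc = roc_auc(blasts, infiltrated)
cutoff, youden, sens, spec = best_cutoff(blasts, infiltrated)
print(auc, cutoff, sens, spec)
```

Comparing two correlated AUCs, as done in the study with the DeLong method, additionally requires a variance estimate for the difference and is not covered by this sketch.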
Agreement analysis between observers
The comparison between observers of D14 BMA evaluation using the qualitative scale is shown in Table 2. The quadratic weighted kappa coefficient was 0.74 (95 % confidence interval [95 % CI] 0.64 - 0.83, p < 0.001), and no bias was observed (p = 0.8, modified McNemar test). Typical qualitative categories are shown in Fig. 1.
The median blast count on D14 was 4 % and 6 % for observers 1 and 2, respectively, with a Spearman correlation coefficient of 0.798 (p < 0.001) (Fig. 2), and an ICC between assessments of 0.836 (95 % CI 0.768 - 0.885, p < 0.001). The average difference between the observers' measurements of the percentage of blasts, according to the Bland and Altman method, was 5.01 % (95 % CI 2.39 - 7.63).
Comparison of bone marrow aspiration and bone marrow biopsy on D14
The evaluation of BMB on D14 showed 33 patients with bone marrow infiltration and 49 free of leukemia. Table 3 shows the distribution of the categories of the qualitative scale according to the BMB status. We observed an association between the categories of definitely free and probably free and a leukemia-free BMB, and between the categories of definitely infiltrated and probably infiltrated and an infiltrated BMB (85.4 % for observer 1 and 75.6 % for observer 2). Doubtful results of BMA mainly corresponded to leukemia-free BMB for both observers.
Figure 3 shows the ROC curves correlating the BMA quantification of blasts and qualitative scale, by both observers, according to BMB results. The AUCs for the quantitative and qualitative assessments were 0.924 and 0.946 for observer 1, and 0.867 and 0.870 for observer 2, respectively. We also compared the ROC curves of the quantitative and qualitative analysis of each observer. The difference in AUCs was 0.025 for observer 1 (p = 0.22) and 0.002 for observer 2 (p = 0.97).
Determining the best cut-off points
The best cut-off point for blast percentage in BMA was 6 % for observer 1 (AUC 0.883, 84.9 % sensitivity, 91.8 % specificity, and 89.9 % accuracy), and 7 % for observer 2 (AUC 0.858, 81.8 % sensitivity, 89.8 % specificity, and 86.6 % accuracy). A similar analysis for the Likert scale showed the best cutoff point as the 4th item of the scale (probably infiltrated) for both observers: AUC 0.898, 87.9 % sensitivity, 91.8 % specificity, and 90.2 % accuracy for observer 1, and AUC 0.818, 69.7 % sensitivity, 93.9 % specificity, and 84.1 % accuracy for observer 2.
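To make the three reported measures concrete, the sketch below derives them from a 2 x 2 table against the biopsy result. The counts are an assumption: they were back-calculated from the percentages reported for observer 2 at the 7 % cut-off and from the split of 33 infiltrated and 49 leukemia-free biopsies, and were not read from the study tables. The function name is hypothetical.

```python
def diagnostic_measures(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from a 2 x 2 table.

    tp/fn: infiltrated biopsies called infiltrated/free on the aspirate.
    tn/fp: leukemia-free biopsies called free/infiltrated on the aspirate.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (back-calculated, see the note above): 33 infiltrated and
# 49 leukemia-free biopsies, 82 patients in total.
sens, spec, acc = diagnostic_measures(tp=27, fn=6, tn=44, fp=5)
print(f"sensitivity {100 * sens:.1f} %, "
      f"specificity {100 * spec:.1f} %, accuracy {100 * acc:.1f} %")
```

Because accuracy pools both groups, it depends on the proportion of infiltrated marrows in the sample, whereas sensitivity and specificity do not.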
Based on the best cut-off point of the qualitative assessment, we divided the five categories of the scale into two: “free” and “infiltrated”. The first groups the categories definitely free, probably free and doubtful, while the second groups the categories probably infiltrated and definitely infiltrated. The kappa coefficient for the comparison between observers was 0.66 (95 % CI 0.51 - 0.80, p < 0.001), with no bias per McNemar test (p = 0.1) (Table 4).
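The grouping and the bias check can be sketched as below. The ratings are hypothetical, and the exact (binomial) form of the McNemar test is an assumption, since the text does not say whether an exact or a chi-square version was used.

```python
from math import comb

INFILTRATED = {"probably infiltrated", "definitely infiltrated"}


def dichotomize(category):
    """Collapse the five-category scale at the 'probably infiltrated' cut-off."""
    return "infiltrated" if category in INFILTRATED else "free"


def mcnemar_exact(ratings1, ratings2):
    """Two-sided exact McNemar p value computed on the discordant pairs."""
    pairs = [(dichotomize(a), dichotomize(b)) for a, b in zip(ratings1, ratings2)]
    b = sum(1 for x, y in pairs if x == "infiltrated" and y == "free")
    c = sum(1 for x, y in pairs if x == "free" and y == "infiltrated")
    n = b + c
    if n == 0:
        return 1.0
    # Binomial tail with probability 0.5, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical ratings of eight smears by two observers (not study data).
obs1 = ["definitely infiltrated", "probably infiltrated", "doubtful",
        "probably free", "definitely free", "doubtful",
        "probably infiltrated", "definitely free"]
obs2 = ["definitely infiltrated", "doubtful", "probably infiltrated",
        "probably infiltrated", "definitely free", "definitely infiltrated",
        "probably infiltrated", "probably free"]
p_value = mcnemar_exact(obs1, obs2)
print(p_value)
```

Only the pairs on which the observers disagree enter the test, so a non-significant p value says that neither observer calls "infiltrated" systematically more often than the other; it does not measure how often they agree, which is what kappa reports.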
Impact of D14 blasts on survival
Five-year OS was significantly longer in patients with <5 % blasts on D14 for both observers (Fig. 4). With the Likert scale, a better outcome in patients with lower grades of marrow involvement was also observed (Fig. 5). The same results were obtained among 55 patients in CR who received two or more cycles of intensification (Fig. 6). Other variables detected as prognostic factors by univariate analysis were: age >60 years, year of diagnosis, treatment delay >7 days from diagnosis, presence of comorbidities, previous cardiac disease, hepatomegaly, active bleeding, gastrointestinal infection and FAB subtype M2 (p < 0.05) (Table 5).
Predictors of poor outcome (lower OS) by multivariate analysis, with HR obtained respectively for observers 1 and 2, were age >60 years [HR = 4.67 (95 % CI 1.91-11.4) and 4.36 (95 % CI 1.79-10.61)], the presence of active bleeding at diagnosis [HR = 2.37 (95 % CI 1.18-4.74) and 2.05 (95 % CI 1.01-4.13)] and residual D14 blasts on the Likert scale [HR = 1.42 (95 % CI 1.11-1.81) and 1.43 (95 % CI 1.11-1.92)] (Table 6).
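The survival curves behind these comparisons rest on the Kaplan-Meier estimator, which can be sketched as below. The follow-up times and event indicators are hypothetical, and the function name is an assumption; the log-rank test and the Cox model used in the study are not reproduced here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times: follow-up time for each patient.
    events: 1 if the patient died at that time, 0 if censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months (not study data).
months = [2, 3, 3, 5, 8]
died = [1, 1, 0, 1, 0]
curve = kaplan_meier(months, died)
print(curve)
```

Patients censored at a given time are still counted as at risk for deaths occurring at that same time, which is the usual convention.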
In this study we found substantial agreement between observers using two different methods: a quantitative assessment, with the determination of the percentage of bone marrow blasts, and a qualitative assessment, based on the perception of marrow infiltration. In addition, a cutoff value of 6-7 % of blasts in the quantitative assessment and “probably infiltrated” marrow in the qualitative assessment was established, with good discriminatory power to identify patients with infiltrated BMB. Moreover, we observed a higher OS in patients who obtained higher grades of cytoreduction by day 14 marrow evaluation.
While risk assessment in AML relies mainly on age and cytogenetic profile, the assessment of in vivo chemosensitivity by determining early response to induction therapy is an additional predictive marker. Indeed, this parameter has been used to guide clinicians in deciding on an early second cycle of chemotherapy [13, 28, 29]. However, the type of D14 bone marrow evaluation (BMA, BMB or both) has varied, with some studies relying on BMA [8, 16], others using BMB, and occasionally no clear information being provided [9, 10, 17, 19].
In our study we observed that the qualitative and the quantitative methods were equally predictive of BMB results, with a substantial inter-observer agreement. Bone marrow evaluation by more than one observer has been previously reported [16, 17], but to the best of our knowledge, our study is the first to report the assessment of inter-observer agreement.
Another point of controversy is the cutoff value of blast cell percentage in the quantitative assessment of BMA. Different studies have established cutoff values ranging from 5 % [9, 10, 30, 31] and 10 % [8, 9, 17] to 15-22 % and even 40 %. These variations are also present in published guidelines: <5 %, <5-10 %, <10-15 % and hypoplasia or aplasia (without defining a numerical value). We established a cutoff value of 6-7 % (inter-observer variation), which is in the range of previous studies, and identified that the qualitative categories of definitely and probably infiltrated were predictive of residual leukemia on BMB.
All analyses of response assessment by D14 BMA by both methods (qualitative and quantitative) and both observers resulted in higher specificity than sensitivity. Likewise, the concordance between observers was very good for “definitely/probably infiltrated”, but not so good for “definitely/probably free”. Therefore, there is no debate that a large number of leukemic blasts on day 14 constitutes unequivocal evidence of residual leukemia. However, the presence of a few blasts in a paucicellular or hemodiluted marrow sample cannot be considered definite evidence of residual disease. Indeed, most guidelines recommend a second induction cycle for unequivocal residual disease, and most dilemmas occur in patients with a low blast count (5-15 %).
Few previous studies have shown an association between D14 marrow findings and long-term outcome [8, 9, 10, 17, 30]. In the present study, multivariate analysis showed that the evaluation of the bone marrow infiltration by Likert scale (but not the percentage assessment) was significantly associated with poor outcome.
Our study shares the limitations of all retrospective studies. It was not possible to recover D14 BMA and BMB slides from all cases. In addition, survival analysis was performed without the inclusion of well-known prognostic factors such as chromosomal and molecular abnormalities. Finally, we did not analyze the potential effect of the different induction regimens given throughout the study period or of the number of patients enrolled over the study period. Despite these limitations, we were able to show that BMA may be considered the procedure of choice to assess treatment response on D14 because it provides results immediately and exhibits good agreement between observers and good correlation with BMB and OS.
We conclude that the assessment of BMA on day 14 of remission induction chemotherapy in patients with AML is a reproducible test, with substantial agreement between observers both quantitatively and qualitatively, and correlates well with BMB and with OS. A blast percentage cut-off of 6-7 % or a rating of “probably infiltrated” may help in the early identification of a population of patients with unfavorable prognosis.
Estey E, Dohner H. Acute myeloid leukaemia. Lancet. 2006;368:1894–907.
Milligan DW, Grimwade D, Cullis JO, Bond L, Swirsky D, Craddock C, et al. Guidelines on the management of acute myeloid leukaemia in adults. Br J Haematol. 2006;135:450–74.
Tallman MS, Gilliland DG, Rowe JM. Drug therapy for acute myeloid leukemia. Blood. 2005;106:1154–63.
Byrd JC, Mrozek K, Dodge RK, Carroll AJ, Edwards CG, Arthur DC, et al. Pretreatment cytogenetic abnormalities are predictive of induction success, cumulative incidence of relapse, and overall survival in adult patients with de novo acute myeloid leukemia: results from Cancer and Leukemia Group B (CALGB 8461). Blood. 2002;100:4325–36.
Grimwade D, Walker H, Oliver F, Wheatley K, Harrison C, Harrison G, et al. The importance of diagnostic cytogenetics on outcome in AML: analysis of 1,612 patients entered into the MRC AML 10 trial. The Medical Research Council Adult and Children’s Leukaemia Working Parties. Blood. 1998;92:2322–33.
Slovak ML, Kopecky KJ, Cassileth PA, Harrington DH, Theil KS, Mohamed A, et al. Karyotypic analysis predicts outcome of preremission and postremission therapy in adult acute myeloid leukemia: a Southwest Oncology Group/Eastern Cooperative Oncology Group Study. Blood. 2000;96:4075–83.
Patel JP, Gonen M, Figueroa ME, Fernandez H, Sun Z, Racevskis J, et al. Prognostic relevance of integrated genetic profiling in acute myeloid leukemia. N Engl J Med. 2012;366:1079–89.
Kern W, Haferlach T, Schoch C, Loffler H, Gassmann W, Heinecke A, et al. Early blast clearance by remission induction therapy is a major independent prognostic factor for both achievement of complete remission and long-term outcome in acute myeloid leukemia: data from the German AML Cooperative Group (AMLCG) 1992 Trial. Blood. 2003;101:64–70.
Wheatley K, Burnett AK, Goldstone AH, Gray RG, Hann IM, Harrison CJ, et al. A simple, robust, validated and highly predictive index for the determination of risk-directed therapy in acute myeloid leukaemia derived from the MRC AML 10 trial. United Kingdom Medical Research Council’s Adult and Childhood Leukaemia Working Parties. Br J Haematol. 1999;107:69–79.
Buchner T, Hiddemann W, Wormann B, Loffler H, Gassmann W, Haferlach T. Double induction strategy for acute myeloid leukemia: the effect of high-dose cytarabine with mitoxantrone instead of standard-dose cytarabine with daunorubicin and 6-thioguanine: a randomized trial by the German AML Cooperative Group. Blood. 1999;93:4116–24.
Cheson BD, Bennett JM, Kopecky KJ, Buchner T, Willman CL, Estey EH. Revised recommendations of the International Working Group for Diagnosis, Standardization of Response Criteria, Treatment Outcomes, and Reporting Standards for Therapeutic Trials in Acute Myeloid Leukemia. J Clin Oncol. 2003;21:4642–9.
O’Donnell MR, Tallman MS, Abboud CN, Altman JK, Appelbaum FR, Arber DA, et al. Acute Myeloid Leukemia (Version 2.2013). J Natl Compr Canc Netw. 2013;11:1047–55.
Morra E, Barosi G, Bosi A, Ferrara F, Locatelli F, Marchetti M, et al. Clinical management of primary non-acute promyelocytic leukemia acute myeloid leukemia: Practice Guidelines by the Italian Society of Hematology, the Italian Society of Experimental Hematology, and the Italian Group for Bone Marrow Transplantation. Haematologica. 2009;94:102–12.
Dohner H, Estey EH, Amadori S, Appelbaum FR, Buchner T, Burnett AK, et al. Diagnosis and management of acute myeloid leukemia in adults: recommendations from an international expert panel, on behalf of the European LeukemiaNet. Blood. 2010;115:453–74.
Preisler HD, Priore R, Azarnia N, Barcos M, Raza A, Rakowski I, et al. Prediction of response of patients with acute nonlymphocytic leukaemia to remission induction therapy: use of clinical measurements. Br J Haematol. 1986;63:625–36.
Liso V, Albano F, Pastore D, Carluccio P, Mele G, Lamacchia M, et al. Bone marrow aspirate on the 14th day of induction treatment as a prognostic tool in de novo adult acute myeloid leukemia. Haematologica. 2000;85:1285–90.
Xiao Z, Xue H, Li R, Zhang L, Yu M, Hao Y. The prognostic significance of leukemic cells clearance kinetics evaluation during the initial course of induction therapy with HAD (homoharringtonine, cytosine arabinoside, daunorubicin) in patients with de novo acute myeloid leukemia. Am J Hematol. 2008;83:203–5.
Hussein K, Jahagirdar B, Gupta P, Burns L, Larsen K, Weisdorf D. Day 14 bone marrow biopsy in predicting complete remission and survival in acute myeloid leukemia. Am J Hematol. 2008;83:446–50.
Yanada M, Borthakur G, Ravandi F, Bueso-Ramos C, Kantarjian H, Estey E. Kinetics of bone marrow blasts during induction and achievement of complete remission in acute myeloid leukemia. Haematologica. 2008;93:1263–5.
Gruppo RA, Lampkin BC, Granger S. Bone marrow cellularity determination: comparison of the biopsy, aspirate, and buffy coat. Blood. 1977;49:29–31.
Bennett JM, Catovsky D, Daniel MT, Flandrin G, Galton DA, Gralnick HR, et al. Proposed revised criteria for the classification of acute myeloid leukemia. A report of the French-American-British Cooperative Group. Ann Intern Med. 1985;103:620–5.
Souto Filho JT, Portugal RD, Loureiro M, Pulcheri W, Nucci M. Characterization and analysis of the outcome of adults with acute myeloid leukemia treated in a Brazilian University hospital over three decades. Braz J Med Biol Res. 2011;44:660–5.
Likert R. A Technique for the Measurement of Attitudes. Archives of Psychology. 1932;140:1–55.
Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–74.
Ludbrook J. Statistical techniques for comparing measurers and methods of measurement: a critical review. Clin Exp Pharmacol Physiol. 2002;29:527–36.
Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Int J Nurs Stud. 2010;47:931–6.
DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–45.
Fernandez HF, Sun Z, Yao X, Litzow MR, Luger SM, Paietta EM. Anthracycline dose intensification in acute myeloid leukemia. N Engl J Med. 2009;361:1249–59.
Ferrara F, Izzo T, Criscuolo C, Riccardi C, Celentano M, Mele G. Day 15 bone marrow driven double induction in young adult patients with acute myeloid leukemia: feasibility, toxicity, and therapeutic results. Am J Hematol. 2010;85:687–90.
Bertoli S, Bories P, Béné MC, Daliphard S, Lioure B, Pigneux A, et al. Prognostic impact of day 15 blast clearance in risk-adapted remission induction chemotherapy for younger patients with acute myeloid leukemia: long-term results of the multicenter prospective LAM-2001 trial by the GOELAMS study group. Haematologica. 2014;99:46–53.
Heil G, Krauter J, Raghavachar A, Bergmann L, Hoelzer D, Fiedler W, et al. Risk-adapted induction and consolidation therapy in adults with de novo AML aged ≤ 60 years: results of a prospective multicenter trial. Ann Hematol. 2004;83(6):336–44.
Pullarkat V, Aldoss I. Prognostic and therapeutic implications of early treatment response assessment in acute myeloid leukemia. Crit Rev Oncol Hematol. 2015;95(1):38–45.
The authors did not receive any funding for this study.
The authors declare that they have no competing interests.
JTSF: data analysis and drafting of article; MML: design, critical revision of article and approval of article; WP: critical revision of article and approval of article; JCM: critical revision of article and approval of article; MN: design, critical revision of article and approval of article; RDP: design, data analysis, critical revision of article and approval of article. All authors read and approved the final manuscript.
Cite this article
Souto Filho, J.T.D., Loureiro, M.M., Pulcheri, W. et al. Evaluation of bone marrow aspirates in patients with acute myeloid leukemia at day 14 of induction therapy. Diagn Pathol 10, 122 (2015). https://doi.org/10.1186/s13000-015-0365-2
- Acute myeloid leukemia
- Bone marrow
- Blasts counting