Discriminant analysis of intermediate brain atrophy rates in longitudinal diagnosis of Alzheimer's disease
 Ali Farzan^{4},
 Syamsiah Mashohor^{1, 3},
 Rahman Ramli^{1, 3} and
 Rozi Mahmud^{2}
DOI: 10.1186/1746-1596-6-105
© Farzan et al; licensee BioMed Central Ltd. 2011
Received: 9 May 2011
Accepted: 28 October 2011
Published: 28 October 2011
Abstract
Diagnosing Alzheimer's disease through MRI neuroimaging biomarkers has been used as a complement to traditional clinical markers to improve diagnostic accuracy and to help in developing new pharmacotherapeutic trials. Longitudinal analysis of whole-brain atrophy has been shown to discriminate between Alzheimer's disease and elderly normal controls. This work examines the effect of including intermediate atrophy rates, and of using uncorrelated principal components of these features instead of the original ones, on discriminating normal controls from Alzheimer's disease subjects. Linear discriminant analysis of atrophy rates is used to classify subjects as Alzheimer's disease or controls. Leave-one-out cross-validation is adopted to evaluate the generalization rate of the classifier along with its memorization. Results show that the uncorrelated version of the intermediate features yields the same memorization performance as the original ones but a higher generalization rate. In conclusion, in a longitudinal study, using intermediate MRI scans and transferring them to an uncorrelated feature space can improve diagnostic accuracy.
Keywords
Alzheimer's disease diagnostic, discriminant analysis, neuroimaging, whole brain atrophy, principal component analysis
1. Introduction
Clinical diagnosis of AD is traditionally based on the last two biomarkers, and standard measures such as the Mini-Mental State Examination (MMSE), Clinical Dementia Rating (CDR), Functional Assessment Staging Scale (FAST), Global Deterioration Scale (GDS), or Alzheimer's Disease Assessment Scale (ADAS) are used to diagnose AD clinically. These measures are useful only in the second and third stages of the disease and cannot be used in the first stage, where there is no manifest behavioral or memory impairment [3, 4]. Furthermore, these scores alone are not accurate enough, and complementary biomarkers are needed for accurate diagnosis of AD [4, 5]. The need to monitor disease progression when designing new therapeutic trials encourages researchers to find accurate, noninvasive biomarkers of AD [6, 7]. Because of their high resolution and noninvasive nature, MR images are good candidates for revealing degeneration of brain structures and relating it to disease progression [6]. Various anatomical structures of the brain, such as the entorhinal cortex [7–9], hippocampus [10, 11], and cerebral cortex [12–14], are affected by AD, and their atrophic characteristics, such as volume, shape, and thickness, can be used as biomarkers of AD [6, 12, 15, 16]. Focusing on individual anatomical structures has a limitation, however: disease-related atrophy does not necessarily follow anatomical boundaries, and any part of the brain can change under the influence of the disease.
Several methods for measuring brain atrophy appear in the literature, but only three of them are validated. Boundary Shift Integral (BSI) [20, 21], Structural Image Evaluation Using Normalization of Atrophy (SIENA) [22], and its cross-sectional counterpart (SIENAX) [18] are the most accurate and broadly accepted methods for evaluating the atrophy rate of the brain. Research shows that SIENA has the same accuracy as BSI, so either method is a fair choice for measuring the whole-brain atrophy rate in a two-year longitudinal study; the differences between the two measures have no effect on the pathological discrimination power of the method.
To measure the whole-brain atrophy rate, the pipeline developed by Smith et al. is used in this paper [18, 23–28]. The first step in this pipeline is brain surface extraction, which separates the brain from non-brain parts such as the skull and scalp in both images of the longitudinal pair. To do so, a deformable tessellated mesh is used, which deforms under the control of local parameters until it fits the brain surface [27]. Afterward, the baseline images must be registered to their follow-up counterparts. In this step, it is necessary to avoid rescaling artifacts that could change the measured atrophy. With this in mind, the skull size is assumed to be constant and is used as the normalization factor in the scaling process. Nonlinear registration matches the images as closely as possible and would thereby eliminate the atrophic differences between them, so linear registration is preferred in this study [26].
The next step is to measure the differences between the images. To this end, the brain images are segmented into their three major tissue classes: gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) [29]. Boundary points of these tissues are used to measure the difference between the images. A 3 by 3 gradient operator is used to find the gradient at each of these points. In a point-by-point comparison of 3 mm intensity profiles along these gradients, the shift distance that maximizes the correlation between the two profiles is taken as the difference measure. The normalized sum of these measures over all boundary points indicates the overall difference between the brain volumes and is called the Percentage of Brain Volume Change (PBVC) [22].
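The profile-matching idea described above can be illustrated with a small one-dimensional sketch. It is a simplification under stated assumptions, not the SIENA implementation: the profiles are hypothetical, shifts are whole samples rather than sub-voxel distances, and the function name `best_shift` is invented for illustration.

```python
import numpy as np


def best_shift(profile_a, profile_b, max_shift):
    """Return (shift, correlation) for the integer shift of profile_b,
    in samples, that maximizes its Pearson correlation with profile_a."""
    a = np.asarray(profile_a, dtype=float)
    b = np.asarray(profile_b, dtype=float)
    best, best_r = 0, -np.inf
    for s in range(-max_shift, max_shift + 1):
        # Keep only the samples where the two profiles overlap at shift s.
        if s >= 0:
            x, y = a[: len(a) - s], b[s:]
        else:
            x, y = a[-s:], b[: len(b) + s]
        if len(x) < 3 or x.std() == 0 or y.std() == 0:
            continue  # correlation is undefined for flat or tiny segments
        r = np.corrcoef(x, y)[0, 1]
        if r > best_r:
            best, best_r = s, r
    return best, best_r


# Hypothetical 1-D profiles: the same bump, displaced by 3 samples.
positions = np.arange(31)
baseline = np.exp(-((positions - 10) ** 2) / 8.0)
follow_up = np.exp(-((positions - 13) ** 2) / 8.0)
shift, corr = best_shift(baseline, follow_up, max_shift=6)
```

In this toy case the recovered shift equals the displacement built into the second profile; in a real pipeline one such displacement would be estimated at every boundary point and then averaged and normalized.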
Magnetic resonance images (MRI) from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database are used in this study [30]. The percentage of brain volume change is evaluated pairwise between the baseline, 6-month, and 24-month scans. These three atrophy rates are used as features in discriminant analysis (DA). Because the features are highly correlated, principal component analysis (PCA) is used to convert the feature space to an uncorrelated one and, at the same time, to reduce its dimension. The discriminative power of the resulting features is compared with that of the original ones.
2. Materials and methods
2.1. Subjects
A total of 30 AD patients (46.7% female; mean age 75 years, standard deviation 7) and 30 age-matched healthy normal controls (50% female; mean age 77 years, standard deviation 5) are selected from the ADNI public database http://www.loni.ucla.edu/ADNI/Data/. ADNI is a large five-year study launched in 2004 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies, and nonprofit organizations, as a $60 million public-private partnership. The primary goal of ADNI has been to test whether serial MRI, PET, other biological markers, and clinical and neuropsychological assessments acquired at multiple sites (as in a typical clinical trial) can replicate results from smaller single-site studies measuring the progression of MCI and early AD. Determining sensitive and definite markers of very early AD progression is intended to help researchers and clinicians monitor the effectiveness of new treatments and reduce the time and cost of clinical trials. The Principal Investigator of this initiative is Michael W. Weiner, M.D., VA Medical Center and University of California, San Francisco.
All AD and NC subjects in this study had successfully undergone MRI scanning, cognitive tests, and clinical evaluation at baseline and at the 6-month and 2-year follow-ups.
2.2. Statistical analysis
Demographic and clinical variables by diagnostic group
Variable  NC (n = 30)  AD (n = 30)  p  Total

Gender (M/F)  15/15^{a}  16/14  0.796
Age (M/SD)  77/5^{a}  75/7  0.188
Years of education (M/SD)  16.2/2.9^{a}  15.7/2.7  0.554
Baseline MMSE (M/SD)  29.3/0.8^{b}  23.5/2.2  < 0.00001
PbvcSc6 (M/SD)  0.36/0.59^{b}  0.98/0.95  0.005  0.67/0.87
Pbvc624 (M/SD)  1.24/0.89^{b}  3.11/1.23  < 0.00001  2.17/1.43
PbvcSc24 (M/SD)  1.65/1.05^{b}  4.13/1.85  < 0.00001  2.88/1.95
These results confirm that the two groups differ in longitudinal volume change, but they do not specify how to classify an individual subject into one of the groups on the basis of these features.
DA is a statistical technique used to differentiate groups when the underlying features are quantitative and normally distributed [31]. It is an appropriate method for classifying subjects' feature patterns into the two groups of interest, AD and NC.
2.3. Discriminant analysis
Normality test of atrophy rates using the Kolmogorov-Smirnov method
Feature  NC  AD

PbvcSc6  0.200*  0.125
Pbvc624  0.200*  0.200*
PbvcSc24  0.200*  0.200*
The simplest approach is to use the total mean of each feature as a threshold value. Patterns with feature values above the threshold are assigned to one group, and those below it to the other.
Classification based on total mean thresholding
Feature  Threshold value  Sensitivity  Specificity  Accuracy

PbvcSc6 (M/SD)  0.66752  50%  60%  55%
Pbvc624 (M/SD)  2.17367  76.66%  83.33%  80%
PbvcSc24 (M/SD)  2.88472  83.33%  93.33%  88.33%
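Total-mean thresholding as described above can be sketched in a few lines. The atrophy values below are hypothetical and expressed as magnitudes, the label coding (1 for AD, 0 for NC) and the function name are assumptions made for illustration, and the rule "above the threshold means AD" is one reading of the description.

```python
def threshold_classify(values, labels, threshold):
    """Predict AD (1) when the atrophy magnitude exceeds the threshold and
    NC (0) otherwise; return (sensitivity, specificity, accuracy)."""
    tp = tn = fp = fn = 0
    for value, label in zip(values, labels):
        predicted = 1 if value > threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        elif predicted == 1 and label == 0:
            fp += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn)   # share of AD subjects detected
    specificity = tn / (tn + fp)   # share of NC subjects correctly cleared
    accuracy = (tp + tn) / len(values)
    return sensitivity, specificity, accuracy


# Hypothetical atrophy magnitudes: four NC subjects, then four AD subjects.
values = [0.5, 1.0, 1.5, 3.5, 2.5, 4.0, 5.0, 1.2]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
total_mean = sum(values) / len(values)   # the threshold is the pooled mean
sens, spec, acc = threshold_classify(values, labels, total_mean)
```

Because the threshold is computed from the same subjects it classifies, the resulting figures describe memorization only; they say nothing about performance on unseen subjects.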
Cross-validation results
Original group  Predicted NC  Predicted AD

NC  90%  10%
AD  20%  80%
Correlation coefficients
Feature  PbvcSc6  Pbvc624  PbvcSc24

PbvcSc6  1  0.394  0.749
Pbvc624  0.394  1  0.899
PbvcSc24  0.749  0.899  1
PbvcSc24 is clearly highly correlated with PbvcSc6 and Pbvc624, which violates the assumptions of the analysis. To overcome this, principal component analysis (PCA) is used to convert the features into uncorrelated ones. There are two main steps in conducting PCA:

Step 1: Assessment of data suitability
KMO and Bartlett's Test
KMO Measure of Sampling Adequacy  0.221  0.646 

Bartlett's Test of Sphericity  Approx. Chi-Square  292.451
df  3  
Sig.  < 0.00001 
According to these measures, the factorability of the data is confirmed. For the relationship between features to be strong enough, the correlations between them should be at least 0.3, which holds in our case (Table 5).
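As a rough check of Bartlett's test, the statistic can be recomputed from the rounded correlation coefficients reported above. The sketch assumes the usual form of the test, chi-square = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, and n = 60 subjects (30 per group); the function and variable names are illustrative and not taken from the study.

```python
import numpy as np

# Correlation matrix of PbvcSc6, Pbvc624, PbvcSc24 (rounded values from the text).
R = np.array([[1.000, 0.394, 0.749],
              [0.394, 1.000, 0.899],
              [0.749, 0.899, 1.000]])
n_subjects = 60  # assumption: 30 NC + 30 AD subjects pooled


def bartlett_sphericity(corr, n):
    """Bartlett's test of sphericity: returns (chi-square statistic, df)."""
    p = corr.shape[0]
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(corr))
    dof = p * (p - 1) // 2
    return statistic, dof


chi_square, dof = bartlett_sphericity(R, n_subjects)
```

With the rounded coefficients the statistic comes out close to, though not exactly at, the reported 292.451, which is expected given the three-decimal rounding of the inputs.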

Step 2: Feature extraction
Parallel analysis
Component  Total Eigenvalues  Random Eigenvalues

1  2.381  1.1624
2  0.615  0.998
3  0.004  0.8396
Total Variance Explained
Component  Initial eigenvalues: Total  % of Variance  Cumulative %  Extraction sums of squared loadings: Total  % of Variance  Cumulative %

1  2.381  79.371  79.371  2.381  79.371  79.371
2  0.615  20.492  99.863
3  0.004  0.137  100.000
According to the three above-mentioned methods, only one feature should be selected for discriminating subjects. As Table 7 shows, it carries 79.371% of the total variance in the data, which does not seem satisfactory. PCA is used here as a data exploration technique, so its interpretation and use are a matter of judgment rather than of hard and fast statistical rules. In this article, components with an eigenvalue of 0.6 or more are retained. This yields two uncorrelated features, which together carry 99.863% of the total variance in the data, a highly satisfactory share.
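The eigenvalues and explained-variance shares can be reproduced approximately from the correlation coefficients reported earlier. The sketch below assumes that PCA was run on the correlation matrix of the three atrophy rates and uses the rounded coefficients, so small deviations from the tabulated values are expected; variable names are illustrative.

```python
import numpy as np

# Correlation matrix of PbvcSc6, Pbvc624, PbvcSc24 (rounded values from the text).
R = np.array([[1.000, 0.394, 0.749],
              [0.394, 1.000, 0.899],
              [0.749, 0.899, 1.000]])

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
eigenvalues = np.linalg.eigvalsh(R)[::-1]
explained = 100.0 * eigenvalues / eigenvalues.sum()   # % of variance per component
cumulative = np.cumsum(explained)

# Retention rule stated in the text: keep components with eigenvalue >= 0.6.
n_keep = int(np.sum(eigenvalues >= 0.6))
```

With these inputs the two leading eigenvalues land close to the reported 2.381 and 0.615, and the 0.6 cut-off keeps two components.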
Component matrix
Features  Extracted feature 1 (PC1)  Extracted feature 2 (PC2) 

PbvcSc24  0.997  0.613 
Pbvc624  0.874  0.485 
PbvcSc6  0.789  0.061 
Within-group correlation matrix
Features  PC1  PC2 

PC1  1  0.099 
PC2   0.099  1 
DA can then be carried out on these two newly extracted uncorrelated features.
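A minimal sketch of two-class linear discriminant analysis on two features is given below. It assumes equal priors and a pooled within-group covariance matrix, uses synthetic data in place of the actual PC1 and PC2 scores, and introduces the hypothetical helpers `fit_lda` and `predict_lda`; it is not the software used in the study.

```python
import numpy as np


def fit_lda(X, y):
    """Two-class Fisher linear discriminant with pooled covariance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-group covariance from the two scatter matrices.
    scatter = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    pooled = scatter / (len(X) - 2)
    w = np.linalg.solve(pooled, m1 - m0)   # discriminant direction
    cutoff = w @ (m0 + m1) / 2.0           # midpoint between projected means
    return w, cutoff


def predict_lda(model, X):
    """Assign class 1 when the discriminant score exceeds the cut-off."""
    w, cutoff = model
    return (np.asarray(X, dtype=float) @ w > cutoff).astype(int)


# Synthetic stand-in for two uncorrelated features, 30 subjects per group.
rng = np.random.default_rng(0)
X_nc = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(30, 2))
X_ad = rng.normal(loc=[3.0, 0.0], scale=1.0, size=(30, 2))
X = np.vstack([X_nc, X_ad])
y = np.array([0] * 30 + [1] * 30)

model = fit_lda(X, y)
training_accuracy = float(np.mean(predict_lda(model, X) == y))
```

The accuracy computed here is on the training subjects only, which corresponds to what the text calls memorization; generalization requires a held-out evaluation such as leave-one-out cross-validation.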
Discriminant function at group centroids
Group  Mean discriminant score

NC  0.89
AD  -0.89
Eigenvalues
Function  Eigenvalue  % of Variance  Cumulative %  Canonical Correlation 

1  .820^{a}  100.0  100.0  .671 
Wilks' Lambda
Test of Function(s)  Wilks' Lambda  Chi-square  df  Sig.

1  0.55  34.124  2  < 0.00001 
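For a single discriminant function, the eigenvalue, the canonical correlation, and Wilks' lambda are tied together by standard identities, so the reported figures can be cross-checked. Writing the eigenvalue as \lambda, the canonical correlation as r_c, and Wilks' lambda as \Lambda (symbols chosen here for illustration):

```latex
\Lambda = \frac{1}{1+\lambda} = \frac{1}{1+0.820} \approx 0.549,
\qquad
r_c = \sqrt{\frac{\lambda}{1+\lambda}} = \sqrt{\frac{0.820}{1.820}} \approx 0.671,
\qquad
\Lambda = 1 - r_c^{2} \approx 1 - 0.671^{2} \approx 0.550 .
```

These values agree with the tabulated eigenvalue (.820), canonical correlation (.671), and Wilks' lambda (0.55) to rounding.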
Structure Matrix
Feature  Function 1

PC1  0.927 
PC2   0.466 
3. Results and discussion
Classification results
Original group  Predicted NC  Predicted AD

NC  93.3%  6.7%
AD  16.7%  83.3%
Cross-validation results
Original group  Predicted NC  Predicted AD

NC  93.3%  6.7%
AD  16.7%  83.3%
Compared with the generalization results of the initially selected features in Table 4, the diagnostic accuracy obtained with the two extracted uncorrelated features (PC1 and PC2) is about 3.33 percentage points higher than with PbvcSc24 alone, as shown in Table 17.
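The size of this improvement follows directly from the two cross-validation tables. The short sketch below makes the arithmetic explicit; the subject counts are an inference from the reported percentages with 30 subjects per group, and the function name is illustrative.

```python
def overall_accuracy(correct_nc, correct_ad, n_nc=30, n_ad=30):
    """Fraction of all subjects classified correctly."""
    return (correct_nc + correct_ad) / (n_nc + n_ad)


# Counts implied by the reported percentages with 30 subjects per group
# (an inference: 90% -> 27, 80% -> 24, 93.3% -> 28, 83.3% -> 25).
acc_pbvc = overall_accuracy(27, 24)   # PbvcSc24 alone, cross-validated
acc_pcs = overall_accuracy(28, 25)    # PC1 and PC2, cross-validated
gain_points = 100.0 * (acc_pcs - acc_pbvc)
```

Under this reading the gain corresponds to two additional subjects out of 60 being classified correctly.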
4. Conclusion
The findings of this study show that, in longitudinal analysis of brain atrophy rate for diagnosing AD, incorporating intermediate MRI scans (between baseline and follow-up) and using the corresponding atrophy rates in uncorrelated form, that is, as principal components, can improve diagnostic accuracy, especially in terms of generalization.
Despite this improvement, linear classifiers cannot discriminate subjects with the highest accuracy expected from the ROC curve. Consequently, nonlinear classifiers such as the kernel support vector machine (SVM) should be used to achieve a higher diagnostic accuracy. This is mainly because of the nonlinear nature of the atrophy rate across subjects.
Appendix
Cross validation
In k-fold cross-validation, the initial data set is randomly partitioned into k non-overlapping subsets or "folds" (D_{1}, D_{2}, ..., D_{k}), each of approximately equal size. Training and testing are performed k times. In iteration i, subset D_{i} is reserved as the test set, and the remaining subsets are collectively used to train the model. Put simply, in the first iteration, subsets D_{2}, ..., D_{k} are used as the training set to obtain a first model, which is tested on D_{1}; the second iteration is trained on subsets D_{1}, D_{3}, ..., D_{k} and tested on D_{2}, and so on. For classification, the accuracy estimate is the overall number of correct classifications from the k iterations, divided by the total number of tuples in the initial data.
Leave-one-out is a special case of k-fold cross-validation where k is set to the number of initial tuples. That is, only one sample is left out at a time for the test set.
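The leave-one-out procedure can be sketched generically as follows. The nearest-mean classifier is only a simple stand-in so that the example is self-contained (it is not the discriminant analysis used in the study), and the data and all function names are hypothetical.

```python
import numpy as np


def nearest_mean_fit(X, y):
    """Store the mean feature vector of each class."""
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}


def nearest_mean_predict(model, x):
    """Return the class whose mean is closest to the sample x."""
    return min(model, key=lambda label: np.linalg.norm(x - model[label]))


def leave_one_out_accuracy(X, y, fit, predict):
    """Train on all samples but one, test on the one left out, and repeat."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    correct = 0
    for i in range(len(X)):
        train = np.arange(len(X)) != i       # mask that excludes sample i
        model = fit(X[train], y[train])
        correct += int(predict(model, X[i]) == y[i])
    return correct / len(X)                  # correct predictions / total tuples


# Hypothetical one-feature data set with two well-separated groups.
X_demo = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y_demo = np.array([0, 0, 0, 1, 1, 1])
loo_accuracy = leave_one_out_accuracy(X_demo, y_demo,
                                      nearest_mean_fit, nearest_mean_predict)
```

Any classifier exposing the same fit and predict pair can be passed in, so the same loop would serve for a discriminant function fitted on the principal components.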
Principal Component Analysis (PCA)
PCA is a way of identifying patterns in data and expressing the data so as to highlight their similarities and differences [38]. Another main advantage of PCA is that, once these patterns have been found, the data can be compressed by reducing the number of dimensions without much loss of information. This technique is used in feature extraction to reduce the dimension of the feature space and make the features more discriminative.
In the PCA transformation, P is the original pattern of features, I is the pattern of uncorrelated features, and A is built from the eigenvectors of the covariance matrix.
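A standard form of the PCA projection is given here as an assumption, since the expression itself is not shown in the text. With C the covariance matrix of the features, a_j its eigenvectors, \lambda_j the corresponding eigenvalues, and \bar{P} the mean pattern (the last three symbols are introduced here for illustration):

```latex
C\,a_j = \lambda_j\,a_j, \qquad
A = \begin{bmatrix} a_1 & a_2 & \cdots & a_m \end{bmatrix}, \qquad
I = A^{\mathsf{T}}\,(P - \bar{P}),
```

where the m retained eigenvectors are those with the largest eigenvalues. The components of I are then mutually uncorrelated, and the variance of the j-th component equals \lambda_j.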
References
1. Suda S, Ueda M, Sakurazawa M, Nishiyama Y, Komaba Y, Katsura KI, et al.: Clinical and neuroradiological progression in diffuse neurofibrillary tangles with calcification. Journal of Clinical Neuroscience. 2009, 16 (8): 11124.
2. Frisoni G, Fox N, Jack C, Scheltens P, Thompson P: The clinical use of structural MRI in Alzheimer disease. Nature Reviews Neurology. 2010, 6 (2): 6777.
3. Ridha B, Anderson V, Barnes J, Boyes R, Price S, Rossor M, et al.: Volumetric MRI and cognitive measures in Alzheimer disease. Journal of Neurology. 2008, 255 (4): 56774. 10.1007/s0041500807509.
4. Fox N, Crum W, Scahill R, Stevens J, Janssen J, Rossor M: Imaging of onset and progression of Alzheimer's disease with voxel-compression mapping of serial magnetic resonance images. The Lancet. 2001, 358 (9277): 2015. 10.1016/S01406736(01)054083.
5. Hua X, Lee S, Yanovsky I, Leow AD, Chou YY, Ho AJ, et al.: Optimizing power to track brain degeneration in Alzheimer's disease and mild cognitive impairment with tensor-based morphometry: An ADNI study of 515 subjects. NeuroImage. 2009, 48 (4): 66881. 10.1016/j.neuroimage.2009.07.011.
6. Wang L, Miller JP, Gado MH, McKeel DW, Rothermich M, Miller MI, et al.: Abnormalities of hippocampal surface structure in very mild dementia of the Alzheimer type. NeuroImage. 2006, 30 (1): 5260. 10.1016/j.neuroimage.2005.09.017.
7. Liu Y, Paajanen T, Zhang Y, Westman E, Wahlund LO, Simmons A, et al.: Combination analysis of neuropsychological tests and structural MRI measures in differentiating AD, MCI and control groups: The AddNeuroMed study. Neurobiology of Aging. 2009, Corrected Proof.
8. Chetelat G, Desgranges B, Landeau B, Mezenge F, Poline J, De la Sayette V, et al.: Direct voxel-based comparison between grey matter hypometabolism and atrophy in Alzheimer's disease. 2007, Brain.
9. Di Paola M, Macaluso E, Carlesimo G, Tomaiuolo F, Worsley K, Fadda L, et al.: Episodic memory impairment in patients with Alzheimer's disease is correlated with entorhinal cortex atrophy. Journal of Neurology. 2007, 254 (6): 77481. 10.1007/s0041500604351.
10. Morra J, Tu Z, Apostolova L, Green A, Avedissian C, Madsen S, et al.: Validation of a fully automated 3D hippocampal segmentation method using subjects with Alzheimer's disease mild cognitive impairment, and elderly controls. NeuroImage. 2008, 43 (1): 5968. 10.1016/j.neuroimage.2008.07.003.
11. Apostolova LG, Mosconi L, Thompson PM, Green AE, Hwang KS, Ramirez A, et al.: Subregional hippocampal atrophy predicts Alzheimer's dementia in the cognitively normal. Neurobiology of Aging. 2010, 31 (7): 107788. 10.1016/j.neurobiolaging.2008.08.008.
12. Plant C, Teipel SJ, Oswald A, Böhm C, Meindl T, Mourao-Miranda J, et al.: Automated detection of brain atrophy patterns based on MRI for the prediction of Alzheimer's disease. NeuroImage. 2010, 50 (1): 16274. 10.1016/j.neuroimage.2009.11.046.
13. Fan Y, Batmanghelich N, Clark CM, Davatzikos C: Spatial patterns of brain atrophy in MCI patients, identified via high-dimensional pattern classification, predict subsequent cognitive decline. NeuroImage. 2008, 39 (4): 173143. 10.1016/j.neuroimage.2007.10.031.
14. Vemuri P, Gunter J, Senjem M, Whitwell J, Kantarci K, Knopman D, et al.: Alzheimer's disease diagnosis in individual subjects using structural MR images: validation studies. NeuroImage. 2008, 39 (3): 118697. 10.1016/j.neuroimage.2007.09.073.
15. Teipel SJ, Born C, Ewers M, Bokde ALW, Reiser MF, Möller HJ, et al.: Multivariate deformation-based analysis of brain atrophy to predict Alzheimer's disease in mild cognitive impairment. NeuroImage. 2007, 38 (1): 1324. 10.1016/j.neuroimage.2007.07.008.
16. Teipel SJ, Ewers M, Wolf S, Jessen F, Kölsch H, Arlt S, et al.: Multicentre variability of MRI-based medial temporal lobe volumetry in Alzheimer's disease. Psychiatry Research: Neuroimaging. 2010, 182 (3): 24450. 10.1016/j.pscychresns.2010.03.003.
17. Sluimer JD, Bouwman FH, Vrenken H, Blankenstein MA, Barkhof F, van der Flier WM, et al.: Whole-brain atrophy rate and CSF biomarker levels in MCI and AD: A longitudinal study. Neurobiology of Aging. 2010, 31 (5): 75864. 10.1016/j.neurobiolaging.2008.06.016.
18. Smith SM, Zhang Y, Jenkinson M, Chen J, Matthews PM, Federico A, et al.: Accurate, Robust, and Automated Longitudinal and Cross-Sectional Brain Change Analysis. NeuroImage. 2002, 17 (1): 47989. 10.1006/nimg.2002.1040.
19. Boundy KL, Barnden LR, Katsifis AG, Rowe CC: Reduced posterior cingulate binding of I123 iododexetimide to muscarinic receptors in mild Alzheimer's disease. Journal of Clinical Neuroscience. 2005, 12 (4): 4215. 10.1016/j.jocn.2004.06.012.
20. Freeborough PA, Woods RP, Fox NC: Accurate Registration of Serial 3D MR Brain Images and Its Application to Visualizing Change in Neurodegenerative Disorders. Journal of Computer Assisted Tomography. 1996, 20 (6): 101222. 10.1097/0000472819961100000030.
21. Fox N, Freeborough P: Brain atrophy progression measured from registered serial MRI: validation and application to Alzheimer's disease. Journal of Magnetic Resonance Imaging. 1997, 7 (6): 106975. 10.1002/jmri.1880070620.
22. Smith S, De Stefano N, Jenkinson M, Matthews P: SIENA - Normalised accurate measurement of longitudinal brain change. NeuroImage. 2000, 11 (5 Supplement 1): S659S.
23. Smith SM, De Stefano N, Jenkinson M, Matthews PM: Normalized Accurate Measurement of Longitudinal Brain Change. Journal of Computer Assisted Tomography. 2001, 25 (3): 46675. 10.1097/0000472820010500000022.
24. Smith SM, Jenkinson M, Woolrich MW, Beckmann CF, Behrens TEJ, Johansen-Berg H, et al.: Advances in functional and structural MR image analysis and implementation as FSL. NeuroImage. 2004, 23 (Supplement 1): S208S19.
25. Jenkinson M, Bannister P, Brady M, Smith S: Improved Optimization for the Robust and Accurate Linear Registration and Motion Correction of Brain Images. NeuroImage. 2002, 17 (2): 82541. 10.1006/nimg.2002.1132.
26. Jenkinson M, Smith S: A global optimisation method for robust affine registration of brain images. Medical Image Analysis. 2001, 5 (2): 14356. 10.1016/S13618415(01)000366.
27. Smith S: Fast robust automated brain extraction. Human Brain Mapping. 2002, 17 (3): 14355. 10.1002/hbm.10062.
28. Zhang Y, Brady M, Smith S: Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Transactions on Medical Imaging. 2001, 20 (1): 4557. 10.1109/42.906424.
29. Zhang Y, Brady M, Smith S: Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. Medical Imaging, IEEE Transactions on. 2002, 20 (1): 4557.
30. Jack CR, Bernstein MA, Fox NC, Thompson P, Alexander G, Harvey D, et al.: The Alzheimer's disease neuroimaging initiative (ADNI): MRI methods. Journal of Magnetic Resonance Imaging. 2008, 27 (4): 68591. 10.1002/jmri.21049.
31. Han J, Kamber M: Data mining: concepts and techniques. 2006, Morgan Kaufmann.
32. Osborne J, Costello A: Sample size and subject to item ratio in principal components analysis. Practical Assessment, Research & Evaluation. 2004, 9 (11): 8.
33. Gleser L: A note on the sphericity test. The Annals of Mathematical Statistics. 1966, 37 (2): 4647. 10.1214/aoms/1177699529.
34. Kaiser H: A second generation little jiffy. Psychometrika. 1970, 35 (4): 40115. 10.1007/BF02291817.
35. Mykola P, editor: PCA-based Feature Transformation for Classification. Issues in Medical Diagnostics. 2004.
36. Guo Q, Wu W, Massart D, Boucon C, De Jong S: Feature selection in principal component analysis of analytical data. Chemometrics and Intelligent Laboratory Systems. 2002, 61 (12): 12332. 10.1016/S01697439(01)002039.
37. Jain A, Zongker D: Feature selection: Evaluation, application, and small sample performance. Pattern Analysis and Machine Intelligence, IEEE Transactions on. 2002, 19 (2): 1538.
38. Jolliffe I: Principal component analysis. 2002, Springer Verlag.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.