Spatial based Expectation Maximizing (EM)
© Balafar; licensee BioMed Central Ltd. 2011
Received: 10 June 2011
Accepted: 26 October 2011
Published: 26 October 2011
Expectation maximization (EM) is one of the common approaches to image segmentation.
An improvement of the EM algorithm is proposed and its effectiveness for MRI brain image segmentation is investigated. In order to improve EM performance, the proposed algorithm incorporates neighbourhood information into the clustering process. First, an average image is obtained as neighbourhood information, and then it is incorporated into the clustering process. As an option, user interaction is also used to improve the segmentation results. Simulated and real MR volumes are used to compare the efficiency of the proposed improvement with the existing neighbourhood-based extensions of EM and FCM.
The findings show that the proposed algorithm produces a higher similarity index.
Experiments demonstrate the effectiveness of the proposed algorithm in comparison with other existing algorithms at various noise levels.
The application of image processing techniques in medical imaging is increasing rapidly. Most medical images are stored and represented in softcopy. Ultrasound, X-ray computed tomography, digital mammography and magnetic resonance imaging (MRI) are the most common medical imaging types. MRI can give different grey levels for different tissues and various types of neuropathology if its acquisition parameters are adjusted.
Data acquisition, processing and visualization techniques facilitate diagnosis. Medical image segmentation plays a very important role in many computer-aided diagnostic tools. These tools could save clinicians' time by simplifying complex, time-consuming processes. The main part of these tools is an efficient segmentation algorithm. Medical images mostly contain unknown noise, in-homogeneity and complicated structures. Therefore, segmentation of medical images is a challenging and complex task. Medical image segmentation has been an active research area for a long time. There are many segmentation algorithms, but there is no generic algorithm that segments all medical images successfully.
Clustering methods are common for MRI brain segmentation. Expectation-maximization (EM) and fuzzy c-means (FCM) are the most popular clustering algorithms. The Gaussian mixture model (GMM) is a popular segmentation model, and EM is used to estimate its parameters. FCM and EM consider only the intensity of the image, and in noisy images intensity is not trustworthy [8–10]. Many algorithms have been introduced to make FCM [11–17] and EM robust against noise, but most of them remain flawed to some extent. Usually, spatially adjacent pixels belong to the same cluster, so many researchers have attempted to incorporate spatial information into FCM and EM to overcome the noise problem. Zhang et al. proposed a novel Gaussian hidden Markov random field (HMRF) model to integrate spatial information into the Gaussian model. They used a Markov random field maximum a posteriori (MRF-MAP) approach to estimate the model solution. Recently, Tang et al. proposed a neighbourhood-weighted Gaussian mixture model to overcome misclassification on the boundaries and in inhomogeneous regions of noisy MRI brain images. A. R. F. d. Silva proposed two Bayesian algorithms (DPM, rjMCMC) which use Markov chain sampling techniques to find normal mixture models with an unknown number of components. These algorithms were used for MRI segmentation, and their performance was compared with published results for two existing Bayesian MRI brain segmentation methods (KVL, MPM-MAP).
González Ballester et al. and Tohka et al. reported statistical models, namely a novel trimmed minimum covariance determinant (TMCD) method, for estimating the parameters of partial volume models in order to address partial volume averaging.
In order to make the Gaussian mixture model more robust against complex tissue spatial layout, Greenspan et al. proposed the parameter-tied, constrained Gaussian mixture model (CGMM). The mixture model, composed of a large number of Gaussians for each tissue, is used to capture the complex tissue spatial layout. The Gaussian parameters of a tissue are tied using intensity as a global feature. The parameters are learned using the expectation-maximization (EM) algorithm.
A nonparametric Bayesian model, known as the Dirichlet process mixture model (DPMM), has been proposed to overcome the limitations of current parametric finite mixture models. The DPMM permits an unknown number of components in the mixture and allows robust segmentation of the brain with unknown or incomplete specifications.
A local cooperative unified segmentation (LOCUS) approach, based on distributed local MRF models for brain segmentation, has also been presented. The volume is partitioned into sub-volumes and a set of local, cooperative Markov random field (MRF) models is distributed over them. In order to ensure consistency, neighbouring local MRFs are estimated cooperatively. Intensity in-homogeneity correction is not required because the local estimation fits the local intensity distribution precisely.
In this paper, a new modification to GMM and EM is introduced by incorporating neighbourhood information into the likelihood function and the EM steps. The average of the neighbour pixels around each pixel is calculated prior to GMM clustering and is incorporated into the GMM and EM functions alongside the pixel value.
The rest of this paper is organized as follows. The standard GMM model and EM segmentation algorithm are presented in Section 2.1. In Section 2.2, the proposed modified EM algorithm is described. The improvement of segmentation results using user interaction is presented in Section 2.3. Experimental and comparison results are presented in Section 3, and the paper is concluded in Section 4.
A modification to GMM is introduced by incorporating neighbourhood information into likelihood function and EM steps.
Finding the maximum likelihood (ML) solution from this equation is difficult. Usually, the expectation-maximization (EM) algorithm is used to obtain the parameters. The EM steps are as follows:
c. EM steps are repeated until convergence.
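As a reference point for the standard algorithm, the following is a minimal sketch of textbook EM for a one-dimensional Gaussian mixture fitted to pixel intensities. It is not taken from the paper: the function name `em_gmm_1d`, the quantile-based initialisation, the iteration cap and the tolerance are all assumptions chosen for illustration.

```python
import numpy as np

def em_gmm_1d(x, k, n_iter=100, tol=1e-6):
    """Fit a k-component 1-D Gaussian mixture to intensities x with standard EM."""
    x = np.asarray(x, dtype=float).ravel()
    # Assumed initialisation: means at evenly spaced quantiles, shared variance, equal weights.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var() + 1e-12)
    weights = np.full(k, 1.0 / k)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each pixel.
        diff = x[:, None] - means[None, :]
        dens = weights * np.exp(-0.5 * diff ** 2 / variances) / np.sqrt(2 * np.pi * variances)
        total = dens.sum(axis=1, keepdims=True) + 1e-300
        resp = dens / total
        # M-step: re-estimate weights, means and variances from the posteriors.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-12
        # Repeat until the log-likelihood stops improving.
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    # Hard labels: the most probable component for each pixel.
    return weights, means, variances, resp.argmax(axis=1)
```

Each pixel is finally assigned to the component with the highest posterior probability, which is how a mixture fit is usually turned into a segmentation.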
The parameter β determines the weight of the neighbourhood information. Incorporating neighbourhood information improves the performance of segmentation methods at high noise levels, but its blurring effect degrades their performance at low noise levels. In order to overcome this degradation at low noise levels, the variance of the noise is used to specify the weight of the neighbourhood information (β). Its value is set to σ, where σ is the variance of the noise. In previous neighbourhood-based EM extensions, the neighbourhood information is calculated in every clustering iteration; in this algorithm it is computed once, before the iterations, so the clustering is faster. An extension of EM named EM-1 is introduced to solve the likelihood function. The EM is modified as follows:
In other words, background pixels are considered as observations (O) and the variance of the noise is obtained by applying equation 13 to the background pixel values. For that, the squares of the background pixel values are computed, and half of the average of the resulting values is taken as the variance of the noise.
Also, in-homogeneity correction is applied to input images affected by in-homogeneity, and the proposed GMM is applied to the in-homogeneity-corrected image.
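The neighbourhood information is computed once, before clustering, which can be sketched as a plain mean filter. This is only an illustration under stated assumptions: the function name `neighbourhood_average`, the 3 × 3 window (`radius=1`), the inclusion of the centre pixel and the edge-replication border handling are not specified in the text.

```python
import numpy as np

def neighbourhood_average(image, radius=1):
    """Mean over each pixel's (2*radius+1)^2 window; borders use edge replication."""
    img = np.asarray(image, dtype=float)
    size = 2 * radius + 1
    # Assumption: pad by repeating edge pixels so border windows stay full-sized.
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img)
    # Sum the shifted copies of the image that make up the window.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)
```

Because the average image does not depend on the current cluster parameters, it can be passed unchanged to every EM iteration.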
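The estimate described above (half of the mean of the squared background values) can be written in a few lines. The function name `noise_variance_from_background` is hypothetical. The same expression is the maximum-likelihood estimate of σ² for Rayleigh-distributed samples, which is the usual model for background noise in magnitude MR images; whether equation 13 is derived in exactly this way is an assumption.

```python
import numpy as np

def noise_variance_from_background(background_values):
    """Noise variance estimate: half of the mean of the squared background values."""
    bg = np.asarray(background_values, dtype=float).ravel()
    return 0.5 * np.mean(bg ** 2)
```

For example, background values of 2 and 2 give an estimate of 0.5 × (4 + 4) / 2 = 2.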
Sometimes, due to in-homogeneity, low contrast, noise and a mismatch between image content and its semantics, automatic methods fail to segment an image correctly. For such images, user interaction is necessary to correct the errors of the method. However, robust semi-automatic methods can be developed in which user interaction is minimized.
This process continues until the user is satisfied, which means that the quality of the segmentation depends on the user. Then, to solve the problem of several clusters representing one tissue, the user selects the clusters for each tissue (cluster 12 is also selected for grey matter). The steps of this method are as follows:
1. The input volume is clustered into n clusters, where n is the number of target classes (tissues). The output is a clustered volume.
2. Under-segmentation: if some clusters contain more than one target class (tissue), the user selects those clusters to be partitioned further; each user-selected cluster is re-clustered into two sub-clusters. This process continues until the user is satisfied. The output is a clustered volume without under-segmentation.
3. Over-segmentation: if several clusters correspond to one target class (tissue), the user selects the clusters for each target class. The output is the final clustered volume.
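Steps 2 and 3 can be sketched as two small helpers. This is an illustration only: the names `split_cluster` and `merge_clusters` are hypothetical, the user's choices are passed in as plain arguments, and a simple one-dimensional two-means split stands in for whatever clustering algorithm is actually used to re-cluster a selected cluster.

```python
import numpy as np

def split_cluster(labels, intensities, cluster_id, n_iter=20):
    """Step 2: re-cluster one user-selected cluster into two sub-clusters."""
    labels = np.array(labels)
    values = np.asarray(intensities, dtype=float)
    mask = labels == cluster_id
    v = values[mask]
    # Stand-in clustering (assumption): 1-D two-means started at the extremes.
    centres = np.array([v.min(), v.max()], dtype=float)
    for _ in range(n_iter):
        upper = np.abs(v - centres[1]) < np.abs(v - centres[0])
        if upper.all() or not upper.any():
            break
        centres = np.array([v[~upper].mean(), v[upper].mean()])
    upper = np.abs(v - centres[1]) < np.abs(v - centres[0])
    # The brighter sub-cluster gets a new id; the darker one keeps cluster_id.
    new_id = labels.max() + 1
    sub = labels[mask]
    sub[upper] = new_id
    labels[mask] = sub
    return labels

def merge_clusters(labels, cluster_to_tissue):
    """Step 3: map each cluster id to the tissue id chosen by the user."""
    labels = np.asarray(labels)
    merged = labels.copy()
    for cluster_id, tissue_id in cluster_to_tissue.items():
        merged[labels == cluster_id] = tissue_id
    return merged
```

In use, `split_cluster` would be called repeatedly on the clusters the user marks as under-segmented, and `merge_clusters` once at the end with the user's cluster-to-tissue assignment.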
The proposed extension of EM (EM-1) and the existing neighbourhood-based extension of EM (referred to as NWEM in this paper for clarity) are implemented and tested on the simulated volumes from BrainWeb and on real volumes from the Internet Brain Segmentation Repository (IBSR).
Moreover, reported results on simulated volumes for existing extensions of EM (DPM, rjMCMC, KVL, MPM-MAP) and for existing neighbourhood-based extensions of FCM (FCM_S, FCM_EN, FGFCM, FLICM and NonlocalFCM) are used to evaluate the proposed algorithm.
Also, reported results on real volumes from IBSR are used to evaluate the proposed algorithm. Furthermore, the FCM extensions mentioned above are implemented and tested on the real volumes.
where X_i represents class i in the ground truth and Y_i represents the same class in the segmentation result. Each index for the full segmentation result is the average of that index over all classes.
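The per-class overlap measures and their average over classes can be sketched as follows. The formulas used here are the commonly used definitions of the Dice similarity index, the Jaccard index, the ratio of false positives (rfp) and the ratio of false negatives (rfn); treating them as the paper's exact definitions is an assumption, and the function names are hypothetical.

```python
import numpy as np

def overlap_measures(truth, result, class_id):
    """Overlap measures for one class; X is the ground truth set, Y the result set."""
    x = np.asarray(truth) == class_id
    y = np.asarray(result) == class_id
    inter = np.logical_and(x, y).sum()
    nx, ny = x.sum(), y.sum()
    # Note: a class that is absent from the ground truth makes nx zero.
    return {
        "dice": 2.0 * inter / (nx + ny),       # 2|X ∩ Y| / (|X| + |Y|)
        "jaccard": inter / (nx + ny - inter),  # |X ∩ Y| / |X ∪ Y|
        "rfp": (ny - inter) / nx,              # (|Y| - |X ∩ Y|) / |X|
        "rfn": (nx - inter) / nx,              # (|X| - |X ∩ Y|) / |X|
    }

def mean_measure(truth, result, class_ids, name="dice"):
    """Average one measure over the listed classes."""
    return float(np.mean([overlap_measures(truth, result, c)[name] for c in class_ids]))
```

Background is excluded from an evaluation simply by leaving its label out of `class_ids`.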
The simulated MRI volumes are obtained from BrainWeb. A simulated data volume with T1-weighted sequence, slice thickness of 1 mm and a volume size of 217 × 181 × 181 is used. Non-brain tissues are removed prior to segmentation.
The number of tissue classes in the segmentation is set to three: grey matter (GM), white matter (WM) and cerebrospinal fluid (CSF). All pixels in the image take part in the segmentation process, but in the evaluation process background pixels are ignored, following the previous works used for comparison in this paper. In the public databases used in this paper, and in brain MRI volumes generally, background pixels are black. Therefore, the cluster with the lowest average grey value is considered to be the background.
The segmentation results for white matter (WM), grey matter (GM) and cerebrospinal fluid (CSF) are depicted in the corresponding figure: (a) is the noisy image, (b) is the ground truth, and (c) and (d) are the segmentation results of NWEM and EM-1, respectively.
From this qualitative comparison, it is easy to see that NWEM is more influenced by noise than EM-1, in which fewer artefacts are evident, resulting in a clearer segmentation result.
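Picking the background cluster as the one with the lowest average grey value is a one-step computation. The function name `background_cluster` is hypothetical, and `labels` and `intensities` are assumed to be arrays of the same shape.

```python
import numpy as np

def background_cluster(labels, intensities):
    """Return the id of the cluster with the lowest mean intensity."""
    labels = np.asarray(labels)
    values = np.asarray(intensities, dtype=float)
    ids = np.unique(labels)
    means = np.array([values[labels == c].mean() for c in ids])
    return ids[means.argmin()]
```

The pixels carrying this label would then be left out when the similarity indexes are computed.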
Figure 4 shows that EM-1 produces higher similarity indexes and lower rfp and rfn, meaning that this algorithm produces more accurate segmentation results. The similarity index of EM-1 decreases more slowly than that of the NWEM algorithm as the noise level increases. At the same time, the rfp and rfn of EM-1 increase more slowly than those of the NWEM algorithm.
Both algorithms give similar results below the 5% noise level. Above 5%, however, EM-1 exhibits much better results than the NWEM algorithm. Incorporating the average of the neighbourhood information into the clustering process of NWEM makes that algorithm robust against noise but has blurring as a side effect. It seems that when the noise level rises above 5%, this incorporation cannot overcome the high level of noise.
The parameter-tied, constrained Gaussian mixture model (CGMM) has been applied to an image volume from BrainWeb with different noise levels. The reported average similarity indexes at noise levels of 3%, 5%, 7% and 9% are 0.93, 0.93, 0.92 and 0.895 for CGMM and 0.925, 0.915, 0.895 and 0.865 for KVL. The proposed segmentation algorithm outperforms both KVL and CGMM.
It can be seen clearly that the proposed algorithm performs better than the FCM extension methods and produces more accurate segmentation results. The FCM extensions also incorporate neighbourhood information into the FCM clustering process, but it seems that incorporating neighbourhood information improves EM more than it improves FCM.
A novel trimmed minimum covariance determinant (TMCD) method, an extension of the Gaussian mixture model, has been applied to the 20 normal image volumes from IBSR. The reported average Jaccard value was 0.6722. The average Jaccard value for EM-1 is 0.695. The similarity index for EM-1 is higher than the reported result, meaning that EM-1 produces more accurate segmentation results.
The parameter-tied, constrained Gaussian mixture model (CGMM) has been applied to 18 of the 20 normal image volumes on the IBSR website (all except volumes 4-8 and 202-3). The CGMM results were compared with the results reported on the IBSR website, as well as with the Marroquin algorithm, which is an atlas-based Bayesian segmentation algorithm. The CGMM algorithm outperformed the other studied methods, with a Jaccard similarity index of 0.67. The average Jaccard value for EM-1 is 0.6971. EM-1 outperforms the best reported result, which is that of CGMM.
A nonparametric Bayesian model, known as the Dirichlet process mixture model (DPMM), has been applied to 13 volumes (1_24, 2_4, 5_8, 6_10, 7_8, 11_3, 12_3, 13_3, 15_3, 16_3, 100_23, 110_3, 112_2) of the 20 normal T1-weighted brain image volumes from IBSR. The similarity index for DPMM was higher than that of the competing methods, with a Dice similarity index of 0.7071. The proposed algorithm is applied to the same volumes. The average Dice value for EM-1 is 0.8219. The similarity index of the proposed method is higher than the best reported result, which is that of DPMM, meaning that the proposed method produces the most accurate segmentation among the compared methods.
The local cooperative unified segmentation (LOCUS) approach, which is based on distributed local MRF models for brain segmentation, has been applied to the 20 normal T1-weighted brain image volumes from IBSR. LOCUS-T was compared with published results for SPM5 and FAST. The Dice similarity indexes for the different methods are: LOCUS-T = 0.765, SPM5 = 0.81, FAST = 0.765. The average Dice value for EM-1 is 0.8211. EM-1 outperforms the best reported result, which is that of SPM5.
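Because some of the comparisons above are reported as Jaccard values and others as Dice values, it may help to recall how the two indexes are related for a single pair of sets. The relation below is a standard identity; the worked number is only a rough consistency check, since an average over classes and volumes does not convert exactly.

```latex
D = \frac{2\,|X \cap Y|}{|X| + |Y|}, \qquad
J = \frac{|X \cap Y|}{|X \cup Y|}, \qquad
J = \frac{D}{2 - D}, \qquad
D = \frac{2J}{1 + J}.

% Worked example with the average Dice value reported for EM-1 on the 20 IBSR volumes:
D = 0.8211 \;\Rightarrow\; J \approx \frac{0.8211}{2 - 0.8211} = \frac{0.8211}{1.1789} \approx 0.6965 .
```

The converted value is close to the average Jaccard values of 0.695 and 0.6971 reported for EM-1 above.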
[Table: the similarity index of the different algorithms when applied to the 20 real volumes, including EM-1 with user interaction.]
In this paper, an extension of EM has been introduced. In order to overcome the problem of standard EM in the presence of noise, the introduced algorithm is formulated by modifying the equations of the standard EM algorithm so that the neighbourhood pixels are incorporated into the labelling of a pixel. The introduced algorithm is tested on simulated MRI volumes with different noise levels and on real volumes. The performance of the existing neighbourhood-based EM and FCM algorithms and of the proposed algorithm is compared qualitatively and quantitatively.
The similarity index, ρ, is used to evaluate the different algorithms. Experiments demonstrate the effectiveness of the proposed algorithm in comparison with other existing algorithms at various noise levels in terms of the similarity index, ρ.
In future, we plan to study other kinds of segmentation methods in order to improve their performance. We will also analyse the effects of different clustering methods on the segmentation of medical images for the diagnosis of abnormalities and other important findings in medical images.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.