Background: The atrophy of medial temporal lobe structures, such as the hippocampus (HC) and entorhinal cortex, is potentially specific to Alzheimer’s disease (AD) and may serve as an early biomarker of the disease. In particular, HC atrophy can be used as a marker of AD progression, since changes in the HC are closely related to changes in the subject’s cognitive performance. HC atrophy is usually estimated by volumetric studies on anatomical MRI, which require a segmentation step that can be very time consuming when done manually. This limitation can be overcome by using automatic segmentation methods. However, the complex relationship between the intensity of the MR signal and the HC boundaries makes accurate automatic segmentation difficult. In the past few years, label fusion methods have demonstrated high performance in anatomical structure segmentation [2-4]. These methods use manual segmentations of anatomical MRIs from a training library to segment a new subject. Recently, we proposed a new nonlocal patch-based label fusion method and demonstrated that this approach can accurately segment the HC in young adults through an efficient fusion of manual segmentations. By taking advantage of the redundancy of information present within the subject’s image, as well as the redundancy across subjects, the patch-based nonlocal means scheme enables the robust use of a large number of samples during estimation. In contrast to classical label fusion, which works at the structure level, our method operates at a finer level by comparing subparts of the structure (i.e., small 3D cubes). In this study, we propose to validate our nonlocal patch-based method on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database by segmenting the HC of Cognitively Normal (CN) subjects and patients with early AD.
Moreover, we evaluate the impact of the training library composition (i.e., only AD, only CN or a mix of AD and CN) on the segmentation accuracy of our algorithm.
Methods: In this study, we used images obtained from the ADNI database (www.loni.ucla.edu/ADNI). This database contains both 1.5T and 3.0T T1-w MRI scans. For our experiments, we randomly selected ten 1.5T baseline MRI scans of CN subjects and ten 1.5T baseline MRI scans of patients with AD. After linear registration to stereotaxic space, using the ICBM-152 template as the target image, the left and right hippocampi of each selected image were manually segmented by an expert at our centre. The first experiment was designed to evaluate the accuracy of our segmentation method on the ADNI dataset with a group consisting of CN subjects and patients with AD. The accuracy of the label fusion method varies according to the number of templates. In this first experiment, we used a training library of 16 templates (8 CN subjects and 8 patients with AD). The second experiment was designed to investigate the impact of the composition of the training library. Three different training populations were built by combining the selected 20 MRI scans. The CN population was composed of 8 normal subjects, the AD population was composed of 8 patients and the mixed AD / CN population was composed of 4 CN subjects and 4 AD patients. In this way, the number of templates was the same for all the training libraries compared. Through a leave-one-out procedure, our automatic segmentation method was applied to each of the 20 MRI scans. During the experiments, the default parameters were used for all the images (i.e., a patch size of 7x7x7 voxels and a search area of 9x9x9 voxels). The quality of the automatic segmentations was evaluated with the Dice Kappa similarity index. This index measures the overlap between the manual segmentation and the segmentation produced by our patch-based method.
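As an illustration of the overlap measure just described, the following is a minimal sketch of the Dice index on binary masks, assuming the usual definition k = 2|A ∩ B| / (|A| + |B|). The function names `dice_kappa` and `shifted_cube_pair` and the toy volumes are hypothetical and are not taken from the study.

```python
import numpy as np


def dice_kappa(manual, automatic):
    """Dice overlap between two binary masks of identical shape."""
    manual = np.asarray(manual, dtype=bool)
    automatic = np.asarray(automatic, dtype=bool)
    total = manual.sum() + automatic.sum()
    if total == 0:
        # Both masks are empty: treat them as identical.
        return 1.0
    overlap = np.logical_and(manual, automatic).sum()
    return 2.0 * overlap / total


def shifted_cube_pair(side, size=16):
    """Toy data: two cubes of the same size, offset by one voxel along one axis."""
    a = np.zeros((size, size, size), dtype=bool)
    b = np.zeros((size, size, size), dtype=bool)
    a[2:2 + side, 2:2 + side, 2:2 + side] = True
    b[3:3 + side, 2:2 + side, 2:2 + side] = True
    return a, b


large = dice_kappa(*shifted_cube_pair(8))  # 8x8x8 cubes, one-voxel shift
small = dice_kappa(*shifted_cube_pair(4))  # 4x4x4 cubes, same shift
print(large, small)
```

In this toy case the same one-voxel shift yields 0.875 for the larger cube and 0.75 for the smaller one, so a boundary error of fixed size costs proportionally more overlap on a small structure.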
Finally, a pair-wise multi-comparison test was used to detect significant differences in segmentation quality according to the training library for both populations (CN and AD).
Results: For the first experiment, the median Dice Kappa values are presented in Table 1. The segmentation accuracy was significantly better (p-value=0.002) for CN subjects (median k=0.883 for both HC) than for patients with AD (median k=0.838 for both HC). A median Dice Kappa value above 0.8 indicates a high correlation between manual and automatic segmentations. A median Kappa value above 0.88 is similar to the highest values published in the literature [2-4]. The difference in segmentation quality between populations might come from two sources. First, the higher anatomical variability of patients with AD makes the segmentation more difficult and may require a larger training library. Second, the smaller HC volumes of patients with AD, due to HC atrophy, can negatively bias the Dice Kappa index. For the second experiment, Figure 1 and Table 2 show the Dice Kappa similarity index for both studied populations according to the training library composition. As expected, with smaller training libraries, the segmentation accuracy slightly decreased compared to experiment 1. For both populations, the best median Kappa values were obtained with the mixed CN / AD training library (k=0.875 for the CN population and k=0.835 for the AD population). Moreover, for both populations, the results obtained with a training library matching the segmented population were statistically similar to those obtained with a mixed training library. When the entire training library is composed of templates with a dissimilar property (e.g., segmenting AD subjects with CN templates), the results are significantly degraded, especially in the case of the CN population (see left of Fig. 1).
These results indicate that a mixed training library, containing a large number of patch variants, is better suited for segmentation than a training library that is not specific to the segmented population. As in the first experiment, segmentation accuracy was significantly lower (p-value=0.004) for patients with AD than for CN subjects.
Conclusions: The goals of this study were to validate our patch-based label fusion on ADNI data and to investigate the influence of the composition of the training library. First, we demonstrated that our patch-based method provides high segmentation accuracy for both CN and AD populations despite the small number of manual segmentation templates involved. In this experiment, the size and the anatomical variability of the HC in the AD population tended to decrease the segmentation accuracy compared to the CN population. Second, we showed that the composition of the training library has a significant impact on the segmentation accuracy. Better results are obtained with a population-specific training library or a mixed training library. In conclusion, when the subject’s status is unknown, the results of our experiments suggest that using a mixed training library, covering the different scenarios, is the best strategy to adopt.
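The patch-based label fusion summarized in the Background can be illustrated for a single voxel with the sketch below. It is not the authors' implementation: the exponential weighting of squared patch distances is a common nonlocal means choice assumed here, the smoothing parameter `h` is a hypothetical fixed value, and the names `extract_patch` and `fuse_label_at` are invented for illustration. Only the default patch size (7x7x7, radius 3) and search area (9x9x9, radius 4) come from the text.

```python
import numpy as np


def extract_patch(image, center, radius):
    """Return the cubic patch of side 2*radius+1 centred on `center`."""
    z, y, x = center
    return image[z - radius:z + radius + 1,
                 y - radius:y + radius + 1,
                 x - radius:x + radius + 1]


def fuse_label_at(subject, templates, center,
                  patch_radius=3, search_radius=4, h=1.0):
    """Weighted vote over template patches found in the search area.

    templates: list of (image, labels) arrays with the same shape as
    `subject`, assumed to be already registered to a common space.
    The weight exp(-distance / h) is an assumed nonlocal means form.
    """
    target = extract_patch(subject, center, patch_radius)
    z, y, x = center
    offsets = range(-search_radius, search_radius + 1)
    weighted_votes = 0.0
    weight_sum = 0.0
    for image, labels in templates:
        for dz in offsets:
            for dy in offsets:
                for dx in offsets:
                    candidate = (z + dz, y + dy, x + dx)
                    patch = extract_patch(image, candidate, patch_radius)
                    distance = np.sum((target - patch) ** 2)
                    weight = np.exp(-distance / h)
                    weighted_votes += weight * labels[candidate]
                    weight_sum += weight
    if weight_sum == 0.0:
        # All weights underflowed: no similar patch was found.
        return False, 0.0
    score = float(weighted_votes / weight_sum)
    return score >= 0.5, score


# Toy data (hypothetical): one template identical to the subject and
# labelled as structure around the voxel, one unrelated template
# labelled entirely as background.
rng = np.random.default_rng(0)
subject = rng.normal(size=(20, 20, 20))
structure = np.zeros((20, 20, 20))
structure[8:13, 8:13, 8:13] = 1.0
background = np.zeros((20, 20, 20))
templates = [(subject.copy(), structure),
             (rng.normal(size=(20, 20, 20)), background)]

is_hc, score = fuse_label_at(subject, templates, center=(10, 10, 10))
print(is_hc, score)
```

Because each template patch votes with a weight that depends on its similarity to the subject's patch, dissimilar templates contribute very little to the final label, which is one way to read why a library covering more patch variants can help.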
Alzheimer; hippocampus; patch; label fusion; ADNI
Alzheimer's Association International Conference, 2011