Quality evaluation of virtual slides using methods based on comparing common image areas
© Walkowski and Szymas; licensee BioMed Central Ltd. 2011
Published: 30 March 2011
There are many scanners of glass slides on the market now. Quality of digital images produced by them may be different and pathologists who examine virtual slides on a monitor may subjectively evaluate it. However, objective comparison of quality of digital slides captured by various devices requires assessment algorithms, which will be automatically executed.
In this work such an algorithm is proposed and implemented. It is dedicated for comparing quality of virtual slides which show the same glass slide captured by two or more scanners. In the first step this method looks for the largest corresponding areas in the slides. This task is realized by defining boundaries of tissues and providing the relative scale factor. Then, a certain number of smaller areas, which show the same fragments of both slides, is selected. The chosen fragments are analyzed using Gray Level Co-occurrence Matrix (GLCM). For GLCM matrices some of the Haralick features are calculated, like contrast or entropy. Basing on results for some sample images, features appropriate for quality assessment are chosen. Aggregation of values from all selected fragments allows to compare the quality of images captured by tested devices.
Described method was tested on two sets of ten virtual slides, acquired by scanning the same set of ten glass slides by two different devices. First set was scanned and digitized using the robotic microscope Axioscope2 (Zeiss) equipped with AxioCam Hrc CCD camera. Second set was scanned by DeskScan (Zeiss) with standard equipment. Before analyzing captured virtual slides, images were stitched and converted using software which utilizes advances in aerial and satellite imaging.
The results of the experiment show that calculated quality factors are higher for virtual slides acquired using first mentioned device (Axioscope2 with AxioCam).
Results of the tests are consistent with opinion of the pathologists who assessed quality of virtual slides captured by these devices. This shows that the method has potential in automatic evaluation of virtual slides’ quality.
Virtual slides technology enables pathologists to store slides in computer memory and watch them on a display. However, experts who use virtual slides in their work know that their quality may be different. One of the main factors affecting this quality is a type of device used to capture and digitize slides. Each pathologist may have his or her own opinion about the quality of virtual slides coming from a particular device. The problem is that such assessment is subjective and opinions given by various experts may be different.
In this approach we try to find a method of automatic evaluation of virtual slides’ quality, where human factor in the assessment is minimal. Such a method compares quality factors calculated for virtual slides captured by different devices, but showing the same real slide. Running this method for many slides and a few scanners will allow us to say which device captures virtual slides of better quality.
The main goal of this work is designing, implementing and testing a method which compares quality factors, automatically calculated for corresponding areas of virtual slides.
When designing our method for comparing virtual slides’ quality, several criteria were followed. First of them is relatively fast execution. Size of virtual slides in computer memory is usually big, so the method does not process the whole slide, but rather some of its areas. To achieve it, the algorithm follows another criterion: ability to deal with big digital images, often stored in hundreds of single files. The method can merge them and crop according to requested areas. The third criterion is accuracy, which means producing reliable results. This is done by finding and selecting corresponding areas, which are then evaluated. This approach allows to visually examine quality of selected corresponding areas, which is helpful when assessing method’s results.
The method is divided into the following main steps:
1. Loading virtual slides.
2. Selecting corresponding areas which quality will be compared.
3. Choosing and displaying corresponding fragments of the selected areas.
4. Calculating and aggregating quality measures for the chosen fragments.
In the first step, when loading virtual slides, we assume that they are stored in sets of graphical files of a specified structure. For the test purpose two formats were supported, dedicated for virtual slides used in the experiment. There are several differences between them, like default file names or presence of metadata XML file. Moreover, one slide may be rotated in reference to the second one. All these differences are taken into account in slide loaders, so the format and orientation of all slides in the application are common. The effect of initial virtual slides loading are thumbnails, which are in fact overviews of the whole slides. They are presented to the user in the application. Once created, thumbnail is saved to the file to increase the speed of loading the same virtual slide for the next time.
Area selected by the user may contain even the whole virtual slide, so processing it thoroughly could take a long time. Therefore, only some fragments of the slides are selected for the further analysis and comparison. This step can be justified by the fact that textures on a single slide are relatively monotonous and comparing quality of its fragments gives a good overview of the whole slide’s quality.
Choosing slide’s fragments is done automatically by dividing area selected by user into a grid and selecting cells with random horizontal and vertical indices. Always a certain number of cells, defined by user, is chosen (in tests 10 fragments were selected from the total number of 400 cell). Each selected cell can be described by coordinates relative to the whole selected area. These coordinates are used in each slide to cut appropriate areas from them. Resolutions of the tested virtual slides may be different, so the chosen fragments are resized to the resolution of the smallest slide. If rectangles drawn by user initially on the slides cover the same areas of the real glass slide, captured by different scanners, acquired fragments also contain the same areas. Therefore, they are taken to the further analysis.
GLCM aggregates the numbers of pairs of neighbor pixel values, which occur in the image. Pair of pixels (x, y) increments the cell with row index x and column index y by 1 (Figure 3b). To preserve symmetrical property of the matrix, also cell with row index y and column index x is incremented by 1. Finally, values in GLCM are divided by the sum of values in this matrix to convert the numbers to probabilities of co-occurrence of all possible pixel values.
GLCM operates on a single color channel, so each evaluated image has been converted to grayscale. This step reduces information about colors, but the main factors which decide about quality of an image, like contrast, are preserved. The number of rows and columns in the GLCM equals to the number of possible values in the grayscale. This may increase the computational complexity radically, because for 8-bit grayscale the matrix has 65536 cells. Moreover, counting every little change of pixel values in a line is probably unnecessary when assessing quality, because contrast, which is the main quality factor, depends on sharp edges which are formed by rapid pixel changes. For these reasons the grayscale depth of the chosen areas has been reduced to 5 bits.
Contrast is calculated as the sum of probabilities multiplied by the squared difference of cell indices. This formula assigns smaller weights to probabilities of co-occurrence of pixels which values do not differ significantly (especially 0 for pairs of pixels with equal values) and higher weights to probabilities of rapid changes of the pixel values in a line of the image. If we consider contrast as the sharpness of edges in an image, the contrast equation promotes images which contain many sharp edges.
The second selected measure is entropy. The formula sums up the probabilities multiplied by their natural logarithms, with reversed sign . The idea of entropy comes from thermodynamics, but it is generally considered as a measure of the system’s disorganization. It is expected that entropy calculated for more complex images will be higher. When corresponding areas of virtual slides are compared, higher complexity means probably more details in an image, which results in higher quality.
Finally, because sizes of chosen fragments are equal, the result for the whole slide is calculated as the arithmetic mean of aggregated measures for single fragments (Figure 6c). The final value can be compared with analogous values calculated for other virtual slides, which were used in the test for a single glass slide.
The described method of quality evaluation was tested on two sets of 10 slides scanned using two different devices. One set of slides was scanned and the images were captured and digitized by using a robotic microscope Axioscope2 (Zeiss) equipped with AxioCam Hrc CCD camera (called Axioplan slides in this paper). Second set of the same glass slides was made with the use of commercially available DeskScan (Zeiss) with standard equipment (called Mirax slides). For stitching and conversion we used software based on and utilizing advances in aerial and satellite imaging.
In each test two virtual slides, containing the image of the same glass slide, were loaded. Then, we selected corresponding areas, which quality was examined in further steps. After the quality evaluation process was finished, the results were presented on the display (Figure 2). Firstly, we ensured that corresponding fragments contain the same areas of the whole slides. Then, the results were analyzed and summarized.
Contrast values for tested virtual slides
Entropy values for tested virtual slides
We can see that for all slides values of contrast for Axioplan slides are higher than corresponding values for Mirax slides. In one or two cases the difference is relatively low, but in most cases it is significant. Comparison of entropy for the same set of slides is similar – in all 8 tests entropy of Axioplan slides is higher than entropy of Mirax slides. This means that the quality factors returned by the method described in this paper are higher for Axioplan slides than Mirax slides. Therefore, in this test the set of Axioplan slides is selected by this method as the set of virtual slides of better quality.
These results are consistent with opinions of the pathologists, who claim that quality of virtual slides captured by robotic microscope Axioscope2 equipped with AxioCam camera is better than quality of corresponding virtual slides scanned by DeskScan with standard equipment. The difference is said to be bigger on higher magnification levels, which were examined in these tests. Common fragments, selected for analysis by this method, can be also compared visually, even by non-specialists. Conclusions from such a visual comparison are similar – quality of Axioplan virtual slides is generally better than quality of corresponding Mirax virtual slides.
In further work, this method could be tested on bigger sets of virtual slides, acquired from different scanners. Also other measures, probably more accurate and efficient, could be proposed. Finally, Haralick features calculated from GLCM could be used for automatic detection of common areas in the virtual slides, so that interaction with user would be minimized.
In this paper we proposed a method for automatic evaluation of quality of virtual slides which were created by capturing the same glass slide using different scanning devices. The method allows to select common areas of the virtual slides and then chooses fragments of these areas which are analyzed. Quality is evaluated and compared by acquiring Gray-Level Co-occurrence Matrices from the fragments and aggregating Haralick features calculated from the matrices, like contrast or entropy. The method was implemented and tested on two sets of virtual slides, captured by two different scanning devices. The results of the experiment are consistent with opinions of the pathologists about virtual slides’ quality and with visual assessment of quality of the selected fragments.
This article has been published as part of Diagnostic Pathology Volume 6 Supplement 1, 2011: Proceedings of the 10th European Congress on Telepathology and 4th International Congress on Virtual Microscopy. The full contents of the supplement are available online at http://www.diagnosticpathology.org/supplements/6/S1
- Hall-Beyer M: GLCM Texture: A Tutorial. 2007, [http://www.fp.ucalgary.ca/mhallbey/tutorial.htm]Google Scholar
- Mikkilineni AK, Chiang PJ, Ali GN, Chiu GTC, Allebach JP, Delp EJ: Printer Identification Based on Graylevel Co-occurrence Features for Security and Forensic Applications. Proceedings of Security, Steganography, and Watermarking of Multimedia Contents VII, San Jose, California, USA, January 17-20, 2005. Edited by: Delp EJ, Wong PW. 2005, Proceedings of SPIE, 5681: 430-440.Google Scholar
- Haralick RM, Shanmugam K, Dinstein I: Textural Features for Image Classification. IEEE Transactions on Systems, Man and Cybernetics. 1973, SMC-3 (No. 6): 610-621. 10.1109/TSMC.1973.4309314.View ArticleGoogle Scholar
- Aria EH, Saradjian MR, Amini J, Lucas C: Generalized Cooccurrence Matrix to Classify Irs-1d Images Using Neural Network. International Archives of Photogrammetry Remote Sensing and Spatial Information Sciences. 2004, 35 (Part 7): 117-122.Google Scholar
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.