Homology-based method for detecting regions of interest in colonic digital images
© Nakane et al.; licensee BioMed Central. 2015
Received: 24 September 2014
Accepted: 4 March 2015
Published: 24 April 2015
A region of interest (ROI) is a part of tissue that contains important information for diagnosis. To use many image analysis methods efficiently, a technique that would allow for ROI identification is required. For the colon, ROIs are characterized by areas of stronger color intensity of hematoxylin. Since malignant tumors grow in the innermost layer, most ROIs will be located in the colonic mucosa and will be an accumulation of tumor cells and/or integrated cells with distorted architecture.
Using homology theory, our group proposed a method to estimate the contact degree of elements in a unit area of tissue. Homology is a concept that is used in many branches of algebra and topology, and it can quantify the contact degree. Due to the lack of contact inhibition of cancer cells, an area with unusual contact degree is expected to be a potential ROI.
The current work verifies the accuracy of this method against the results of pathological diagnosis, based on 1825 colonic images provided by the Osaka Medical Center for Cancer and Cardiovascular Diseases. Although we have many false positives and there is a possibility of missing undifferentiated types of cancer, this system is very effective for detecting ROIs.
The mathematical system proposed by our group successfully detects ROIs and is a potentially useful tool for differentiating tumor areas in microscopic examination very quickly. Because we use only the information from low-power field images, there is room for further improvement. This system could be used to screen for not only colon cancer but other cancers as well. More sophisticated and more efficient automated pathological diagnosis systems can be developed by integrating various techniques available today.
The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/7129390011429407.
Keywords: Pathology; Colon cancer; Computer-assisted diagnosis; The Betti numbers; Histology
Building a reliable computer-assisted pathological diagnosis system will help reduce the burden on pathologists. Various methods have been proposed, but cancer tissue is difficult to recognize because of its complex morphology. Moreover, with the development of virtual slides, biopsy samples can be easily digitized. The amount of data to be processed has increased significantly, but current systems for processing enormous databases are expensive and obtaining numerical results is time-consuming.
A region of interest (ROI) is a part of tissue that contains important information for diagnosis. Detailed and efficient numerical results could be obtained if there were a way to identify ROIs quickly from a whole-slide image and then apply established image analysis methods to them. In a typical case, tumor cells have hyperchromatic nuclei that include condensation of heterochromatin, which can be stained with hematoxylin. Furthermore, malignant tumors grow in the innermost layer; therefore, most ROIs will be located in the colonic mucosa and will be an accumulation of tumor cells and/or integrated cells with distorted architecture. Hence, we suppose that ROIs are characterized by areas of stronger color intensity of hematoxylin.
Concept of our algorithm
Lesions can be considered areas with “different contact degrees”. Since homology is a mathematical tool to quantify “the contact degree”, it is possible to apply this idea to detect a lesion area in a digital image.
Advantage of the proposed method, 1: Topological invariant
Because tissue composition is nonuniform, applying pattern recognition methods is extremely difficult. Homology theory provides topological invariants, which are quantities that do not change under continuous deformation. The Betti numbers are topological invariants. Consequently, the numerical results of the proposed method are not influenced by slight differences in shape.
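The invariance described above can be illustrated with a minimal sketch. It is not the CHomP implementation used in this work; it assumes a binary image given as rows of 0/1, treats foreground pixels as closed squares (so diagonal neighbors are connected), counts b0 as the number of foreground components, and counts b1 as the number of enclosed background holes. The names `betti_numbers`, `parse`, `square_ring`, and `bent_ring` are hypothetical.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def _count_components(mask, neighbors):
    """Count connected components of True cells using breadth-first search."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def betti_numbers(image):
    """Return (b0, b1) of a binary image given as rows of 0/1 values."""
    cols = len(image[0])
    foreground = [[bool(v) for v in row] for row in image]
    b0 = _count_components(foreground, N8)
    # Pad the background by one pixel so the outside forms a single component;
    # every other background component is an enclosed hole.
    background = [[True] * (cols + 2)]
    for row in foreground:
        background.append([True] + [not v for v in row] + [True])
    background.append([True] * (cols + 2))
    b1 = _count_components(background, N4) - 1
    return b0, b1


def parse(rows):
    """Turn strings of '0'/'1' characters into rows of integers."""
    return [[int(ch) for ch in row] for row in rows]


square_ring = parse(["00000", "01110", "01010", "01110", "00000"])
bent_ring = parse(["0000000", "0111110", "0100010",
                   "0100110", "0111100", "0000000"])
```

Both shapes are a single loop, so `betti_numbers` returns the same pair for each even though their outlines differ.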
Advantage of the proposed method, 2: Average in the unit area
Localized differences are inevitable in living tissue. The proposed method is able to evaluate the calculation results in each unit area; therefore, the results are not affected by this localized difference.
There are no specific criteria for defining the size of a unit area. It is believed that the unit area size will depend on the characteristics of a given tissue.
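A minimal sketch of evaluating a quantity in each unit area follows. The unit size, the function names, and the placeholder measure `foreground_fraction` are assumptions made for illustration; in the proposed method the per-area quantity would be derived from the Betti numbers rather than from a pixel count.

```python
def split_into_unit_areas(image, unit):
    """Yield (tile_row, tile_col, tile) for non-overlapping unit x unit tiles.

    Partial tiles at the right and bottom edges are skipped.
    """
    rows, cols = len(image), len(image[0])
    for top in range(0, rows - unit + 1, unit):
        for left in range(0, cols - unit + 1, unit):
            tile = [row[left:left + unit] for row in image[top:top + unit]]
            yield top // unit, left // unit, tile


def evaluate_per_unit_area(image, unit, measure):
    """Apply `measure` to every unit area and return {(tile_row, tile_col): value}."""
    return {(i, j): measure(tile)
            for i, j, tile in split_into_unit_areas(image, unit)}


def foreground_fraction(tile):
    """Placeholder measure: share of foreground (1) pixels in the tile."""
    total = sum(len(row) for row in tile)
    return sum(sum(row) for row in tile) / total


# Hypothetical 4 x 4 binary image evaluated with a 2 x 2 unit area.
example = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
per_area = evaluate_per_unit_area(example, 2, foreground_fraction)
```

Because each tile is summarized separately, a single unusual pixel changes only the value of its own unit area, which is the property the text relies on.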
Colonic specimens were provided by the Osaka Medical Center for Cancer and Cardiovascular Diseases. They included biopsy, endoscopic mucosal resection, and surgical specimens. Data were gathered for internal quality control on a routine basis, and all patients gave informed consent for data collection. This study was approved by the institutional review board of the Osaka Medical Center for Cancer and Cardiovascular Diseases. The specimens were stained with hematoxylin and eosin and scanned with a Nano-Zoomer 2 (Hamamatsu Photonics K. K.). The whole-slide images (WSIs) obtained from the virtual slides were divided into several bitmap images. These colonic images (magnification: ×100; total images: 1825) were processed using a conventional laptop computer (Dell Vostro, Intel Core i7-3632QM, 2.20 GHz, 4.00 GB).
The binarization parameter is determined automatically from the RGB (red-green-blue) information for each image. Because binarized images are mathematical objects, the b1 value can be calculated. CHomP was used to obtain numerical results.
Using a laptop computer, the process takes approximately 2.0–3.0 seconds per image. Since the system has not been parallelized, the computations could be made faster.
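The rule that derives the binarization parameter from the RGB information is not given here, so the sketch below substitutes a simple stand-in: the threshold is the mean gray level of the image, and pixels darker than it (such as hematoxylin-stained nuclei) become foreground. The function name `binarize_rgb`, the gray-level formula, and the sample colors are all hypothetical.

```python
def binarize_rgb(pixels):
    """Binarize rows of (r, g, b) tuples (0..255) with an image-specific threshold.

    Returns (binary_rows, threshold). The mean gray level is an assumed
    stand-in for the automatic rule, which the text does not specify.
    """
    gray = [[(r + g + b) / 3 for (r, g, b) in row] for row in pixels]
    flat = [value for row in gray for value in row]
    threshold = sum(flat) / len(flat)
    binary = [[1 if value < threshold else 0 for value in row] for row in gray]
    return binary, threshold


# Hypothetical 2 x 2 image: one dark purple pixel among pale pink ones.
dark = (60, 30, 120)
pale = (240, 200, 220)
binary, threshold = binarize_rgb([[dark, pale], [pale, pale]])
```

The resulting 0/1 grid is the kind of object that homology software can take as input to compute the Betti numbers.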
Generally, the tissues (structures) are constructed by the contact between the components. Because this method calculates the contact degree of the tissues (structures), we can apply it to many fields (cf. [7-13]).
Contingency table as the cancer detector
Pathologists are typically able to identify the cancerous region immediately. However, as a screening system, automatic detection of the region that contains the data required for pathological diagnosis (i.e., the ROI) would be useful. Thus, we herein confirm whether this system is effective in detecting the ROI.
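For reference, the usual summary measures of a 2 x 2 contingency table can be computed as in the sketch below. The counts in the example are hypothetical and are not the results of this study; the function name `contingency_metrics` is likewise an assumption.

```python
def contingency_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, and positive predictive value.

    tp/fn: diagnosed-positive images detected/missed by the system.
    fp/tn: diagnosed-negative images flagged/not flagged by the system.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Hypothetical counts, chosen only to show the arithmetic.
metrics = contingency_metrics(tp=90, fp=30, fn=10, tn=70)
```

A screening system can tolerate a moderate number of false positives as long as sensitivity stays high, because flagged regions are still reviewed by a pathologist.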
An itemized list of false positives
Cross sections of inclined glands
Contingency table as the ROI detector
There are several approaches in the literature for automatic detection of colon cancer in digital tissue images. Altunbay et al. introduced four different approaches, namely, morphological, intensity-based, textural, and structural approaches. The morphological approaches use classical geometrical properties such as size, area, and perimeter in tissue quantification. However, these approaches face a difficult segmentation problem because of the complexity of tissue images. The intensity-based approaches use the gray level or color intensities of pixels to calculate a histogram and derive statistics such as the average, standard deviation, and entropy. However, the similar color distributions of hematoxylin–eosin stain make these approaches difficult to apply. The textural approaches use pixel-level texture and so are easily affected by artificial noise. Rathore et al. categorized these approaches from a different perspective into three techniques, namely, texture analysis, object-oriented texture analysis, and spectral analysis. Their assessment revealed that none of the techniques is perfect.
In this paper, we introduced a completely different approach, that is, a homology method, using topological invariants—the Betti numbers. Our method accurately detects atypical epithelia regarded as carcinoma and high-grade adenomas that have a high nuclear-cytoplasmic ratio. The microscopic images of these tissues show increased contact between tumor cell nuclei due to their enlargement and pseudo-stratification. Consequently, the Betti numbers of these tissues are increased.
The epithelial tissues classified as ROIs in the false-positive cases all share a common trait in the form of enlarged, elongated nuclei and, occasionally, increased chromatin. In the microscopic images, the nuclei of these tissues all exhibit increased contact. This is why an algorithm designed to detect the ROI by computer reports these tissues as positives. Put differently, our technique correctly detected atypical epithelia as ROI candidates.
Conversely, neoplastic atypia in which the nuclear-cytoplasmic ratio is not particularly high, as seen in low-grade adenoma, as well as non-neoplastic regenerative atypia and the proliferative zone, were detected as false positives. A new algorithm needs to be added to identify these components.
To reduce the number of non-ROIs that are detected, it is necessary to distinguish the cross sections of inclined glands. Pathologists typically make a differential assessment while subconsciously considering the global tissue structure, and will therefore assess these components as negative. The question of how to integrate this thinking into an algorithm requires further deliberation. Furthermore, establishing a method to distinguish between neoplastic atypia and non-neoplastic atypia (regenerative atypia and proliferative zone) may lead to the development of a more practical tool. It is essential to discern whether the increase in contact is characterized by a constant nuclear polarity, in other words the same alignment, or by nuclei with disordered polarity and irregular alignment.
We obtained our results using only low-power microscopy. If conglomerations appear in the chromatin of tumor cells, the topological invariants in the nuclear region would change. Using our method in combination with high-power microscopy would therefore improve specificity. For detecting areas of undifferentiated carcinoma, a specialized pattern recognition technology should be used. Although we have assessed only colonic images, our system could be used to screen for not only colon cancer but other cancers as well. In addition, we have not yet examined the relationship between the homology values and convalescence. Because our method can be used to index cancer tissue, we can link the results with other pathological data. This will be done in a future study.
The proposed mathematical system successfully detects ROIs and is a potentially useful tool for differentiating tumor areas in microscopic examination. By combining this newly introduced method and other approaches, we expect further improvements in the automatic detection of colon cancer.
We would like to take this opportunity to thank Dr. Nagumo (Research Professor of Graduate School of Medicine, Osaka University) for her valuable advice regarding pathology. This work was supported by JSPS KAKENHI Grant-in-Aid for Scientific Research (B) Grant Number 26310209.
- Fischer AH, Jacobson KA, Rose J, Zeller R. Hematoxylin and eosin staining of tissue and cell sections. CSH Protoc. 2008;2008:prot4986.
- Nakane K, Tsuchihashi Y. A simple mathematical model utilizing a topological invariant for automatic detection of tumor areas in digital tissue images. Diagn Pathol. 2013;8(Suppl 1). doi:10.1186/1746-1596-8-S1-S27.
- Hibi T. Algebraic combinatorics on convex polytopes. Glebe, Australia: Carslaw Publications; 1992.
- Herzog J, Hibi T. Monomial Ideals. Springer-Verlag; 2010.
- Alexandrov PS. Combinatorial Topology. New York: Dover; 1998.
- CHomP [http://chomp.rutgers.edu/Project.html].
- Nakane K, Mizobe K, Santos EC, Kida K. The Quantization of the structure of fisheyes via homology method. Appl Mech Mater. 2013;307:409–14.
- Nakane K, Mizobe K, Santos EC, Kida K. Topological difference of grain composition in the WMZ (Weld Metal Zone) in low carbon steel Plates (JIS-SS400). Adv Mater Res. 2013;566:399–405. Trans Tech Publications, ISSN: 1022–6680.
- Nakane K, Kida K, Mizobe K. Homology analysis of prior austenite grain size of SAE52100 bearing steel processed by cyclic heat treatment. Adv Mater Res. 2013;813:116–9.
- Nakane K, Mizobe K, Kida K. Homology estimate of grain size measurement based on the JIS samples. Appl Mech Mater. 2013;372:116–9.
- Nakane K, Kida K, Honda T, Mizobe K. Influence of repeated quenching on bearing steel martensitic structure investigated by homology. Appl Mech Mater. 2013;372:270–2.
- Nakane K, Mizobe K, Santos EC, Kida K. Quantitative estimates of repeatedly quenched high carbon bearing steel. Appl Mech Mater. 2013;372:273–6.
- Nakane K, Santos EC, Honda T, Mizobe K, Kida K. Homology analysis of structure of high carbon bearing steel: effect of repeated quenching on prior austenite grain size. Mater Res Innov. 2014;18:33–7.
- Altunbay D, Cigir C, Sokmensuer C, Gunduz-Demir C. Color graphs for automated cancer diagnosis and grading. IEEE Trans Biomed Eng. 2010;57(3):665–74.
- Rathore S, Hussain M, Ali A, Khan A. A recent survey on colon cancer detection techniques. IEEE/ACM Trans Comput Biol Bioinform. 2013;10(3):545–63.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.