# How to measure diagnosis-associated information in virtual slides

Klaus Kayser^{1} (corresponding author), Jürgen Görtler^{2}, Stephan Borkenfeld^{3} and Gian Kayser^{4}

*Diagnostic Pathology* 2011, **6(Suppl 1)**:S9

https://doi.org/10.1186/1746-1596-6-S1-S9

© Kayser et al; licensee BioMed Central Ltd. 2011

**Published:** 30 March 2011

## Abstract

The distribution of diagnosis-associated information in histological slides is often spatially dependent. Reliable selection of the slide areas that contain the most significant information for deriving the associated diagnosis is a major task in virtual microscopy. Three different algorithms can be used to select the appropriate fields of view: 1) object-dependent segmentation combined with graph theory; 2) time-series-associated texture analysis; and 3) geometrical statistics based upon geometrical primitives. These methods can be applied by a sliding technique (i.e., field of view selection with fixed frames) and by cluster analysis. Implementing these methods requires standardization of the images in terms of vignette correction and gray value distribution, as well as determination of an appropriate magnification (method 1 only). A principal component analysis of the color space can significantly reduce the necessary computation time. Method 3 is based upon gray-value-dependent segmentation followed by graph theory application using the construction of the (associated) minimum spanning tree and Voronoi’s neighbourhood condition. The three methods have been applied to large sets of histological images comprising different organs (colon, lung, pleura, stomach, thyroid) and different magnifications. The trials resulted in a reproducible and correct selection of fields of view with all three methods. The different algorithms can be combined into a basic technique of field of view selection, and a general theory of “image information” can be derived. The advantages and constraints of the applied methods are discussed.

## Introduction

Virtual microscopy, i.e., working with virtual slides, can be performed in two different ways: 1) interactive virtual microscopy and 2) automated virtual microscopy [1, 2]. Interactive virtual microscopy translates the pathologist’s work with conventional glass slides into the digital world and leaves all work at the microscope to the pathologist. It includes slide navigation, magnification, illumination, focus, etc. Some digital features might be added, especially the simultaneous viewing of different slides, automated storage of areas of interest (with built-in expert consultation), or the creation of labels. Automated virtual microscopy tries to transfer as many functions as possible to the computer, with the final aim that the system evaluates and proposes the most likely diagnoses [3–5]. Such a system must translate all items of the pathologist’s work into computerized algorithms. These need not work in a fully compatible manner; however, they must contain modules that reflect the corresponding work of the pathologist [6, 7]. These modules will probably work in a “time sequence order” and include, in addition to statistical procedures and classifiers, tools that provide a reproducible and constant image quality; object-, structure-, and texture-related magnifications; image analysis procedures; and field of interest recognition programs.

We describe some basic ideas and information recognition algorithms in image analysis that can be used for field of view detection in virtual slides, i.e., for determining the position and size of the image compartments that possess the strongest association with the underlying disease.

## Basic assumptions

The pathologist’s work is the derivation of a diagnosis from a microscopic image, which amounts to an image analysis algorithm combined with external (clinical) data [1, 8, 9]. The pathologist’s view focuses on specific biologically meaningful objects and their spatial arrangement (structure), which include a) normal objects (cells, nuclei, etc.), b) abnormal objects (cancer cells, etc.), c) external objects (bacteria, parasites, silica, etc.), d) preserved structure with abnormal cellular societies (inflammatory infiltrates, fibrosis, etc.), e) destroyed structure (granuloma, necrosis, etc.), and f) abnormal structures (adenocarcinoma, sarcomatous growth pattern) [10–13]. A diagnosis can be derived from a histological image by recognition and classification of the objects, the structures they form, and their spatial arrangement. It is useful to introduce different levels of structures in order to describe, for example, the infiltration of lymphocytes into a vascular wall (a vessel is of higher order than a lymphocyte because a vessel is built by a cellular sociology including endothelial cells, smooth muscle cells, a basal membrane, etc.). The details of this concept have been described in Kayser et al. [10, 14–16].

The term information is derived from the Latin word informare, which means “create by teaching”; in other words, it denotes a communication procedure between a source (image) and the (understanding) receiver (pathologist). Shannon analyzed the specific conditions of information transfer and content [17, 18]. According to his theory, information limits the broad variety of reactions of an (understanding) receiver to only one or a few appropriate ones. In other words, information is a statistical property and can be analyzed by statistical methods. Shannon introduced the term entropy as the principal measure of information, which is derived from classic thermodynamics [17, 18]. Entropy is a measure of the distance of a statistical population from its end stage, using Kolmogorov’s axiomatic approach of non-overlapping elementary events that are characterized by a probability 0 < p < 1.

Entropies ($E = \sum_i p_i \ln p_i$, written here without the conventional leading minus sign, so that the values are negative) of different systems can simply be added ($E_s = \sum_i E_i$) if there is no correlation between the elements of the different systems (so-called strong chaos). Otherwise, the more general Tsallis entropy has to be used, for which two systems combine as $E_s = E_q(1) + E_q(2) + (1-q)\,E_q(1)\,E_q(2)$ [19–21].
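The two combination rules can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the standard textbook definition of the Tsallis entropy, $(1 - \sum_i p_i^q)/(q - 1)$, which the text does not spell out, and it uses the conventional non-negative sign for both entropies. The distributions `p_a`, `p_b` and the value of `q` are arbitrary illustrative choices.

```python
import math

def shannon(p):
    """Conventional Shannon entropy -sum(p * ln p); non-negative."""
    return -sum(x * math.log(x) for x in p if x > 0)

def tsallis(p, q):
    """Standard Tsallis entropy (1 - sum(p**q)) / (q - 1).

    Falls back to the Shannon entropy for q == 1, which is its limit.
    """
    if math.isclose(q, 1.0):
        return shannon(p)
    return (1.0 - sum(x ** q for x in p)) / (q - 1.0)

def joint_independent(p_a, p_b):
    """Joint distribution of two uncorrelated systems (product of marginals)."""
    return [a * b for a in p_a for b in p_b]

# Hypothetical example distributions and entropic index.
p_a = [0.5, 0.3, 0.2]
p_b = [0.6, 0.4]
q = 1.5
joint = joint_independent(p_a, p_b)

# Shannon entropies of uncorrelated systems simply add.
shannon_joint = shannon(joint)
shannon_sum = shannon(p_a) + shannon(p_b)

# Tsallis entropies need the extra (1 - q) cross term.
t_a, t_b = tsallis(p_a, q), tsallis(p_b, q)
tsallis_joint = tsallis(joint, q)
tsallis_combined = t_a + t_b + (1 - q) * t_a * t_b

print(shannon_joint, shannon_sum)
print(tsallis_joint, tsallis_combined)
```

Each printed pair should agree up to floating-point rounding; for q = 1 the cross term vanishes and the Tsallis rule reduces to plain addition.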

## Macro- and microstages

The calculated Shannon entropies of the macrostages within the system (This is Isis) are:

This: [-0.92]; is: [-0.46]; Isis: [-0.64]; ∑ = -2.02

The entropy of the total system without macrostages {this is Isis} is -1.58,

and that based upon the macrostages alone {[this] [is] [Isis]} is -1.08.

The probabilities of the macrostages calculated from their internal entropies are:

P(this) = 1.92/5.02 = 0.38

P(is) = 1.46/5.02 = 0.29

P(Isis) = 1.64/5.02 = 0.33

This is Isis: E = 0.38 * ln 0.38 + 0.29 * ln 0.29 + 0.33 * ln 0.33 = -1.09

The differences between the macrostages are: [-0.46] + [+0.22] = - 0.18.

If we transform the sequence into the question:

Is this Isis? we will get: [+0.46] + [-0.28] = + 0.18.

The calculation of the total entropy of the (macrostage) system depends upon its structure, or, in other words, the calculation of macrostage entropies can be applied in relation to internal structures, such as sequential arrangement or spatial relationships [16, 22, 23].
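The per-word figures quoted above (-0.92, -0.46, -0.64) and the derived probabilities can be reproduced with a short script. This is a sketch under two assumptions inferred from the quoted numbers rather than stated in the text: each letter probability is its count within the word divided by the total number of letters in the system (10), ignoring case and spaces, and each macrostage probability is (1 + |E_i|) normalized over all macrostages. The function name `macrostage_entropies` is hypothetical.

```python
import math
from collections import Counter

def macrostage_entropies(words):
    """Return sum(p * ln p) for each word (macrostage).

    Assumption: p is the count of a letter inside the word divided by
    the total number of letters in the whole system.
    """
    total = sum(len(w) for w in words)
    result = []
    for w in words:
        counts = Counter(w.lower())
        result.append(sum((c / total) * math.log(c / total)
                          for c in counts.values()))
    return result

words = ["This", "is", "Isis"]
entropies = macrostage_entropies(words)

# Assumed normalization: weight of a macrostage is 1 + |E_i|.
weights = [1 + abs(e) for e in entropies]
probs = [w / sum(weights) for w in weights]

# Entropy of the macrostage system, same sign convention as the text.
system_entropy = sum(p * math.log(p) for p in probs)

for word, e, p in zip(words, entropies, probs):
    print(f"{word}: E = {e:.2f}, P = {p:.2f}")
print(f"system: E = {system_entropy:.2f}")
```

The sketch covers only the per-macrostage entropies, their probabilities, and the resulting system entropy; the structure-dependent differences between neighboring macrostages are not modeled here.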

### Entropy calculation in relation to histological images (virtual slides)

## How to refine the entropy approach?

### Definition of image associated macro- and microstages

All (interactive) diagnostic information of a digitized image is derived from biologically meaningful objects such as cells, nuclei, mitoses, vessels, etc. In other words, an analysis of the image information results in a meaning, which is a probability function of the (predefined) diagnoses and the image information. The higher the probability, the more accurate the diagnosis. The advantage of such an algorithm is the relative constancy of objects (and derived information) compared to the broad variations of images belonging to the same diagnosis [3, 8].

We can consider image information to be an entity that is primarily separate from the set of diagnoses. This implies that image information can be described as a mapping of diagnoses M(D) onto the image pixels {p(x,y,g)} with

M({Di},P) -> p{px(x,y,g)} with p{px(x,y,g)} = maximum

for the (evaluated) diagnosis D.

Using the entropy approach, we create an n-dimensional space of elementary image events and analyse their distribution in the different diseases or macrostages. It would be of formal advantage if we could define certain elementary events that are independent of the associated meaning, i.e., independent of external knowledge. In fact, this is possible by applying stochastic geometry, which has been described by Stoyan et al. [24].

Naturally, one could use the pixels as elementary events and associated spectral functions in order to create the set of elementary events. However, this approach would leave us again with the problem of handling broad image variance and low probability levels.

The basic elements (or image primitives) can also be calculated by introducing a (spatial) relationship function, usually called a neighbourhood condition, such as Voronoi’s or O’Callaghan’s condition [25–27]. The simplest case is a neighbourhood function f(x,y) with

f(x,y) = 1 iff g(x,y) > threshold and (g(x+1,y) > threshold or g(x,y+1) > threshold), and f(x,y) = 0 otherwise, i.e., two pixels are neighbors iff both of them possess a gray value above a certain predefined threshold (or within a predefined bandwidth of gray values). Naturally, a negative definition can also be applied (< threshold).

This definition allows us to define a set of primitive elements that form an object, i.e., an elementary unit of image information (object, structure, texture).

The different primitive elements include

- Isolated points (i.e., pixels without neighbors)
- Fibers (pixels possessing a “line” of neighboring pixels, with different start and end pixels)
- Circles (pixels possessing a line of neighboring pixels, with identical start and end pixel)
- Plateaus (a set of pixels with a number of neighboring pixels > 2 and connected points or lines).

Any biologically meaningful object can be broken down into a set of these four primitives; for example, a membrane consists of a line or a circle, a nucleus of a circle and at least one plateau, a non-completely segmented nucleus of lines, points, and plateaus, etc.
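The threshold-based neighbourhood condition and a first, coarse step toward the primitives can be sketched as follows. This is an illustrative simplification, not the authors' algorithm: it assumes 4-connectivity (left, right, up, down), counts thresholded neighbors per pixel, and labels each pixel by that count alone. Separating fibers from circles, or assigning the border pixels of a plateau, would require tracing the connected pixels and is omitted. The names `neighbour_counts` and `coarse_label` and the toy image are hypothetical.

```python
def neighbour_counts(gray, threshold):
    """Count, for each pixel above threshold, its 4-connected
    neighbours that are also above threshold.

    gray is a list of rows of gray values; returns {(row, col): count}.
    """
    rows, cols = len(gray), len(gray[0])
    above = [[gray[y][x] > threshold for x in range(cols)]
             for y in range(rows)]
    counts = {}
    for y in range(rows):
        for x in range(cols):
            if not above[y][x]:
                continue
            n = 0
            for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < rows and 0 <= xx < cols and above[yy][xx]:
                    n += 1
            counts[(y, x)] = n
    return counts

def coarse_label(n):
    """Per-pixel first-pass label based only on the neighbour count."""
    if n == 0:
        return "isolated point"
    if n <= 2:
        return "fiber or circle"
    return "plateau"

# Toy gray value image: one isolated pixel, one 3x3 block, one short line.
gray = [
    [9, 0, 0, 0, 0],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [9, 9, 0, 0, 0],
]
counts = neighbour_counts(gray, threshold=5)
labels = {pos: coarse_label(n) for pos, n in counts.items()}
print(labels[(0, 0)], labels[(2, 3)], labels[(4, 0)])
```

With this count-only rule the corner pixels of the 3x3 block have just two neighbors, so they would be merged into the plateau only in a second pass that follows connectivity, as the definition of plateaus above suggests.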

## Implementation

## Discussion

The development of reliable and practice-oriented scanners that scan whole glass slides has opened a new door in diagnostic surgical pathology, or tissue-based diagnosis [1, 3, 5, 6, 29–31]. The mechanical and optical problems have been solved to the extent that the new scanner generation can be successfully implemented into the workflow of routine diagnosis [9]. The next step is to open up new and attractive functions of these systems. These will include the mandatory replacement and improvement of classic microscope handling, the implementation of new viewing and measurement functions, and the search for automated diagnosis systems. They will probably start with the implementation of so-called assistants that guide the pathologist through all the possible tools. As in all such trends, the final aim would probably be an automated diagnosis system that the pathologist has to control, and that might at a much later stage control itself.

In this article we describe only one of the possible ways to build and implement such a system. Other algorithms have been successfully tested too [32–34]. The main idea is to separate the different functions that are used in the pathologist’s thinking and diagnostics, and not to be confused by the simultaneous application of algorithms that are separate in principle. When in the Middle Ages some ingenious persons tried to copy the flight of birds directly, they failed because they did not separate the lifting (upstream) forces from the forward movement (velocity). The separation of both forces led to the successful development of airplanes, which in the past had been thought never to become reality.

We have shown the reader an approach that in a similar manner separates the information contained in an image from its evaluation and interpretation by a pathologist based upon a known classification of information (diseases). Once the approach has been fully tested, a more generalized theory of transforming information into knowledge and competence in virtual microscopy is indicated.

## Acknowledgement

The financial support of the Verein zur Förderung des biologisch technologischen Fortschritts in der Medizin e.V. is gratefully acknowledged.

This article has been published as part of *Diagnostic Pathology* Volume 6 Supplement 1, 2011: Proceedings of the 10th European Congress on Telepathology and 4th International Congress on Virtual Microscopy. The full contents of the supplement are available online at http://www.diagnosticpathology.org/supplements/6/S1

## References

1. Kayser K, Molnar B, Weinstein RS: Virtual Microscopy - Fundamentals - Applications - Perspectives of Electronic Tissue-based Diagnosis. 2006, VSV Interdisciplinary Medical Publishing
2. Weinstein RS: Innovations in medical imaging and virtual microscopy. Hum Pathol. 2005, 36 (4): 317-9. 10.1016/j.humpath.2005.03.007.
3. Kayser K, et al: Towards an automated virtual slide screening: theoretical considerations and practical experiences of automated tissue-based virtual diagnosis to be implemented in the Internet. Diagn Pathol. 2006, 1: 10. 10.1186/1746-1596-1-10.
4. Kepper N: Visualization, Analysis, and Design of COMBO-FISH Probes in the Grid-Based GLOBE 3D Genome Platform. Stud Health Technol Inform. 159: 171-80.
5. Marchevsky AM, et al: The use of virtual microscopy for proficiency testing in gynecologic cytopathology: a feasibility study using ScanScope. Arch Pathol Lab Med. 2006, 130 (3): 349-55.
6. Merk MR, Knuechel R, Perez-Bouza A: Web-based virtual microscopy at the RWTH Aachen University: Didactic concept, methods and analysis of acceptance by the students. Ann Anat.
7. Schrader T, et al: The diagnostic path, a useful visualisation tool in virtual microscopy. Diagn Pathol. 2006, 1: 40. 10.1186/1746-1596-1-40.
8. Kayser K, et al: Digitized pathology: theory and experiences in automated tissue-based virtual diagnosis. Rom J Morphol Embryol. 2006, 47 (1): 21-8.
9. Lundin M, et al: A European network for virtual microscopy--design, implementation and evaluation of performance. Virchows Arch. 2009, 454 (4): 421-9. 10.1007/s00428-009-0749-3.
10. Bartels P, Weber J, L D: Machine learning in quantitative histopathology. Anal Quant Cytol Histol. 1988, 10: 299-306.
11. Bartels PH, Vooijs GP: Automation of primary screening for cervical cancer. Sooner or later?. Acta Cytol. 43 (1): 7-12.
12. Gabril MY, Yousef GM: Informatics for practicing anatomical pathologists: marking a new era in pathology practice. Mod Pathol. 23 (3): 349-58. 10.1038/modpathol.2009.190.
13. Giansanti D: Virtual microscopy and digital cytology: state of the art. Ann Ist Super Sanita. 46 (2): 115-22.
14. Kayser K: Application of structural pattern recognition in histopathology, in Syntactic and structural pattern recognition, T.P. Edited by: G. Ferraté, A. Sanfeliu, H. Bunke. 1988, Springer: Berlin Heidelberg New York, 115-135.
15. Kayser K, et al: Application of attributed graphs in diagnostic pathology. Anal Quant Cytol Histol. 1996, 18 (4): 286-92.
16. Kayser K, et al: AI (artificial intelligence) in histopathology--from image analysis to automated diagnosis. Folia Histochem Cytobiol. 2009, 47 (3): 355-61. 10.2478/v10042-009-0087-y.
17. Prigogine I: Introduction to Thermodynamics of Irreversible Processes. 1961, New York: John Wiley & Sons Inc, 2nd edition
18. Shannon C: A mathematical theory of communication. Bell Sys Tech J. 1948, 27: 379-423.
19. Pincus S: Approximate entropy as a measure of system complexity. Proc Natl Acad Sci U S A. 1991, 88: 2297-2301. 10.1073/pnas.88.6.2297.
20. Tsallis C: Entropic nonextensivity: a possible measure of complexity. Chaos, Solitons and Fractals. 2002, 13: 371-391. 10.1016/S0960-0779(01)00019-4.
21. Tsekouras GA, Tsallis C: Generalized entropy arising from a distribution of q indices. Phys Rev E Stat Nonlin Soft Matter Phys. 71: 46-144.
22. Kayser K, Kayser G, Metze K: The concept of structural entropy in tissue-based diagnosis. Anal Quant Cytol Histol. 2007, 29 (5): 296-308.
23. Voß K: Statistische Theorie komplexer Systeme I. EIK. 1960, 3: 239-244.
24. Stoyan D, Kendall WS, Mecke J: Stochastic Geometry and its Applications. 1987, Berlin: Akademie Verlag
25. O'Callaghan J: An alternative definition for neighborhood of a point. IEEE Trans Comput. 1975, 24: 1121-1125.
26. Voronoi G: Nouvelles applications des paramètres continus à la théorie des formes quadratiques, deuxième mémoire: recherches sur les parallélloèdres primitifs. J Reine Angew Math. 1902, 134: 188-287.
27. Zahn C: Graph-theoretical methods for detecting and describing graph clusters. IEEE Trans Comput. 1971, C-20: 68-86. 10.1109/T-C.1971.223083.
28. Kayser K: Analytical Lung Pathology. 1992, Heidelberg, New York: Springer
29. Kayser K, Kayser G: Virtual Microscopy and Automated Diagnosis, in Virtual Microscopy and Virtual Slides in Teaching, Diagnosis and Research, R.O. Edited by: J. Gu. 2005, Taylor & Francis: Boca Raton
30. Kumar RK, et al: Virtual microscopy for learning and assessment in pathology. J Pathol. 2004, 204 (5): 613-8. 10.1002/path.1658.
31. Yang L, et al: Virtual microscopy and grid-enabled decision support for large-scale analysis of imaged pathology specimens. IEEE Trans Inf Technol Biomed. 2009, 13 (4): 636-44. 10.1109/TITB.2009.2020159.
32. Apfeldorfer C, et al: Object orientated automated image analysis: quantitative and qualitative estimation of inflammation in mouse lung. Diagnostic Pathology. 2008, 3 (Suppl 1): S16. 10.1186/1746-1596-3-S1-S16.
33. Oger M, et al: Automated region of interest retrieval and classification using spectral analysis. Diagnostic Pathology. 2008, 3 (Suppl 1): S17. 10.1186/1746-1596-3-S1-S17.
34. Gilbertson J, Yagi Y: Histology, imaging and new diagnostic work-flows in pathology. Diagnostic Pathology. 2008, 3 (Suppl 1): S14. 10.1186/1746-1596-3-S1-S14.

## Copyright

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.