In this paper, we report an analysis of high-resolution digital color images of HL tissue slides. Using a CD30 immunostaining protocol to identify malignant cells, we implement a pipeline to handle and explore image data of stained HL tissue. To the best of our knowledge, this is the first systematic application of image analysis to HL tissue slides. To illustrate the concept and methods, we analyze images of the two most common HL types, nodular sclerosis and mixed cellularity, together with reactive lymphoid tissue for comparison. The pipeline is adapted to the special requirements of whole slide images of HL tissue and identifies relevant regions that contain malignant cells.
In a preprocessing step, we separate the relevant tissue region from the background. Applying a supervised recognition method, we assign each pixel in the images to one of six predefined classes: Hematoxylin+, CD30+, Nonspecific red, Unstained, Background, and Low intensity. Local areas containing pixels assigned to the class CD30+ mark regions of interest. As expected, an increased number of CD30+ pixels is a characteristic feature of nodular sclerosis, whereas the non-lymphoma cases show a characteristically low amount of CD30+ staining. Images of mixed cellularity samples include cases with high as well as cases with low CD30+ staining.
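The pixel classification and region-of-interest steps described above can be illustrated with a minimal sketch. The text does not specify which supervised method is used, so the nearest-centroid classifier below is a stand-in chosen for brevity, not the method of the pipeline. The function names (`fit_centroids`, `classify_pixels`, `cd30_fraction_per_tile`, `regions_of_interest`), the tile size, and the threshold are all hypothetical.

```python
import numpy as np

# The six pixel classes named in the text; the index order is an assumption.
CLASSES = ["Hematoxylin+", "CD30+", "Nonspecific red",
           "Unstained", "Background", "Low intensity"]


def fit_centroids(train_rgb, train_labels, n_classes=len(CLASSES)):
    """Mean RGB color per class from labeled training pixels.

    Assumes every class has at least one training pixel.
    """
    train_rgb = np.asarray(train_rgb, dtype=float)
    train_labels = np.asarray(train_labels)
    return np.stack([train_rgb[train_labels == k].mean(axis=0)
                     for k in range(n_classes)])


def classify_pixels(image, centroids):
    """Assign each pixel of an (H, W, 3) image to its nearest class centroid."""
    pixels = image.reshape(-1, 1, 3).astype(float)
    # Euclidean distance from every pixel to every centroid: shape (H*W, n_classes)
    dists = np.linalg.norm(pixels - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1).reshape(image.shape[:2])


def cd30_fraction_per_tile(labels, tile):
    """Fraction of CD30+ pixels in each non-overlapping tile x tile block."""
    cd30 = labels == CLASSES.index("CD30+")
    rows, cols = cd30.shape[0] // tile, cd30.shape[1] // tile
    # Drop incomplete tiles at the right and bottom edges.
    cropped = cd30[:rows * tile, :cols * tile]
    return cropped.reshape(rows, tile, cols, tile).mean(axis=(1, 3))


def regions_of_interest(labels, tile=2, threshold=0.5):
    """Boolean tile mask: True where the CD30+ fraction reaches the threshold.

    The tile size and threshold are illustrative values, not taken from the text.
    """
    return cd30_fraction_per_tile(labels, tile) >= threshold
```

In practice the training pixels would come from manually annotated image regions, and a whole slide image would be processed tile by tile rather than loaded into memory at once. The per-tile CD30+ fraction is one simple way to turn pixel labels into local regions of interest.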