Large image modality labeling initiative using semi-supervised and optimized clustering
  • Authors: Szilárd Vajda; Daekeun You; Sameer Antani…
  • Keywords: Semi-automatic image annotation; Medical image modality detection; Unsupervised clustering
  • Journal: International Journal of Multimedia Information Retrieval
  • Publication year: 2015
  • Publication date: June 2015
  • Volume: 4
  • Issue: 2
  • Pages: 143-151
  • Full-text size: 1,596 KB
  • Author affiliations: Szilárd Vajda (1)
    Daekeun You (1)
    Sameer Antani (1)
    George Thoma (1)

    1. National Library of Medicine, National Institutes of Health, Maryland, USA
  • Journal subjects: Multimedia Information Systems; Information Storage and Retrieval; Information Systems Applications (incl. Internet); Data Mining and Knowledge Discovery; Image Processing and Computer Vision; Computer Science, general
  • Publisher: Springer London
  • ISSN: 2192-662X
Abstract
Medical image modality detection is a key step in indexing images from biomedical articles. Traditionally, complex supervised classification methods have been used for this task. However, they rely on proportionally sized labeled training samples. With the growing availability of image data, it has become increasingly challenging to obtain reasonably accurate manual labels to train classifiers. To address this shortcoming, we propose a semi-automatic labeling strategy that reduces human annotator effort. Each image is projected into several feature spaces, and the entries in each space are clustered in an unsupervised manner. The cluster centers for each feature representation are then labeled by a human annotator, and the labels are propagated through each cluster. To find the optimal number of clusters for each feature space, the so-called “jump method” is used. The final label of an image is decided by a voting scheme that combines the opinions on that image provided by the different feature representations. The proposed method was evaluated on the ImageCLEFmed2012 data set, which contains approximately 300,000 images, and the results show that annotating <1% of the data is sufficient to correctly label 49.95% of the images. The method saved approximately 700 h of human annotation labor and the associated costs.
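
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the clustering has already been run, so cluster assignments and distortions are supplied as toy inputs. The jump method is written in its commonly cited form (transformation power equal to half the feature dimensionality), and a plain majority vote with ties left unlabeled stands in for the voting scheme, which the abstract does not specify. All function names, modality labels, and numbers are hypothetical.

```python
from collections import Counter


def jump_method(distortions, dim):
    """Pick a cluster count with the jump method (assumed standard form).

    distortions[k - 1] is the average per-dimension distortion obtained
    with k clusters (k = 1..K); dim is the feature-space dimensionality.
    """
    power = dim / 2.0  # assumed transformation power Y = dim / 2
    # Transformed distortions d_k^(-Y), with the k = 0 value defined as 0.
    transformed = [0.0] + [d ** (-power) for d in distortions]
    jumps = [transformed[k] - transformed[k - 1]
             for k in range(1, len(transformed))]
    return jumps.index(max(jumps)) + 1  # cluster counts are 1-based


def propagate_labels(assignments, center_labels):
    """Give every image the label the annotator assigned to its cluster center."""
    return [center_labels[cluster] for cluster in assignments]


def majority_vote(votes_per_space):
    """Combine per-feature-space labels; None marks a tie (an assumption)."""
    final = []
    for votes in zip(*votes_per_space):
        ranked = Counter(votes).most_common()
        tied = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
        final.append(None if tied else ranked[0][0])
    return final


# Toy distortions for k = 1..5 in a hypothetical 2-D feature space.
best_k = jump_method([10.0, 4.0, 0.5, 0.45, 0.4], dim=2)

# Four images clustered in three hypothetical feature spaces. Each dict maps
# a cluster id to the label a human annotator gave that cluster's center.
space_a = propagate_labels([0, 0, 1, 2], {0: "CT", 1: "MRI", 2: "X-ray"})
space_b = propagate_labels([1, 1, 0, 0], {0: "MRI", 1: "CT"})
space_c = propagate_labels([0, 1, 1, 2], {0: "CT", 1: "MRI", 2: "Ultrasound"})

final_labels = majority_vote([space_a, space_b, space_c])
print(best_k, final_labels)
```

In this toy run the annotator labels only eight cluster centers, yet three of the four images receive a final label. The fourth stays unlabeled because the three feature spaces disagree, which mirrors how a voting step can leave some images without a confident label.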