Automatic content understanding with cascaded spatial-temporal deep framework for capsule endoscopy videos


Authors: Honghan Chen^a (honghanchen@hotmail.com); Xiao Wu^a (wuxiaohk@home.swjtu.edu.cn); Gan Tao^b (gantao-1@163.com); Qiang Peng^a (qpeng@home.swjtu.edu.cn)
Keywords: Wireless capsule endoscopy; Convolutional neural network; Topographic segmentation; Content understanding; Hidden Markov model
Journal: Neurocomputing
Publication year: 2017
Publication date: 15 March 2017
Volume: 229
Issue: Complete
Pages: 77-87
Full-text size: 1187 K

Abstract

Capsule endoscopy (CE) is the first-line diagnostic tool for inspecting gastrointestinal (GI) tract diseases. Examining and managing CE videos is a tremendous task for endoscopists, so a computer-aided diagnosis system is urgently needed. In this paper, a general cascaded spatial-temporal deep framework is proposed to understand the most commonly seen contents of whole GI tract videos. First, noisy contents such as feces, bile, bubbles, and low power images are detected and removed by a Convolutional Neural Network (CNN) model. The clear images are then classified into entrance, stomach, small intestine, and colon by a second CNN. Finally, topographic segmentation of the whole video is performed with a global temporal integration strategy based on a Hidden Markov Model (HMM). Compared to existing methods, the proposed framework performs noise content detection and topographic segmentation at the same time, which significantly reduces the number of images to be checked by endoscopists and segments images of different organs more accurately. Experiments on a dataset with 630K images from 14 patients demonstrate that the proposed approach achieves promising performance in terms of both effectiveness and efficiency.
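
The three-stage cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no details of the CNNs or of the HMM parameters. The sketch assumes the two CNNs have already produced, for every frame, a noise flag and a probability vector over the four organs. The names `segment_video`, `left_to_right_transitions`, and `viterbi`, the left-to-right transition structure, the `stay` probability of 0.9, the uniform start distribution, and the use of CNN probabilities directly as emission scores are all assumptions made for illustration.

```python
import math

ORGANS = ["entrance", "stomach", "small intestine", "colon"]


def left_to_right_transitions(n_states, stay=0.9):
    """Assumed transition matrix: stay in the current organ or move to the next one."""
    trans = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i == n_states - 1:
            trans[i][i] = 1.0  # the last organ is absorbing
        else:
            trans[i][i] = stay
            trans[i][i + 1] = 1.0 - stay
    return trans


def safe_log(p):
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0.0 else float("-inf")


def viterbi(frame_probs, trans, start):
    """Return the most likely state sequence for a list of per-frame scores."""
    n = len(start)
    score = [safe_log(start[s]) + safe_log(frame_probs[0][s]) for s in range(n)]
    back = []  # back[t - 1][s] = best predecessor of state s at frame t
    for probs in frame_probs[1:]:
        new_score, pointers = [], []
        for s in range(n):
            best_prev = max(range(n), key=lambda p: score[p] + safe_log(trans[p][s]))
            new_score.append(
                score[best_prev] + safe_log(trans[best_prev][s]) + safe_log(probs[s])
            )
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def segment_video(noise_flags, organ_probs, stay=0.9):
    """Stage 1: drop noisy frames. Stages 2-3: smooth organ scores over time."""
    clear_idx = [i for i, noisy in enumerate(noise_flags) if not noisy]
    if not clear_idx:
        return {}
    n = len(ORGANS)
    trans = left_to_right_transitions(n, stay)
    start = [1.0 / n] * n  # uniform start distribution (assumption)
    path = viterbi([organ_probs[i] for i in clear_idx], trans, start)
    return {i: ORGANS[s] for i, s in zip(clear_idx, path)}


# Toy input standing in for the outputs of the two CNNs (made-up numbers).
noise_flags = [False, False, True, False, False, False, False, False]
organ_probs = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.25, 0.25, 0.25, 0.25],  # noisy frame, ignored by the later stages
    [0.05, 0.40, 0.05, 0.50],  # frame-level argmax would say "colon"
    [0.05, 0.85, 0.05, 0.05],
    [0.02, 0.08, 0.80, 0.10],
    [0.02, 0.03, 0.15, 0.80],
    [0.02, 0.03, 0.15, 0.80],
]
labels = segment_video(noise_flags, organ_probs, stay=0.9)
```

In this toy input, frame 3 on its own scores highest for colon, but the temporal model assigns it to the stomach because its neighbours are stomach frames and the assumed transitions never move backwards. This is the kind of correction a global temporal integration step is meant to provide; a real system would estimate the transition and emission parameters from annotated videos.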
