Research on Remote Sensing Image Classification Techniques Based on Semi-Supervised Learning
Abstract
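Classification based on statistical pattern recognition is the principal information extraction method in remote sensing image applications. Because the statistical distribution of remote sensing image information is highly complex and random, and because operators have limited knowledge of the image to be classified and often select samples blindly, the resulting samples tend to be few in number and poorly representative, so a satisfactory classification result cannot be guaranteed. Yet current research on and applications of conventional remote sensing image classification methods overlook this problem and often evaluate a method's applicability and parameter settings on the premise that "sample selection poses no problem at all", which hinders further progress in the research and application of remote sensing information extraction.
     This work addresses the problem that manually selected samples in remote sensing image classification are few and poorly representative, and studies how to exploit the full potential of the classifier and of the image to be classified so that the classification result improves as much as possible. The thesis has two main parts. First, building on an in-depth analysis of the poor representativeness that manually selected samples often show, and taking the characteristics of remote sensing data and of its classification into account, semi-supervised learning is introduced and remote sensing image classification techniques based on it are studied. Second, for fully automatic classification and for the classification of areas where no valid samples can be obtained, effective solutions are proposed on the basis of the semi-supervised classification techniques developed for remote sensing images. The details are as follows: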
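     (1) Development of remote sensing image classification methods based on semi-supervised learning
     Starting from an analysis of the characteristics of remote sensing image data, and building on the two assumptions of semi-supervised learning that agree with the distribution of remote sensing image data, semi-supervised classification techniques for remote sensing images are studied and developed on the basis of generative models and of the transductive approach.
     In the generative-model work, the recursive formulas of the EM algorithm are derived and modified, and the modification is justified from the standpoint of remote sensing image classification. To match the characteristics and needs of remote sensing image classification, classification methods are given for two cases, one in which each class corresponds to a single mixture component and one in which a class corresponds to several mixture components, each with its own algorithm flow. The Hughes phenomenon is found to occur in semi-supervised classification as well. Based on the designed classification experiments, an empirical reference is given for the ratio of labeled to unlabeled samples used.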
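The one-class-one-component case can be illustrated with a minimal sketch of textbook semi-supervised EM for a Gaussian mixture. It is not the modified recursion derived in the thesis, which the abstract does not state. The function names (`semi_supervised_em`, `e_step`, `m_step`, `log_gauss`), the regularization term `reg`, and the choice to keep labeled responsibilities fixed as one-hot vectors are all assumptions made for illustration.

```python
import numpy as np

def log_gauss(X, mean, cov):
    """Log density of a multivariate normal at each row of X."""
    d = X.shape[1]
    diff = X - mean
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def m_step(X, R, reg):
    """Re-estimate priors, means and covariances from responsibilities R."""
    Nk = R.sum(axis=0)
    priors = Nk / Nk.sum()
    means = (R.T @ X) / Nk[:, None]
    covs = []
    for k in range(R.shape[1]):
        diff = X - means[k]
        cov = (R[:, k, None] * diff).T @ diff / Nk[k]
        covs.append(cov + reg * np.eye(X.shape[1]))  # assumed ridge term
    return priors, means, np.array(covs)

def e_step(X, priors, means, covs):
    """Posterior class probabilities of each row of X."""
    log_p = np.column_stack([
        np.log(priors[k]) + log_gauss(X, means[k], covs[k])
        for k in range(len(priors))
    ])
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def semi_supervised_em(X_lab, y_lab, X_unl, n_classes, n_iter=50, reg=1e-6):
    """One Gaussian component per class; labeled samples keep their labels."""
    R_lab = np.eye(n_classes)[y_lab]       # fixed one-hot responsibilities
    params = m_step(X_lab, R_lab, reg)     # initialize from labeled data only
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):
        R_unl = e_step(X_unl, *params)     # soft labels for unlabeled samples
        params = m_step(X_all, np.vstack([R_lab, R_unl]), reg)
    return params
```

Each class needs at least one labeled sample for the initial M-step to be defined. Pixels can then be classified by taking `e_step(X, *params).argmax(axis=1)`.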
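     Existing transductive methods for remote sensing images do not take the characteristics of remote sensing data into account, which makes the labeling of unlabeled samples inefficient and limits classification accuracy. To address this, a method for labeling unlabeled samples that suits remote sensing images is proposed. It is applied to the classification of both medium-resolution and high-resolution remote sensing images, the current mainstream. A pixel-based classification workflow and a workflow framework for segmentation-object-based semi-supervised classification of high-resolution images are given, and it is pointed out that the developed object-based method embodies transductive learning in the true sense.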
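The abstract does not describe the proposed labeling rule, so the following is only a generic sketch of the idea of progressively labeling unlabeled samples. It swaps in plain self-training with a nearest-centroid classifier as a simple stand-in. The function name `self_training_labels`, the `batch` size, and the margin-based confidence measure are hypothetical choices, not the thesis's method.

```python
import numpy as np

def self_training_labels(X_lab, y_lab, X_unl, n_classes, batch=20, max_rounds=100):
    """Label unlabeled samples a few at a time, most confident first.

    Confidence is the gap between the distances to the two nearest class
    centroids (an assumed heuristic). Requires n_classes >= 2 and at least
    one labeled sample per class. Samples not reached within max_rounds
    keep the label -1.
    """
    X_train, y_train = X_lab.copy(), y_lab.copy()
    labels = np.full(len(X_unl), -1)
    remaining = np.arange(len(X_unl))
    for _ in range(max_rounds):
        if remaining.size == 0:
            break
        # Class centroids from labeled plus already pseudo-labeled samples.
        centroids = np.array([
            X_train[y_train == k].mean(axis=0) for k in range(n_classes)
        ])
        dists = np.linalg.norm(
            X_unl[remaining][:, None, :] - centroids[None, :, :], axis=2
        )
        pred = dists.argmin(axis=1)
        ordered = np.sort(dists, axis=1)
        margin = ordered[:, 1] - ordered[:, 0]
        pick = np.argsort(-margin)[:batch]   # most confident candidates
        chosen = remaining[pick]
        labels[chosen] = pred[pick]
        X_train = np.vstack([X_train, X_unl[chosen]])
        y_train = np.concatenate([y_train, pred[pick]])
        remaining = np.delete(remaining, pick)
    return labels
```

In an object-based setting, the rows of `X_unl` would be per-segment feature vectors rather than per-pixel spectra. That mapping is likewise an assumption here.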
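     To evaluate and analyze the two developed methods more fully, the classification performance of the generative-model method and the transductive method is compared, and the applicability of each is analyzed.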
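     (2) Solving two technical problems in remote sensing applications with the developed semi-supervised classification methods
     In response to the demand in remote sensing applications for fully automatic classification and for extending the use of training samples, the applicability of semi-supervised learning to these areas is analyzed and the corresponding research is carried out.
     Fully automatic classification of remote sensing images is an important direction for large-scale, high-frequency, repeated regional remote sensing monitoring. A fully automatic classification technique based on semi-supervised learning is proposed that works by building a preset sample set. The labeled samples need not come from the image to be classified, but they represent, to some extent, the region the image covers. Fully automatic classification experiments under a regional sample-set strategy yield a feasible semi-supervised method for fully automatic classification of remote sensing images. To meet the current demand for highly automated information extraction in national sea-area use monitoring, a fully automatic semi-supervised method for extracting tidal-flat reclamation information in coastal zones is proposed.
     Extending classification samples across time and space matters greatly for emergency disaster monitoring, cross-regional remote sensing monitoring, and cross-border military monitoring. Poor representativeness is the main obstacle to using samples effectively on images that cover different times and places. To address this, a sample extension technique for semi-supervised classification of remote sensing images is proposed. The labeled samples do not come from the image to be classified at all, and it may not even be possible to determine how well they represent the region the image covers. Case studies are carried out for short-range sample extension in emergency monitoring and for long-range, cross-border sample extension for scientific research and military purposes. Based on the experimental results, the proposed method is compared with ordinary semi-supervised learning, and their similarities and differences are analyzed.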
Classification based on statistical pattern recognition is one of the most important methods for extracting information from remote sensing images. Because the statistical distribution of remote sensing information is highly complex and random, and because operators have limited knowledge of the images and often choose training samples blindly, the samples obtained are usually insufficient in number and poorly representative. A good classification result cannot be expected from such samples. Yet this problem is often ignored in both research on and application of classification methods, which hinders the development of remote sensing image classification and information extraction technology. This thesis addresses the problem of poor sample representativeness and studies how to exploit the potential of the classification methods and of the images being classified, so that classification results can be improved.
     The thesis begins with an overview of the basic theory of, and recent research on, semi-supervised learning and remote sensing image classification. The bulk of the author's contribution, which forms the main part of the thesis, is the development of semi-supervised classification methods for remote sensing images and their application to two important problems in remote sensing.
     (1) Development of semi-supervised classification methods for remote sensing images
     After analyzing the characteristics of remote sensing image data, and building on the two assumptions of semi-supervised learning that match the distribution of remote sensing image data, semi-supervised classification methods based on generative models and on transductive learning are proposed and developed for remote sensing images.
     In the study of the generative-model method, we re-derive the EM algorithm and amend its recursive formula, and we explain the reason for the amendment. To suit the characteristics and needs of remote sensing image classification, models in which one class corresponds to one component and in which one class corresponds to several components are given, and the corresponding algorithm flows are developed. We also find that the Hughes phenomenon exists in semi-supervised classification. Based on the designed classification experiments, we suggest that the ratio of unlabeled to labeled samples should be neither too large nor too small.
     In the study of transductive methods, we propose a new method for labeling unlabeled samples that fits the characteristics of remote sensing data and applications, and apply it to the classification of the medium- and low-resolution images now in common use and of high-resolution images. Finally, we give pixel-based and object-based semi-supervised classification workflows, which differ greatly from the traditional pixel-based classification method.
     At the end of this part, we compare the generative-model method and the transductive method by applying them to the classification of the same image.
     (2) Development of automatic classification and space-time extension of training samples for remote sensing images, based on the proposed semi-supervised methods
     Automatic classification is an important direction for future large-scale, high-frequency, repeated regional remote sensing monitoring. We propose a semi-supervised classification method for remote sensing images that works by establishing training sample data sets. Here the labeled samples need not come from the image to be classified, but they represent, to some extent, the area the image covers. To meet the strong demand for automatic information extraction from remote sensing images in national sea-use monitoring, we propose an automatic method for extracting coastal zone usage information, based on the semi-supervised classification method proposed in the first part.
     Space-time extension of classification samples plays an important role in emergency disaster monitoring, cross-regional and cross-border monitoring, and military applications. Poor representativeness is the main obstacle to applying training samples to images that cover different places and times. We propose a semi-supervised method to deal with this problem. The labeled samples do not come from the image to be classified at all, and may not even represent the area the image covers. Classification experiments on regional extension and cross-border extension are carried out, and the results show that our method works well.
     Finally, the methods proposed for dealing with the poor representativeness of manually selected training samples are summarized and discussed, and directions and goals for further research are given.
