Classification of Unbalanced Agricultural Hyperspectral Data Based on SVC and Oversampling
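  • English title: Classification of Unbalanced Agricultural Hyperspectral Data Based on SVC and Oversampling
  • Authors (Chinese): 袁培森; 翟肇裕; 任守纲; 顾兴健; 徐焕良
  • Authors (English): YUAN Peisen; ZHAI Zhaoyu; REN Shougang; GU Xingjian; XU Huanliang; College of Information Science and Technology, Nanjing Agricultural University; Superior School of Technical Engineering and Telecommunication Systems, Technical University of Madrid; National Engineering and Technology Center for Agriculture
  • Keywords (translated from Chinese): hyperspectral data classification; support vector classification; oversampling; imbalanced data; SMOTE
  • English keywords: hyperspectral data classification; SVC; oversampling; imbalanced data; SMOTE
  • Chinese journal title: NYJX
  • English journal title: Transactions of the Chinese Society for Agricultural Machinery
  • Institutions: College of Information Science and Technology, Nanjing Agricultural University; Superior School of Technical Engineering and Telecommunication Systems, Technical University of Madrid; National Engineering and Technology Center for Information Agriculture
  • Publication date: 2019-04-08 16:28
  • Publisher: 农业机械学报 (Transactions of the Chinese Society for Agricultural Machinery)
  • Year: 2019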
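  • Volume: v.50
  • Funding: National Natural Science Foundation of China (61502236); Fundamental Research Funds for the Central Universities (KYZ201752, KJQN201651)
  • Language: Chinese
  • Record ID: NYJX201906029
  • Page count: 8
  • Issue: 06
  • CN: 11-1964/S
  • Pages: 265-272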
Abstract
Hyperspectral technology is widely used in managing agricultural natural resources, for example in monitoring the agroecological environment and protecting land resources. Because spectral imaging can effectively classify and identify ground objects, the classification of hyperspectral data is a central topic in hyperspectral research. Class imbalance is common in agricultural hyperspectral data, and the classification quality of minority classes matters greatly for classifying such data effectively. General-purpose classification algorithms, however, are biased toward the dominant majority classes, so minority classes tend to be submerged by them, which poses a serious challenge to the classification accuracy and recall of the minority classes.
This paper systematically studies the classification quality of minority classes in agricultural hyperspectral data. To improve that quality, the oversampling technique SMOTE is used to generate new samples for the minority classes. The effects of the new-sample generation strategy and of the minority-class sampling rate on minority-class classification results are examined, as is the degree of match between the classifier and the model on the imbalanced data set. Multi-class SVC is then applied to the resampled data set to classify the minority classes, which improves their classification quality on imbalanced agricultural hyperspectral data. The method was verified experimentally on a real data set, and different classification methods and system parameters were compared and analyzed.
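The oversampling step named in the abstract can be illustrated with a minimal sketch of the basic SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, the parameters `rate` and `k`, and the three-band toy "spectra" are assumptions made for illustration; the record does not state which generation strategy or sampling rate the paper settled on.

```python
import math
import random


def smote(minority, rate, k=5, seed=0):
    """Basic SMOTE sketch: create `rate` synthetic points per minority sample.

    minority: list of feature vectors (lists of floats) from ONE minority class.
    rate:     hypothetical sampling rate (synthetic samples per original sample).
    k:        number of nearest minority neighbours considered per sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for i, base in enumerate(minority):
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        nearest = others[:k]
        for _ in range(rate):
            neighbour = minority[rng.choice(nearest)]
            # New point: random position on the segment base -> neighbour.
            gap = rng.random()
            synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy 3-band "spectra"; the values are made up for illustration only.
minority_class = [
    [0.10, 0.40, 0.80],
    [0.12, 0.42, 0.79],
    [0.09, 0.38, 0.83],
    [0.11, 0.41, 0.81],
]
new_samples = smote(minority_class, rate=2, k=3)
print(len(new_samples), "synthetic samples generated")
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the range the minority class already spans in each band. In this sketch, `rate` and `k` stand in for the sampling rate and the generation strategy that the abstract says were studied.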
The experimental results show that the proposed method significantly improves minority-class classification on imbalanced agricultural hyperspectral data: the weighted average precision is no less than 0.82, the weighted average recall improves by 11.11% to 26.15%, and F1 improves by 5.81% to 40.85%. The method can serve as a reference for systematically improving classification on imbalanced agricultural hyperspectral data.
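The abstract reports "weight" precision, recall and F1 without defining them. One common reading, assumed here and not confirmed by the record, is a per-class score averaged with weights proportional to each class's number of true samples. The sketch below computes the metrics under that assumption; the function name `weighted_scores` and the "crop"/"weed" labels are hypothetical.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return (precision, recall, F1), averaged over classes by class support."""
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for c in sorted(support):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        predicted = sum(1 for p in y_pred if p == c)
        p_c = tp / predicted if predicted else 0.0  # per-class precision
        r_c = tp / support[c]                       # per-class recall
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        w = support[c] / total                      # weight = share of true samples
        precision += w * p_c
        recall += w * r_c
        f1 += w * f_c
    return precision, recall, f1


# Made-up labels: "crop" is the majority class, "weed" the minority class.
y_true = ["crop"] * 8 + ["weed"] * 2
y_pred = ["crop"] * 8 + ["crop", "weed"]
p, r, f = weighted_scores(y_true, y_pred)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

In this toy case only one of the two minority samples is recovered, so the minority recall is 0.5, while the support-weighted recall is 0.9. This shows how an aggregate score can hide weak minority-class performance, which is the problem the abstract targets.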
