Chrysanthemum Phenotypic Classification Based on Semi-supervised Active Learning
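  • English title: Chrysanthemum Phenotypic Classification Based on Semi-supervised Active Learning
  • Authors: YUAN Peisen (袁培森); REN Shougang (任守纲); ZHAI Zhaoyu (翟肇裕); XU Huanliang (徐焕良)
  • Authors and affiliations (English): YUAN Peisen; REN Shougang; ZHAI Zhaoyu; XU Huanliang; College of Information Science and Technology, Nanjing Agricultural University; National Engineering and Technology Center for Agriculture; Superior School of Technical Engineering and Telecommunication Systems, Technical University of Madrid
  • Keywords: chrysanthemum phenotype classification; semi-supervised learning; graph model; one-hot encoding; active learning; entropy maximization
  • English keywords: chrysanthemum phenotype classification; semi-supervised learning; graph model; one-hot encode; active learning; entropy maximum
  • Chinese journal title: NYJX
  • English journal title: Transactions of the Chinese Society for Agricultural Machinery
  • Affiliations: College of Information Science and Technology, Nanjing Agricultural University; National Engineering and Technology Center for Information Agriculture; Superior School of Technical Engineering and Telecommunication Systems, Technical University of Madrid
  • Publication date: 2018-06-13 19:40
  • Publisher: 农业机械学报 (Transactions of the Chinese Society for Agricultural Machinery)
  • Year: 2018
  • Issue: v.49
  • Funding: National Natural Science Foundation of China (61502236); Fundamental Research Funds for the Central Universities (KYZ201752, KJQN201651)
  • Language: Chinese
  • Pages: NYJX201809003
  • Page count: 8
  • CN: 09
  • ISSN: 11-1964/S
  • Classification number: 34-41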
Abstract
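Because manual, expert-driven classification has inherent limitations, phenotype-based chrysanthemum classification suffers from low efficiency. This paper applies semi-supervised active learning: starting from already classified chrysanthemum data, it exploits the information provided by unlabeled chrysanthemum samples to build a phenotype classification model, improving both classification quality and efficiency. The model can automatically use unlabeled samples to improve classification quality without relying on external interaction. To train the model, phenotypic feature data of chrysanthemums were collected, phenotype categories were annotated, and encoding techniques for the categorical attribute features were studied. On this dataset, semi-supervised learning based on graph label propagation was used to model the unlabeled chrysanthemum data. To make semi-supervised classification more effective, active learning was applied on top of label propagation, using a maximum-entropy strategy to select hard-to-identify samples and thereby improve classification quality. Experiments and comparative analyses on this dataset show that the method makes good use of unlabeled chrysanthemum samples to improve classification accuracy: as the labeled percentage rises from 6.25% to 23%, recognition accuracy exceeds 0.7, and at a labeled percentage of 81.25% the average recognition accuracy and recall reach 0.91 and 0.88, respectively.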
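The first two steps named in the abstract, encoding categorical phenotype attributes and propagating labels over a graph, can be illustrated with a minimal sketch. This is not the authors' implementation: the attribute names (`petal_type`, `color`), the RBF similarity with parameter `gamma`, and the clamped iterative update are assumptions chosen for illustration, since this record does not describe the paper's graph construction.

```python
import math

def one_hot_encode(rows, columns):
    """One-hot encode categorical rows; returns (vectors, feature_names)."""
    # Sorted set of values observed in each column (hypothetical attributes).
    vocab = {c: sorted({r[c] for r in rows}) for c in columns}
    names = [f"{c}={v}" for c in columns for v in vocab[c]]
    vectors = [[1.0 if r[c] == v else 0.0 for c in columns for v in vocab[c]]
               for r in rows]
    return vectors, names

def rbf_affinity(X, gamma=1.0):
    """Dense graph: W[i][j] = exp(-gamma * ||xi - xj||^2), zero diagonal."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                W[i][j] = math.exp(-gamma * d2)
    return W

def label_propagation(W, labels, n_classes, n_iter=100):
    """Propagate labels over W; labels[i] is a class index or None.

    Assumes at least two samples, so every row of W has a positive sum.
    """
    n = len(W)
    # Row-normalize the affinities into transition probabilities.
    T = [[w / sum(row) for w in row] for row in W]
    # Unlabeled nodes start from a uniform class distribution.
    F = [[1.0 / n_classes] * n_classes for _ in range(n)]

    def clamp(F):
        # Labeled nodes are reset to their known one-hot distribution.
        for i, y in enumerate(labels):
            if y is not None:
                F[i] = [1.0 if k == y else 0.0 for k in range(n_classes)]
        return F

    F = clamp(F)
    for _ in range(n_iter):
        F = clamp([[sum(T[i][j] * F[j][k] for j in range(n))
                    for k in range(n_classes)] for i in range(n)])
    return F

rows = [
    {"petal_type": "flat", "color": "yellow"},     # labeled, class 0
    {"petal_type": "flat", "color": "yellow"},     # unlabeled
    {"petal_type": "tubular", "color": "purple"},  # labeled, class 1
    {"petal_type": "tubular", "color": "purple"},  # unlabeled
]
labels = [0, None, 1, None]
X, names = one_hot_encode(rows, ["petal_type", "color"])
F = label_propagation(rbf_affinity(X), labels, n_classes=2)
print([max(range(2), key=row.__getitem__) for row in F])  # → [0, 0, 1, 1]
```

Each unlabeled sample ends up with a class distribution rather than a hard label, which is what an uncertainty-based query strategy needs as input.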
Phenotype-based classification plays an essential role in plant research. Chrysanthemum has considerable economic and medicinal value and exhibits morphological and genetic diversity. Owing to the limitations of manual classification by experts and to this genetic diversity, phenotype-based classification remains challenging. Machine learning and artificial intelligence technologies and applications are developing rapidly. Using machine learning, semi-supervised learning was employed as an effective way to improve classification performance. The method was based on label propagation over a graph model combined with active learning. A small number of classified chrysanthemum samples and a large number of unlabeled samples were exploited to improve classification accuracy. The method can automatically make use of unlabeled samples to improve the quality of chrysanthemum classification without relying on external interaction. Chrysanthemum phenotypic data were collected to train the learning model, and the category information was annotated manually. To exploit the categorical attributes, their encoding was also studied. Semi-supervised learning with graph-based label propagation was applied to the unlabeled chrysanthemums. To improve the effectiveness of semi-supervised classification, an active learning technique based on the entropy maximization strategy was applied to select difficult-to-identify samples and further improve classification performance. Extensive experiments and comparisons were conducted. The experimental results showed that unlabeled chrysanthemum samples can improve classification accuracy remarkably: as the labeled ratio increased from 6.25% to 23%, the recognition accuracy rapidly reached 0.7, and the average recognition accuracy and recall rate reached 0.91 and 0.88, respectively, when the labeled ratio was 81.25%. In conclusion, semi-supervised learning for the intelligent identification and effective management of chrysanthemum flowers is of great theoretical and practical significance for the study of chrysanthemum phenotypes.
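The entropy maximization strategy mentioned in the abstract selects the samples whose predicted class distribution is most uncertain. A minimal sketch follows, assuming each unlabeled sample already has a class-probability vector from some classifier; the function names, the `batch_size` parameter, and the toy probabilities are hypothetical, and the paper's actual query procedure is not specified in this record.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in nats; terms with p == 0 contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_queries(pred_probs, unlabeled_idx, batch_size=1):
    """Return the unlabeled indices with the highest predictive entropy."""
    ranked = sorted(unlabeled_idx,
                    key=lambda i: shannon_entropy(pred_probs[i]),
                    reverse=True)
    return ranked[:batch_size]

# Toy class distributions for four samples (illustrative values only).
pred_probs = [
    [1.00, 0.00, 0.00],  # index 0: already labeled, not a candidate
    [0.90, 0.05, 0.05],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
    [0.60, 0.30, 0.10],  # moderately uncertain
]
unlabeled_idx = [1, 2, 3]
print(select_queries(pred_probs, unlabeled_idx, batch_size=2))  # → [2, 3]
```

In an active learning loop, the selected indices would be sent for annotation, added to the labeled set, and the model retrained before the next round of selection.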
