Recognition of apple targets before fruits thinning by robot based on R-FCN deep convolution neural network
(基于R-FCN深度卷积神经网络的机器人疏果前苹果目标的识别)
  • Authors: Wang Dandan (王丹丹); He Dongjian (何东健)
  • Affiliations: College of Mechanical and Electronic Engineering, Northwest A&F University; Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs; Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service
  • Keywords: image processing; algorithms; image recognition; small apple; target recognition; deep learning; R-FCN
  • Journal: Transactions of the Chinese Society of Agricultural Engineering (农业工程学报)
  • Publication date: 2019-02-08
  • Year: 2019; Volume: v.35 (No. 355); Issue: 03
  • Pages: 164-171 (8 pages)
  • CN: 11-2047/S
  • Funding: National High Technology Research and Development Program of China (863 Program) project (2013AA100304)
  • Language: Chinese
  • Record ID: NYGU201903020
Abstract
Before fruit thinning, the complex background, variable illumination, overlap and occlusion, and in particular the close color similarity between the fruits and the background foliage make apple target recognition very difficult. To recognize apple targets at this stage, an apple recognition method based on the region-based fully convolutional network (R-FCN) was proposed. Building on a study of the structures and recognition results of R-FCN models based on ResNet-50 and ResNet-101, an improved R-FCN based on ResNet-44 was designed to improve recognition accuracy and simplify the network. The network mainly consists of the ResNet-44 fully convolutional network, a region proposal network (RPN) and a region of interest (RoI) sub-network. The ResNet-44 fully convolutional network is the backbone used to extract image features; the RPN generates RoIs from the extracted features; and the RoI sub-network then recognizes and locates apple targets from the ResNet-44 features and the RoIs output by the RPN. After augmenting the captured images, 23 591 images were randomly selected as the training set and 4 739 images as the validation set to train the network and optimize its parameters. Test results of the proposed improved model on a test set of 332 images show that the method can effectively recognize apples that are overlapping, occluded by branches and leaves, blurred, or shadowed on the surface, with a recall of 85.7%, a recognition accuracy of 95.1%, a false recognition rate of 4.9% and an average speed of 0.187 s per image. Comparative experiments with 3 other methods show that the F1 of the proposed method is 16.4, 0.7 and 0.7 percentage points higher than that of Faster R-CNN, ResNet-50 based R-FCN and ResNet-101 based R-FCN, respectively, and its recognition speed is 0.010 and 0.041 s faster than that of ResNet-50 based and ResNet-101 based R-FCN, respectively. The method achieves the recognition of apple targets before fruit thinning, which is difficult for traditional methods, and can also be widely applied to the recognition of other small targets whose color is similar to the background.
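To make the backbone → RPN → RoI sub-network pipeline above more concrete, here is a minimal, hypothetical sketch of the classification branch of an R-FCN-style head in PyTorch, assuming a torchvision version that provides `ps_roi_pool`; the feature-map size, the grid size `k` and the RoI coordinates are illustrative placeholders, not values from the paper.

```python
# Hypothetical R-FCN classification head sketch (not the authors' code).
import torch
import torch.nn as nn
from torchvision.ops import ps_roi_pool

num_classes = 1          # apple; background is handled as an extra class
k = 7                    # position-sensitive grid size (illustrative)
C = num_classes + 1      # classes + background

# Backbone output (e.g. the ResNet-44 feature map): 1024 channels at 1/16 resolution.
feat = torch.randn(1, 1024, 38, 50)

# 1x1 conv producing k*k*C position-sensitive score maps, as in R-FCN.
score_conv = nn.Conv2d(1024, k * k * C, kernel_size=1)
score_maps = score_conv(feat)                              # (1, k*k*C, 38, 50)

# RoIs from the RPN, as (batch_index, x1, y1, x2, y2) in image coordinates.
rois = torch.tensor([[0, 100., 120., 260., 300.],
                     [0, 400., 200., 520., 330.]])

# Position-sensitive RoI pooling: each k x k bin reads its own group of score maps.
pooled = ps_roi_pool(score_maps, rois, output_size=k, spatial_scale=1.0 / 16)  # (2, C, k, k)

# Voting: average the k x k bins to obtain per-RoI class scores.
cls_scores = pooled.mean(dim=(2, 3))                       # (2, C)
print(cls_scores.softmax(dim=1))
```

The position-sensitive score maps are what keep the whole head fully convolutional: each of the k × k bins of an RoI votes from its own group of channels, and the vote is a simple average. (The box-regression branch, which works the same way with 4k² score maps, is omitted here for brevity.)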
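As a quick cross-check of the figures quoted above, the sketch below (not from the paper) computes precision, recall, false recognition rate and F1 from hypothetical detection counts chosen to reproduce roughly the reported 95.1% accuracy and 85.7% recall; the resulting F1 is about 0.90, consistent with the reported 0.7-percentage-point margins over the ResNet-50 and ResNet-101 based R-FCNs.

```python
# Illustrative metric computation; the raw TP/FP/FN counts are assumptions,
# since the paper reports only rates.
def detection_metrics(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp)          # "recognition accuracy" in the abstract
    recall = tp / (tp + fn)             # recall rate
    false_rate = fp / (tp + fp)         # false recognition rate = 1 - precision
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, false_rate, f1

# Counts chosen so that precision ~ 0.951 and recall ~ 0.857, as reported.
p, r, fr, f1 = detection_metrics(tp=857, fp=44, fn=143)
print(f"precision={p:.3f} recall={r:.3f} false_rate={fr:.3f} F1={f1:.3f}")
```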
        Before fruit thinning, factors such as complex background, varying illumination conditions, foliage occlusion, fruit clustering, and especially the strong similarity between apples and the background made the recognition of small apple targets very difficult. To solve these problems, a recognition method based on the region-based fully convolutional network (R-FCN) was proposed. Firstly, deep convolutional neural networks, including ResNet-50 based R-FCN and ResNet-101 based R-FCN, were studied and analyzed. A comparison of the frameworks of the 2 networks showed that the difference between them lay in the 'conv4' block: the 'conv4' block of ResNet-101 based R-FCN had 51 more layers than that of ResNet-50 based R-FCN, yet the recognition accuracy of the 2 networks was almost the same. By comparing the frameworks and recognition results of ResNet-50 based R-FCN and ResNet-101 based R-FCN, an R-FCN based on ResNet-44 was designed to improve the recognition accuracy and simplify the network. The main simplification was applied to the 'conv4' block, so that the 'conv4' block of ResNet-44 based R-FCN had 6 fewer layers than that of ResNet-50 based R-FCN. The ResNet-44 based R-FCN consisted of the ResNet-44 fully convolutional network, a region proposal network (RPN) and a region of interest (RoI) sub-network. The ResNet-44 fully convolutional network, the backbone of the R-FCN, was used to extract image features. The features were then used by the RPN to generate RoIs. After that, the features extracted by the ResNet-44 fully convolutional network and the RoIs generated by the RPN were used by the RoI sub-network to recognize and locate small apple targets. A total of 3 165 images were captured in an experimental apple orchard of the College of Horticulture, Northwest A&F University, Yangling, China. After image resizing and manual annotation, 332 images, including 85 images captured under sunny direct-sunlight conditions, 88 images captured under sunny backlight conditions, 86 images captured under cloudy direct-sunlight conditions, and 74 images captured under cloudy backlight conditions, were selected as the test set, and the other 2 833 images were used to train and optimize the network. To enrich the training set, data augmentation, including brightness enhancement and reduction, chroma enhancement and reduction, contrast enhancement and reduction, sharpness enhancement and reduction, and addition of Gaussian noise, was performed, yielding a total of 28 330 images, of which 23 591 randomly selected images formed the training set and the other 4 739 images the validation set. After training, the simplified ResNet-44 based R-FCN was tested on the test set, and the experimental results indicated that the method could be effectively applied to images captured under different illumination conditions. The method could recognize clustered apples, occluded apples, blurred apples, and apples with shadows, strong illumination or weak illumination on their surfaces. In addition, apples divided into parts by branches or petioles could also be recognized effectively. Overall, the recognition recall rate reached 85.7%. The recognition accuracy and false recognition rate were 95.2% and 4.9%, respectively. The average recognition time was 0.187 s per image. To further test the performance of the proposed method, 3 other methods, including Faster R-CNN, ResNet-50 based R-FCN and ResNet-101 based R-FCN, were compared. The F1 of the proposed method was higher by 16.4, 0.7 and 0.7 percentage points, respectively. The average running time of the proposed method was reduced by 0.010 and 0.041 s compared with that of ResNet-50 based R-FCN and ResNet-101 based R-FCN, respectively. The proposed method can achieve the recognition of small apple targets before fruit thinning, which can hardly be realized by traditional methods. It can also be widely applied to the recognition of other small targets whose features are similar to the background.
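The layer counts quoted above can be checked with a short calculation. The block configuration assumed below for ResNet-44 ([3, 4, 4, 3] bottleneck blocks) is inferred from the statement that its 'conv4' block has 6 fewer layers than that of ResNet-50; it is an assumption for illustration, not the authors' published configuration.

```python
# Sketch assuming standard bottleneck ResNets: 3 conv layers per block,
# plus the stem convolution and the final fully connected layer.
def resnet_depth(blocks):
    """blocks = number of bottleneck blocks in conv2..conv5."""
    return 3 * sum(blocks) + 2

resnet50  = [3, 4, 6, 3]    # conv4: 6 blocks  -> 18 layers
resnet101 = [3, 4, 23, 3]   # conv4: 23 blocks -> 69 layers (51 more than ResNet-50)
resnet44  = [3, 4, 4, 3]    # conv4 trimmed to 4 blocks -> 12 layers (6 fewer), assumed

for name, b in [("ResNet-50", resnet50), ("ResNet-101", resnet101), ("ResNet-44", resnet44)]:
    print(f"{name}: {resnet_depth(b)} layers total, conv4 = {3 * b[2]} layers")
```

Under these assumptions the arithmetic reproduces the 51-layer gap between the 'conv4' blocks of ResNet-101 and ResNet-50 as well as the 44-layer total that gives the simplified backbone its name.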
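The data augmentation step described above (brightness, chroma, contrast and sharpness enhancement and reduction, plus Gaussian noise) can be sketched with Pillow and NumPy as follows; the enhancement factors and noise level are illustrative guesses, not the values used in the paper.

```python
# Hedged augmentation sketch, assuming Pillow and NumPy; factors are illustrative.
import numpy as np
from PIL import Image, ImageEnhance

def augment(img: Image.Image, strong=1.3, weak=0.7, noise_sigma=10.0):
    out = []
    for enhancer in (ImageEnhance.Brightness, ImageEnhance.Color,
                     ImageEnhance.Contrast, ImageEnhance.Sharpness):
        e = enhancer(img)
        out.append(e.enhance(strong))   # enhancement
        out.append(e.enhance(weak))     # reduction
    # Additive Gaussian noise on the RGB array.
    arr = np.asarray(img).astype(np.float32)
    noisy = np.clip(arr + np.random.normal(0, noise_sigma, arr.shape), 0, 255)
    out.append(Image.fromarray(noisy.astype(np.uint8)))
    return out                          # 9 augmented copies per source image

# Each image yields 9 augmented copies; with the original, the set grows tenfold,
# matching 2 833 training/validation images becoming 28 330.
```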
