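     This dissertation studies night-time pedestrian detection for automotive driver assistance. Using a vehicle platform equipped with a monocular camera, it addresses pedestrian detection that is real-time, accurate, and adaptable to changing scenes, covering key techniques including the extraction of regions of interest (ROIs), feature extraction for describing far-infrared pedestrians, and pedestrian recognition. The main contributions are as follows: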
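     2) Within a statistical-learning recognition framework, the entropy weighted histograms of oriented gradients (EWHOG) feature is proposed to describe far-infrared pedestrians. It combines the local shape information of the described target with the randomness of the local gradient distribution, so that local shape is better represented by locally dense pixel gradients or edge orientations. To address the large within-class variance caused by factors such as differing imaging scales, a three-branch support vector machine (SVM) pedestrian recognition method based on EWHOG features is proposed, and the resulting support vectors are optimized with the fast classification support vector machine (FCSVM), reducing the computation and storage required for recognition. Based on the difference in gray-level distribution between far-infrared pedestrian heads and their surrounding background, a pedestrian head verification method is proposed to further suppress false detections. Experiments show that the EWHOG feature effectively discriminates far-infrared pedestrians, and that the fast classification scheme ensures real-time operation of the detection system at the cost of a slight decrease in recognition accuracy, achieving good detection performance in both urban and suburban scenes.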
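     3) Because pedestrian detection is in essence a "rare event detection" problem, a pixel-gradient-based vertical projection method is proposed from the perspective of ROI extraction. Exploiting the fact that background regions such as sky and road surface in far-infrared images usually exhibit large-scale, high gray-level homogeneity, the method uses image gradient information to coarsely locate vertical image stripes that may contain pedestrians, avoiding a search over the entire input image. Experiments show that this method improves search efficiency in the ROI extraction stage and suppresses some candidate regions that contain only background objects. In the recognition stage, the spatial pyramid image representation is incorporated into EWHOG feature extraction: under multi-level cell partitions, far-infrared pedestrians are characterized by the entropy-weighted distribution of local histograms of oriented gradients together with their global structural information, yielding the pyramid entropy weighted histograms of oriented gradients (PEWHOG) feature. Since PEWHOG is a histogram statistical feature, pedestrian recognition is performed with an SVM classifier based on the histogram intersection kernel (HIK). To address the difficulty of collecting representative training data and the dependence of the classifier's predictive performance on the initial training data, an offline training mechanism based on bootstrapping and an early-stopping strategy is proposed.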
Pedestrian detection based on far-infrared (FIR) imagery has become an active research topic in the computer vision and pattern recognition community. FIR imagery captures targets through differences in surface temperature and thermal emissivity and does not depend on illumination conditions, which makes it suitable for capturing pedestrians in darkness and in smoke-filled scenes. It therefore has considerable potential in automotive driver assistance systems and traffic video surveillance in night-time scenarios. The wide variety of possible appearances and scales of pedestrians, caused by their non-rigid nature and highly arbitrary motion, usually leads to high within-class variance. Moreover, compared with visible-spectrum imagery, pedestrians in FIR imagery appear as blurred targets with lower resolution and less texture information. Pedestrian detection based on FIR imagery is therefore a challenging task.
     This dissertation focuses on night-time pedestrian detection for automotive driver assistance systems using a monocular FIR camera, aiming at (1) guaranteeing reliable performance for automotive applications, with both real-time implementation and high detection accuracy; and (2) handling pedestrian detection across unseen scenarios and new viewpoints. The main contents cover the extraction of regions of interest (ROIs), feature representation for FIR pedestrians, and the framework of pedestrian recognition, and can be summarized as follows:
     1) A night-time pedestrian detection method based on probabilistic template matching is proposed, in which multi-scale probabilistic templates are established according to the moving directions of pedestrians and employed to recognize potential pedestrians. The probabilistic templates alleviate the large within-class variability of pedestrians caused by changing appearance and thus describe pedestrian appearance more accurately. Because detections of the same pedestrian tend to agree across several successive frames, an object tracking and multi-frame validation module is integrated into template matching to suppress some false detections and to fill detection gaps caused by inaccurately extracted ROIs. The experimental results demonstrate that the proposed method meets real-time implementation criteria and that the resulting probabilistic templates describe pedestrians' appearance more accurately than templates based on pedestrian gait patterns.
     2) Following a learning-based detection framework, we first propose entropy weighted histograms of oriented gradients (EWHOG) to describe FIR pedestrians effectively. By considering both the local object shape and the degree of disorder of the local oriented gradient distribution, EWHOG places more emphasis on the distribution of local intensity gradients produced by local object shape. To reduce the within-class variance of objects located at different distances, a three-branch classifier combining EWHOG features and a support vector machine (SVM) is presented to recognize pedestrians. To reduce the computational and storage overhead, the resulting support vectors are optimized using the fast classification support vector machine (FCSVM). A further validation phase is then proposed to suppress some false detections according to the intensity difference between FIR pedestrians' heads and their adjacent regions. Experiments show that the proposed EWHOG is more appropriate for distinguishing FIR pedestrians; the fast pedestrian recognition framework guarantees higher implementation efficiency, and the results in both urban and suburban scenarios demonstrate acceptable detection performance at the cost of only a slight decrease in detection accuracy.
     3) Pedestrian detection is inherently a rare-event detection task, in which a few pedestrians must be located among enormous background regions in the image sequences. This dissertation therefore proposes a pre-segmentation method called pixel-gradient oriented vertical projection to efficiently locate the vertical image stripes that probably contain FIR pedestrians, which avoids a dense search over the whole input image. It relies on the observation that the ground and sky in FIR images usually appear as large homogeneous regions, which makes it possible to perform pixel-gradient oriented vertical projection using the gradient information. Experimental results indicate that the pre-segmentation method significantly improves the speed of ROI extraction and helps to filter out some negative ROIs. To capture both the local object shape described by the entropy weighted distribution of oriented gradient histograms and its pyramid spatial layout, a novel feature, pyramid entropy weighted histograms of oriented gradients (PEWHOG), is proposed to describe FIR pedestrians. PEWHOG is then fed to a three-branch structured SVM classifier using the histogram intersection kernel (HIK). An off-line training procedure combining bootstrapping with an early-stopping strategy is proposed to generate a more robust classifier by iteratively exploiting hard negative samples, which also addresses the dependence of the resulting classifier's generalization ability on the initial training data.
     4) Under a traditional learning-based pedestrian detection framework, an FIR pedestrian classifier trained on data extracted from one scenario may have difficulty detecting pedestrians correctly in another, distinct scenario because of the inevitable disparity between the distributions of the training data and the test data. Moreover, it is expensive and sometimes difficult to label sufficient new training data from target domains to re-train a scenario-specific classifier. To this end, this dissertation proposes a novel Boosting-style algorithm for data-level transfer learning, termed DTLBoost, to adapt FIR pedestrian detection to distinct scenarios efficiently and effectively while requiring only a small amount of newly labeled training data from the target domains. To achieve better Boosting-style ensembles for inductive transfer learning, the degree of classification disagreement is formulated explicitly and incorporated into the weight updating rules of the training samples. This helps to select the samples in the auxiliary data with positive transferability and encourages different base learners to learn different parts or aspects of the target data. Extensive experiments, including performance evaluation at both the classifier level and the system level, have been conducted to validate the effectiveness of the proposed method using our FIR pedestrian dataset and the OSU thermal pedestrian dataset. The results demonstrate that the proposed method can substantially improve the detection performance across distinct scenarios, i.e. when adapting to both new scenes and new viewpoints.
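
The entropy weighting behind the EWHOG feature of contribution 2) can be illustrated with a minimal sketch. The abstract does not state the weighting formula, so the choice below (one minus the normalized Shannon entropy of a cell's orientation histogram, so that cells dominated by a few edge directions count more) is an assumption made for illustration only. The function names, the 9-bin unsigned-orientation histogram, and the central-difference gradients are likewise hypothetical and are not taken from the dissertation.

```python
import math

def orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a list of rows of gray values; border pixels are skipped
    because central differences need both neighbours.
    """
    hist = [0.0] * n_bins
    height, width = len(cell), len(cell[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # fold into [0, pi)
            b = min(int(angle / math.pi * n_bins), n_bins - 1)
            hist[b] += mag
    return hist

def entropy_weight(hist):
    """ASSUMED weight: 1 - H / log(n_bins), in [0, 1]; needs n_bins >= 2.

    A histogram concentrated in one bin gets weight 1, a uniform one
    gets weight 0, and an empty one gets weight 0.
    """
    total = sum(hist)
    if total == 0:
        return 0.0
    probs = [v / total for v in hist if v > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(hist))

def entropy_weighted_histogram(cell, n_bins=9):
    """Scale a cell's orientation histogram by its entropy weight."""
    hist = orientation_histogram(cell, n_bins)
    w = entropy_weight(hist)
    return [w * v for v in hist]
```

Under this assumed weighting, a cell containing a single clean vertical edge keeps its full histogram, while a cell whose gradients are spread evenly over all orientations is suppressed. A complete descriptor would additionally need block normalization and concatenation over cells, which are omitted here.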

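Two ingredients of contribution 3), the gradient-based vertical projection and the histogram intersection kernel, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the method as implemented in the dissertation: the projection here simply sums absolute horizontal gradients down each column, and the fixed `threshold` and all function names are hypothetical. Only the histogram intersection kernel follows its standard definition, the sum of element-wise minima of two histograms.

```python
def vertical_gradient_projection(image):
    """Sum absolute horizontal gradients down each column.

    `image` is a list of rows of gray values. Homogeneous regions such
    as sky or road contribute almost nothing to the projection.
    """
    width = len(image[0])
    proj = [0.0] * width
    for row in image:
        for x in range(1, width - 1):
            proj[x] += abs(row[x + 1] - row[x - 1])
    return proj

def candidate_stripes(proj, threshold):
    """Return (start, end) column ranges, end exclusive, where the
    projection exceeds `threshold` (a hypothetical tuning parameter)."""
    stripes, start = [], None
    for x, value in enumerate(proj):
        if value > threshold and start is None:
            start = x
        elif value <= threshold and start is not None:
            stripes.append((start, x))
            start = None
    if start is not None:
        stripes.append((start, len(proj)))
    return stripes

def histogram_intersection(a, b):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(x, y) for x, y in zip(a, b))
```

For example, an image that is uniform except for a bright vertical band yields a single stripe around that band, so later stages only need to scan those columns. The kernel value of two histogram features could then be supplied to any SVM implementation that accepts a precomputed kernel matrix.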