Towards Categorization and Pose Estimation of Sets of Occluded Objects in Cluttered Scenes from Depth Data and Generic Object Models Using Joint Parsing
Abstract
This work addresses the task of categorizing and estimating the six-dimensional poses of all visible and partly occluded objects in a scene from depth image information, in the absence of ground truth training examples and exact geometrical models of the objects. A novel multi-stage algorithm is proposed to perform this task. It first estimates object category probabilities for each depth pixel using local depth features computed from multiple viewpoints. It then generates a large set of object category and pose pairs, and reduces this set via joint parsing to the subset that best matches the observed scene depth and per-pixel object category probabilities while minimizing the physical overlap between objects within the subset. A decision forest is trained on synthetic data and used to estimate the pixel category probabilities, which are then used to generate a set of pose estimates for all categories. Finally, a combinatorial optimization algorithm performs the joint parsing to find the best subset of poses. The algorithm is applied to the challenging Heavily Occluded Object Challenge data set, which contains depth data of sets of objects placed on a table and generic object models for each category, but does not include registered RGB data or human annotations for training. It is tested on difficult scenes containing 10 or 20 objects and successfully categorizes and localizes 29% of objects. The joint parsing algorithm successfully categorizes and localizes 56% of objects when ground truth poses are added to the set of pose estimates.
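
The subset selection step can be illustrated with a minimal sketch. The abstract does not name the combinatorial optimizer, so the greedy procedure below is only a stand-in, not the authors' method. It assumes each candidate pose already has a scalar fit score (how well it explains the observed depth and per-pixel category probabilities) and that a pairwise physical overlap value is available. The names `greedy_parse`, `scores`, `overlap`, and `overlap_weight` are hypothetical.

```python
from typing import List, Sequence


def greedy_parse(scores: Sequence[float],
                 overlap: Sequence[Sequence[float]],
                 overlap_weight: float = 1.0) -> List[int]:
    """Greedily pick candidate poses with high fit and low mutual overlap.

    scores[i]     -- assumed fit score of candidate i against the scene
    overlap[i][j] -- assumed physical overlap between candidates i and j
    This is an illustrative stand-in for the paper's unspecified optimizer.
    """
    selected: List[int] = []
    remaining = set(range(len(scores)))

    def gain(i: int) -> float:
        # Marginal gain: fit score minus the penalty for overlapping
        # with candidates that have already been selected.
        penalty = sum(overlap[i][j] for j in selected)
        return scores[i] - overlap_weight * penalty

    while remaining:
        # Iterate in sorted order so ties resolve to the lowest index.
        best = max(sorted(remaining), key=gain)
        if gain(best) <= 0:
            break  # no remaining candidate improves the objective
        selected.append(best)
        remaining.remove(best)
    return selected


# Three candidates; 0 and 1 occupy nearly the same space.
scores = [5.0, 4.0, 3.0]
overlap = [
    [0.0, 4.5, 0.0],
    [4.5, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
print(greedy_parse(scores, overlap))
```

In this toy input, candidates 0 and 1 collide, so only the better-scoring one is kept alongside candidate 2. A greedy pass does not guarantee the optimal subset; it only makes the trade-off between scene fit and overlap concrete.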
