Multi-utility Learning: Structured-Output Learning with Multiple Annotation-Specific Loss Functions


Authors: Roman Shapovalov (19)
Dmitry Vetrov (19)
Anton Osokin (19) (20)
Pushmeet Kohli (21)
Keywords: semantic image segmentation; structured-output learning; weakly-supervised learning; loss functions
Series: Lecture Notes in Computer Science
Publication year: 2015
Volume: 8932
Issue: 1
Pages: 406-420
Full-text size: 1,006 KB
Author affiliations:
19. Lomonosov Moscow State University, Russia
20. INRIA – SIERRA Project Team, Paris, France
21. Microsoft Research, Cambridge, UK
Book title: Energy Minimization Methods in Computer Vision and Pattern Recognition
ISBN: 978-3-319-14612-6
Subject category: Computer Science
Subject topics: Artificial Intelligence and Robotics
Computer Communication Networks
Software Engineering
Data Encryption
Database Management
Computation by Abstract Devices
Algorithm Analysis and Problem Complexity
Publisher: Springer Berlin / Heidelberg
ISSN: 1611-3349

Abstract

Structured-output learning is a challenging problem; particularly so because of the difficulty in obtaining large datasets of fully labelled instances for training. In this paper we try to overcome this difficulty by presenting a multi-utility learning framework for structured prediction that can learn from training instances with different forms of supervision. We propose a unified technique for inferring the loss functions most suitable for quantifying the consistency of solutions with the given weak annotation. We demonstrate the effectiveness of our framework on the challenging semantic image segmentation problem for which a wide variety of annotations can be used. For instance, the popular training datasets for semantic segmentation are composed of images with hard-to-generate full pixel labellings, as well as images with easy-to-obtain weak annotations, such as bounding boxes around objects, or image-level labels that specify which object categories are present in an image. Experimental evaluation shows that the use of annotation-specific loss functions dramatically improves segmentation accuracy compared to the baseline system where only one type of weak annotation is used.
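
To make the idea of annotation-specific loss functions concrete, the sketch below scores one predicted segmentation against three kinds of annotation mentioned in the abstract: a full pixel labelling, image-level labels, and bounding boxes. It is an illustration only and not the loss functions of the paper: the function names, the annotation encoding, and the penalty formulas are all assumptions chosen for simplicity.

```python
from typing import List, Sequence, Set, Tuple

Labelling = List[List[int]]           # labelling[row][col] = class id
Box = Tuple[int, int, int, int, int]  # (class id, top, left, bottom, right); bottom/right exclusive


def full_labelling_loss(pred: Labelling, truth: Labelling) -> float:
    """Hamming loss: fraction of pixels whose predicted class differs from the truth."""
    total = sum(len(row) for row in truth)
    wrong = sum(p != t for prow, trow in zip(pred, truth) for p, t in zip(prow, trow))
    return wrong / total


def image_label_loss(pred: Labelling, present: Set[int]) -> float:
    """Hypothetical penalty: pixels of unlisted classes, plus listed classes that never appear.

    Assumes `present` is non-empty.
    """
    pixels = [p for row in pred for p in row]
    forbidden = sum(p not in present for p in pixels) / len(pixels)
    missing = len(present - set(pixels)) / len(present)
    return forbidden + missing


def bounding_box_loss(pred: Labelling, boxes: Sequence[Box]) -> float:
    """Hypothetical penalty: boxed-class pixels outside every box of that class, plus empty boxes.

    Assumes `boxes` is non-empty and every box lies inside the image.
    """
    height, width = len(pred), len(pred[0])
    box_classes = {b[0] for b in boxes}
    outside = 0
    for r in range(height):
        for c in range(width):
            cls = pred[r][c]
            covered = any(
                b[0] == cls and b[1] <= r < b[3] and b[2] <= c < b[4] for b in boxes
            )
            if cls in box_classes and not covered:
                outside += 1
    empty = sum(
        all(pred[r][c] != cls for r in range(top, bottom) for c in range(left, right))
        for cls, top, left, bottom, right in boxes
    )
    return outside / (height * width) + empty / len(boxes)


# Each training instance carries an annotation kind, which selects its loss.
LOSSES = {
    "full": full_labelling_loss,
    "labels": image_label_loss,
    "boxes": bounding_box_loss,
}


def annotation_loss(pred: Labelling, annotation: Tuple[str, object]) -> float:
    """Dispatch to the loss that matches the annotation kind of this instance."""
    kind, payload = annotation
    return LOSSES[kind](pred, payload)


pred = [[0, 0, 1],
        [0, 1, 1]]
truth = [[0, 0, 1],
         [0, 0, 1]]
print(annotation_loss(pred, ("full", truth)))               # one wrong pixel out of six
print(annotation_loss(pred, ("labels", {0, 1})))            # consistent with the image-level labels
print(annotation_loss(pred, ("boxes", [(1, 0, 2, 2, 3)])))  # one class-1 pixel lies outside the box
```

In this toy setup, a weakly annotated image contributes a loss that only measures consistency with what the annotation actually states, so fully and weakly labelled images can share one training objective.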
