Transfer learning with open Web data.

Details

Author: Xiang, Wei
Degree: Doctor
Year: 2012
Advisor: Yang, Qiang
Institution: Hong Kong University of Science and Technology
Department: HKUST
ISBN: 9781267848512
CBH: 3548956
Country: China
Language: English
FileSize: 3340010
Pages: 191

Abstract

In recent years, transfer learning has been applied to a variety of real-world application domains, including text classification, image classification, link prediction, activity recognition, and social network analysis. Transfer learning is particularly useful when only limited labeled data are available in a target domain, so that we must consult one or more auxiliary (source) domains to gain insight into how to solve the target problem. Thus, the key to successful knowledge transfer is that one or more "right" sets of source data be provided by the problem designer at learning time. However, it is very difficult to identify a proper set of source data. An intuitive question is whether we can seek the needed source data directly from the open Web. In this thesis, we study how to extend existing transfer learning techniques to cope with the need for transfer learning from massive and noisy Web data. The main contribution of this thesis is that we use two popular applications as prototypes and investigate their applicability and the difficulties of Web-based transfer learning. We focus on the following four research issues: (1) transfer over an information gap; (2) transfer from heterogeneous data; (3) transfer with partially labeled correspondence; (4) selective transfer from massive and noisy sources. For each of these issues, we first conduct an extensive study of the difficulty of the problem, and then propose a series of effective solutions accordingly. Moreover, to cope with the need to manipulate massive Web data as the source, we also investigate how to make our transfer learning models scalable with the assistance of distributed computing techniques. We apply these methods to two diverse applications, text classification and link prediction, and achieve promising results.
Experimental results show that our methods can successfully benefit from the truly useful information contained in the Web, while minimizing the risks caused by the massive and noisy nature of the open Web.
