Abstract
In recent years, tensor factorization models have been used for response prediction in real-time bidding to capture complicated feature interactions among multiple aspects, such as the user, publisher, and advertiser. Response prediction in real-time bidding suffers from severe data sparsity and cold-start problems, particularly for ad conversion rate prediction. These problems are difficult to address when only one or a few types of information are considered; the various types of heterogeneous information must be integrated simultaneously. This paper proposes a comprehensive solution for integrating heterogeneous information into the tensor factorization model. The solution applies different integration strategies and implementation methods to different types of information, depending on their properties, category, structure, form, and function. It alleviates the data sparsity and cold-start problems that demand-side platforms face in response prediction, and improves the reliability and accuracy of tensor-factorization-based prediction in real-time bidding systems. On the selected datasets, the proposed solution achieves a significant improvement in response prediction over the baseline methods.