Tracking Call Path of Online Advertisement Based on Jalangi
  • Original title (Chinese): 基于Jalangi的广告代码调用路径追踪
  • Authors: XU Lei (许蕾); LIU Rui-Cheng (刘蕊成); CHEN Gui-Mei (陈贵美); ZHAO Chen (赵晨); ZHANG Wei-Feng (张卫丰)
  • Keywords: dynamic instrumentation; call path; ad code analysis
  • Journal code (Chinese title abbreviation): RJXB
  • Journal: Journal of Software
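  • Affiliations: Department of Computer Science and Technology, Nanjing University; School of Computer Science, Nanjing University of Posts and Telecommunications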
  • Publication date: 2019-07-15
  • Publisher: Journal of Software (软件学报)
  • Year: 2019
  • Volume: 30
  • Funding: National Basic Research Program of China (973 Program) (2014CB340702); National Natural Science Foundation of China (61272080, 91418202, 61403187)
  • Language: Chinese
  • File name: RJXB201907015
  • Page count: 15
  • Issue: 07
  • CN: 11-2560/TP
  • Pages: 228-242
Abstract
With the rapid development of the Internet, online advertising (ads) has become one of its most important business models. Online ads help fund Web applications, but they also bring problems such as leaking users' private information and degrading the page-browsing experience. Studying online ads systematically requires the complete call path of the ad generation process. However, the JavaScript files loaded into a page are large, the function call chains are long, and the code along the path is usually minified and obfuscated, so the call path is difficult to obtain through static analysis. This study analyzes how dynamic ads are generated and tracks their call paths dynamically: it instruments the relevant code, uses function parameters to pass call information, and records the call information inside each iframe. By matching and merging the information from multiple iframes, it generates a complete ad call path and identifies the operation used to insert the ad. Experiments on 21 real websites show that the method obtains dynamic ad-loading information and generates ad code call paths that static methods cannot, with acceptable performance overhead.
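The matching-and-merging step described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation and not the Jalangi API: the record shape (`frameId`, `calls`, `child`, `op`), the function name `mergeCallPaths`, and the sample data are all hypothetical, and the sketch assumes each iframe creates at most one child iframe, so the frames form a simple chain.

```javascript
// Hypothetical per-iframe records (field names are illustrative, not from the paper).
// Each record lists the functions observed inside one frame and, if that frame
// inserted a child iframe, the child's id and the DOM operation used to insert it.
function mergeCallPaths(records, rootId) {
  const byId = new Map(records.map((r) => [r.frameId, r]));
  const path = [];
  const seen = new Set(); // guards against cyclic frame references
  let current = byId.get(rootId);

  while (current && !seen.has(current.frameId)) {
    seen.add(current.frameId);
    // Append the calls recorded inside this frame, tagged with the frame id.
    for (const fn of current.calls) {
      path.push(`${current.frameId}:${fn}`);
    }
    const link = current.child; // undefined for the innermost frame
    if (!link) break;
    // Record how the next frame was inserted, then continue in that frame.
    path.push(`--${link.op}--> ${link.frameId}`);
    current = byId.get(link.frameId);
  }
  return path;
}

// Made-up sample data: a top-level page and two nested ad iframes.
const records = [
  { frameId: "top", calls: ["loadAds", "renderSlot"], child: { frameId: "f1", op: "document.write" } },
  { frameId: "f1", calls: ["fetchCreative"], child: { frameId: "f2", op: "appendChild" } },
  { frameId: "f2", calls: ["showBanner"] },
];

const adCallPath = mergeCallPaths(records, "top");
console.log(adCallPath.join("\n"));
```

In this sketch frames are matched by an explicit id. The abstract says that call information is passed through function parameters, so a real tool would need some equivalent token to link a child iframe back to the call that created it.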
