Software Homology Detection with Dynamic Birthmarks Based on Memory Object Access Sequences
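  • English title: Software Homology Detection with Dynamic Birthmarks Based on Memory Object Access Sequences
  • Authors: 陈铜 (CHEN Tong); 赵磊 (ZHAO Lei); 王丽娜 (WANG Lina); 汪润 (WANG Run)
  • Author affiliations (English): Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, Wuhan University; School of Cyber Science and Engineering, Wuhan University
  • Keywords: software birthmarks; memory object; software homology detection
  • Chinese journal title: WHDY
  • English journal title: Journal of Wuhan University (Natural Science Edition)
  • Institutions: Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, Wuhan University; School of Cyber Science and Engineering, Wuhan University
  • Publication date: 2019-03-11 14:39
  • Publisher: Journal of Wuhan University (Natural Science Edition)
  • Year: 2019
  • Issue: v.65; No.294
  • Funding: National Natural Science Foundation of China (61876134, 61672394); National Key Research and Development Program of China (2016YFB0801100); Key Program of the Joint Funds of the National Natural Science Foundation of China (U1536204)
  • Language: Chinese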
  • Pages: WHDY201902007
  • Page count: 10
  • CN: 02
  • ISSN: 42-1674/N
  • Classification code: 78-87
Abstract
Existing methods for detecting homology between binary programs are tied to specific programming languages or environments, cope poorly with complex code obfuscation attacks, and are easily affected by dependent libraries. To address these problems, a software homology detection method based on dynamic birthmarks derived from memory object access sequences (DBMOAS) is proposed. The method treats the order in which a program accesses its data structures as a robust feature of program semantics and analyzes it, which makes it resilient to complex code obfuscation attacks. Program data structures are characterized using dynamic taint analysis, which addresses the lack of semantic representations of data structures and types in binary programs. To verify the credibility and resilience of DBMOAS, the similarity between independently developed programs with similar functions was measured under different window sizes, and the similarity between homologous samples produced by different compilers, compilation options, obfuscation methods, and version iterations was also measured. The experimental results show that the method effectively identifies homology between programs: the false positive rate in the credibility evaluation is only 6.7%, and no false negatives occur in the resilience evaluation.
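The abstract does not say how two access sequences are compared, so the following is only a minimal sketch of one common way to turn a sequence into a birthmark and score similarity. It assumes fixed-size sliding windows (k-grams) over a sequence of memory object identifiers and Jaccard similarity between the resulting sets. The function names, the string identifiers, and the choice of Jaccard similarity are illustrative assumptions, not the DBMOAS definition.

```python
from typing import Hashable, Sequence, Set, Tuple

KGram = Tuple[Hashable, ...]


def kgrams(access_seq: Sequence[Hashable], k: int) -> Set[KGram]:
    """Collect the set of length-k windows over an access sequence.

    `access_seq` stands in for a recorded sequence of memory object
    identifiers (hypothetical representation); `k` is the window size.
    """
    if k <= 0:
        raise ValueError("window size k must be positive")
    return {tuple(access_seq[i:i + k]) for i in range(len(access_seq) - k + 1)}


def jaccard(a: Set[KGram], b: Set[KGram]) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def birthmark_similarity(seq_a: Sequence[Hashable],
                         seq_b: Sequence[Hashable],
                         k: int) -> float:
    """Similarity of two access sequences under the assumed k-gram model."""
    return jaccard(kgrams(seq_a, k), kgrams(seq_b, k))


# Two hypothetical traces that differ only in the last accessed object.
trace_a = ["buf", "hdr", "buf", "tbl", "hdr"]
trace_b = ["buf", "hdr", "buf", "tbl", "out"]
similarity = birthmark_similarity(trace_a, trace_b, k=2)
```

With `k=2`, each trace yields four windows, three of which are shared, so the score is 3/5. A larger window makes the comparison stricter, which is one reason an evaluation would vary the window size.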
