Abstract
Similarity matching against known vulnerable code is one of the effective approaches to static vulnerability discovery. The main optimization goals for this approach are to lower the false positive rate without raising the false negative rate, and to improve analysis efficiency. To address this challenge, this paper proposes a vulnerability discovery framework based on similarity matching of code slices. It studies methods for code slicing, feature extraction, and vectorization based on vulnerability key points. The core idea is to take the slice that captures the vulnerability's semantic context in the vulnerable code as a reference, and to compute the similarity between slices of the code under test and the vulnerable sample slices in order to estimate the likelihood that a vulnerability is present. The method was implemented and validated on open source projects with known vulnerabilities. Comparative experiments against existing work show that slice similarity describes the vulnerability context more accurately, that slicing improves similarity-based vulnerability discovery, and that the framework effectively reduces both the false positive rate and the false negative rate.
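The slice-matching idea can be illustrated with a minimal sketch. It assumes that slices have already been extracted as lists of source lines, that features are plain token counts, and that similarity is cosine similarity. The abstract does not specify the paper's actual features or similarity measure, so the tokenizer, the function names, and the sample slices below are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: identifiers, integer literals, single punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def vectorize(slice_lines):
    """Turn a code slice (list of source lines) into a sparse token-count vector."""
    counts = Counter()
    for line in slice_lines:
        counts.update(TOKEN_RE.findall(line))
    return counts


def cosine_similarity(a, b):
    """Cosine similarity of two count vectors; 0.0 if either vector is empty."""
    dot = sum(n * b[tok] for tok, n in a.items() if tok in b)
    norm_a = math.sqrt(sum(n * n for n in a.values()))
    norm_b = math.sqrt(sum(n * n for n in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(sample_slice, candidate_slices):
    """Score each candidate slice against the vulnerable sample, highest first."""
    reference = vectorize(sample_slice)
    scored = [
        (name, cosine_similarity(reference, vectorize(lines)))
        for name, lines in candidate_slices.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative data only: an unchecked strcpy as the vulnerable sample slice.
sample = ["char buf[16];", "strcpy(buf, input);"]
candidates = {
    "copy_name": ["char name[32];", "strcpy(name, src);"],
    "add": ["int total = a + b;", "return total;"],
}
ranking = rank_candidates(sample, candidates)
```

Here `ranking` places `copy_name` ahead of `add`, because its slice shares the declaration and `strcpy` call structure with the sample. A real pipeline would compare the top scores against a threshold before flagging code for review.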