Java程序指向分析研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
指针指向分析的主要目的是静态地获取程序在运行时刻的指针指向信息。本文设计并实现了一种上下文敏感的基于约束的Java程序指向分析算法。在此基础上,本文提出了几项针对指向分析的优化技术,以此提高算法的效率和实用性。
     本文基于Andersen算法,设计了一种有效的上下文敏感的指针指向分析算法,该算法支持继承、字段对象等语言特性。不同对象的字段在算法中被分别处理,同时,算法对复合类型的对象实现了基于字段的处理。本文用有向图描述各个方法内部的指向关系,并以此为基础计算得到带上下文信息的全局指向图。
     为了提高算法的效率和可扩展性,本文首先引入了两种优化技术:结点拓扑排序和回路侦测与消除。结点间的拓扑排序的目标是降低分析过程中的迭代次数;另一个是在线的回路侦测与消除,它与拓扑排序过程同步实现。实验数据表明,本文提出的优化技术有效提高了算法的效率。
     此外,本文引入了指向副作用分析技术,这项技术可以进一步提高上下文敏感的指向分析算法的效率。指向副作用分析通过对局部指向图的分析找出程序中无指向副作用的方法集合。无指向副作用方法指的是那些不会改变其调用上下文中的指向关系的方法。在全局指向关系的分析过程中利用无指向副作用分析的结果可以避免计算一些无关的指向关系。实验数据表明,利用指向副作用分析结果可以进一步提高指向分析的效率。
Points-to analysis mainly aims at getting the runtime points-to sets of program variables. This thesis designs and implements a context-sensitive constraint-based points-to analysis algorithm for Java programs. In addition, this thesis proposes several opitimizations which can be used to accerlerate the algorithm.
     This thesis describes the design and implementation of an efficient Andersen-style, context-sensitive points-to analysis algorithm for Java code. Our algorithm supports language features like inheritance, polymorphism, and field objects. We track the fields of individual objects separately and make the algorithm in field-sensitive style for aggregate objects. This algorithm first summarizes methods of the program under analysis using directed graphs. The main analysis algorithm uses these graphs to construct the main points-to graph.
     To improve the efficiency and scalability of the algorithm, this thesis employs two kinds of optimizations, nodes topology construction with concomitance on-line cycle detection and elimination. We perform cycle elimination on points-to graphs to reduce their sizes. Topological sort is performed simultaneously on the nodes of graph to speed up the transitive closure computationon in the main points-to graph. Experiment result shows that the efficiency of the algorithm is notably raised by using these two optimizations.
     This thesis also introduces a method which uses points-to side-effect(PSE) analysis to accelerate context sensitive points-to analysis. A PSE-free method has no mutation on points-to relation of its calling context. So it can be skipped during points-to analysis. Our method first finds PSE-free methods in the program by performing PSE analysis on the local points-to graphes of the methods. Based on these PSE analysis results, the main points-to analysis algorithm efficiently computes the points-to graph by skipping PSE-free methods. Experiment result shows that the efficiency of the algorithm can be further improved by using PSE analysis.
引文
1. Jones C. Software productivity group,2003.
    2. Bloor Research. CAST Tools Report,1996.
    3. Hatton L. Balancing static and dynamic testing:some observations from measurement. Invited as part of visiting scientists'series, Nokia Research Labs, Helsinki,2000.
    4. Naughton J. F, Ramakrishnan R., Sagiv Y, and Ullman J. D. Efficient evaluation of right-, left-, and multi-linear rules. In SIGMOD'89:Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data, pages 235-242. ACM Press,1989.
    5. Ramalingam G. The undecidability of aliasing. In:Andrew A, ed. ACM Trans.on Programming Language System, New York:ACM Press1994,16(5):14671471.
    6. Landi W. Undecidability of static analysis. In:Charles N, ed. ACM Letters onProgramming Languages and Systems, New Youk:ACMPress1992,1(4):323337.
    7. Ramalingam G.1994. The Undecidability of Aliasing. ACM Transactions on Programming Languages and Systems 16,5(Sept.),1467-1471.
    8. Larus J.R, and Hilringer P.N.1988. Detecting confilicts between structure accesses. In SIGPLAN'88 Conference on Programming Language Design and Implementation,21-34. SIGPLAN Notices.23(7).
    9. Hind M, Bruke M, Carini P, and Choi J.D. Interprocedral Pointer Alias Analysis. ACM Transactions on Programming Languages and Systems, Vol.21,No 4. July 1999, 848-894.
    10. Bodden E, Lam P, and Hendren L. Flow-sensivtive static optimizations for runtime monitors.abc TR. abc-2007-3, Sable Research Group, School of Computer Science, McGill University, July 18,2007.
    11. Emmi M., Fischer J, Jhala R. and Majumdar R. Lock allocation. In Proceedings of the 34th Annual ACM Symposium on the Principles of Programming Languages,2007.
    12. Minsky N. Towards alias-free pointers. In proc. of the European conference on Object-Oriented Programming,1996.189-209.
    13. Naik M. and Aiken A. Conditional must not aliasing of multithreaded Java programs. In proceedings of the ACM symposium on Applied Computing,2003,1068-1075.
    14. Andersen LO. Program analysis and specialization for the C programming language [Ph.D. Thesis]. DIEM:University of Copenhagen,1994.
    15. Liang D, Pennings M, Harrold MJ. Extending and evaluating flow-insensitive and context-insensitive points-to analyses for Java. In:John F, ed. Proc. of PASTE, New York: ACM Press 2001.73-79.
    16. Heintze N, Tardieu O. Ultra-Fast aliasing analysis using CLA:A million lines of C code. In:Michale B, ed. Proc. of PLDI, New York:ACM Press 2001.254-263.
    17. Whaley J, Lam M. An efficient inclusion-based points-to analysis for strictly-typed languages. In:Manuel H, ed. Proc. of the Static Analysis Symp. LNCS 2433, Berlin Heidelberg:Springer-Verlag 2002.180-195.
    18. Berndl M, Lhot'ak O, Qian F, Hendren L, Umanee N. Points to analysis using BDDs. In: Ron C, ed. Proc. of the Conf. on Programming Language Design and Implementation, New York:ACM Press 2003.103-114.
    19. Sridharan M, Gopan D, Shan LX, Bodik R. Demand-Driven points-to analysis for Java. In: Ralph J, ed. Proc. of OOPSLA, NewYork:ACM Press 2005.59-76.
    20. Steensgaards B. Points-to analysis in almost linear time. In:Hans-J B, ed. Proc. of the ACM Symp. on Principles of Programming Languages (POPL). New York:ACM Press, 1996.32-41.
    21. Das M. Unification-Based pointer analysis with directional assignments. In:Proc. of the ACM Conf. on Programming Language Design and Implementation (PLDI). New York: ACM Press,2000.35-46.
    22. Kahlon V. Bootstrapping:A technique for scalable flow and context-sensitive pointer alias analysis. In:Rajiv G, ed. Proc. of the PLDI, New York:ACM Press,2008.249-259.
    23. Ryder BG. Dimensions of precision in reference analysis ofobject-oriented programming languages.In:Hedin G,ed.Proc. of the CC 2003. LNCS 2622, Berlin Heidelberg: Springer-Verlag 2003.126-137.
    24. Liang D, Pennings M, Harrold MJ. Evaluating the impact of context-sensitivity on Andersen's algorithm for Java programs. In:Michale E, ed. Proc. of PASTE, New York: ACM Press 2005.6-12.
    25. Hind M, Pioli A. Which pointer analysis should I use? In:Mary J, ed. Proc. of ISSTA. New York:ACM Press,2000.113-123.
    26. Lhot'ak O, Hendren L. Context-Sensitive points-to analysis:Is it worth it? In:Alan M, ed. Proc. of the Int'l Conf. on Compiler Construction.LNCS 3923, Berlin Heidelberg: Springer-Verlag 2006.47-64.
    27. Kotzmann T. Escape Analysis in the Context of Dynamic Compilation and Deoptimization. Phd thesis. Johannes Kepler University Linz.2005.
    28. Ruggieri, C, and Murtagh T.P. Lifetime Analysis of Dynamically Allocated Objects. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages,285-293. San Diego, January 1988.
    29. Park Y.G, and Goldberg B. Escape Analysis on Lists. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation,116-127. San Francisco, June 1992.
    30. Deutsch A. On the Complexity of Escape Analysis. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages,358-371. Paris, January 1997.
    31. Choi J.D, Gupta M, Serrano M, Sreedhar VC, Midkiff S. Escape Analysis for Java. In Proceedings ofthe 14th Annual Conference on Object-Oriented Programming Systems, Languages and Applications,1-19,1999.
    32. Lee K, Xing F, and Midkiff S.P, Practical escape analyses:how good are they? Proceedings of the 3rd international conference on Virtual execution environments, 180-190,2007.
    33. D. Shasha and M. Snir. Efficient and correct execution of parallel programs that share memory. ACM Transactions on Programming Languages and Systems,10(2):282-312, 1988.
    34. Marron M, Dissertation, Degree of doctor of philosophy Computer Science, The University of New Mexico,2008.
    35. Ghiya R. and Hendren L.J. Is it a tree, a dag, or a cyclic graph? A shape analysis for heap-derected pointers in C. In Symposium on principles of Programming Languages, pages 1-15,1996.
    36. Gulwani S. and TIwari A. An abstract domain for analyzing heap-manipulationg low-level software. In Computer Aided Verification, pages 379-392,2007.
    37. Ghiya R, Hendren L.J. and Zhu Y. Detecting parallelism in C programs with recursive data structures. In Conference on Complier Construcion, pages 159-173,1998.
    38. Hendren L.J and Nicolau A. Parallelizing programs with recursive data strctures. IEEE Transactions on Parallel and Distributed Systems,1(1):35-47,1990.
    39. Gotsman A, Berdine J, and Cook B. Interprocedural shape analysis with separated heap abstractions. In Static Analysis ymposium, pages 240-260,2006.
    40. Guo B, Vachharajani N, and August D. Shape analysis with inductive recursion synthesis. In Conference on Programming Language Design and Implementation, pages 256-265, 2007.
    41. Berdine J, Calcagno C, Cook B, Distefano D, O'Hearn P, Wiesand T, and Yang H. Shape analysis for composite data structures. In Computer Aided Vverification, pages 178-192, 2007.
    42. Hirzel M, Diwan A, and Hertz M. Connectivity-based garbage collection. In Conference on Object-Oriented Programing, Systems, Languages, and Applications, pages 359-373, 2003.
    43. Srivastava A. Unreachable procedures in object oriented programming. ACM Letters on Programming Languages and Systems,355-364,1992.
    44. Dean, J, and Chambers C. Optimization of object-oriented programs using static class hierarchy analysis. Tech. Rep.94-12-01, Department of Computer Science, University of Washington at Seattle,1994.
    45. Dean, J, Grove, D, and Chambers C. Optimization of object-oriented programs using static class hierarchy analysis. In Proceedings of the Ninth European Conference on Object-Oriented Programming,1995, W.Oltho, Ed., Springer-Verlag,pp.77-101.
    46. Bacon D. F. Fast and Effective Optimization of Statically Typed Object-Oriented Languages. PhD thesis, Computer Science Division, University of California, Berkeley, Dec.1997. Report No. UCB/CSD-98-1017.
    47. Bacon D. F, and Sweeney P. F. Fast static analysis of C++ virtual function calls. In Proceedings of the Eleventh Annual Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'96) (San Jose, CA,1996), pp.324-341. SIGPLAN Notices 31(10).
    48. Grove D, DeFouw G, Dean J, and Chambers C. Call graph construction in object-oriented languages. In Proceedings of the Twelfth Annual Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'97) (Atlanta, GA,1997), pp.108-124. SIGPLAN Notices 32(10).
    49. Shivers O. Control-Flow Analysis of Higher-Order Languages. PhD thesis, CMU, May 1991.CMU-CS-91-145.
    50. Reynolds J. C.1969. Automatic computation of data set definitions. In Proceedings of the Information Processing Congress (IFIP). Vol.1. North-Holland, Amsterdam, the Netherlands,456-461.
    51. Jones N. D, and Muchnick S. S.1981. Flow analysis and optimization of lisp-like structures. In Program Flow Analysis:Theory and Applications, S. S.Muchnick and N. D. Jones, Eds. Prentice-Hall, Englewood Cliff,102-131.
    52. Aiken A.1999. Introduction to set constraint-based program analysis. Sci. Comput. Prog. 35,2-3,79-111.
    53. Aiken A, Andwimmers E. L.1992. Solving systems of set constraints. In Proceedings of the IEEE Symposium on Logic in Computer Science (LICS). IEEE Computer Society Press, Los Alamitos, CA,329-340.
    54. Aiken A, and Wimmers E. L.1993. Type inclusion constraints and type inference. In Proceedings of the ACM Conference on Functional Programming Languages and Computer Architecture (FPCA). ACM, New York,1-41.
    55. Heintze N.1994. Set-based analysis of ML programs. In Proceedings of the ACM Conference on Lisp and Functional Programming (LFP). ACM, New York,306-317.
    56. Heintze N, and Mcallester D.1997a. Linear-time subtransitive control flow analysis. In Proceedings of the ACM Conference on Programming Language Design and Implementation (PLDI). ACM, New York,261-272.
    57. Heintze N, and Mcallester D. A.1997b. On the cubic bottleneck in subtyping and flow analysis. In Proceedings of the IEEE Symposium on Logic in Computer Science (LICS). IEEE Computer Society Press, Los Alamitos, CA,342-351.
    58. Wagner D, Foster J. S, Brewer, E. A., and Aiken, A.2000. A first step towards automated detection of buffer overrun vulnerabilities. In Proceedings of the Network and Distributed System Security Symposium (NDSS). The Internet Society,3-17.
    59. Flanagan C.1997. Effective static debugging via componential set-based analysis. Ph.D. dissertation. Rice University.
    60. Foster J. S, Fahndrich M, and Aiken A.1997. Flow-insensitive points-to analysis with term and set constraints. Technical Report CSD-97-964, University of California, Berkeley, CA.
    61. Fahndrich M, Foster J. S, Su, Z.D, and Aiken A.1998. Partial online cycle elimination in inclusion constraint graphs. In Proceedings of the ACMConference on Programming Language Design and Implementation (PLDI). ACM, New York,85-96.
    62. Heintze N. and Tardieu O.2001b. Ultra-fast aliasing analysis using CLA:A million lines of C code in a second. In Proceedings of the ACM Conference on Programming Language Design and Implementation (PLDI). ACM, New York,254-263.
    63. Ghiya R, Lavery D, and Sehr D.2001. On the importance of points-to analysis and other memory disambiguation methods for C programs. In Proceedings of the ACM Conference on Programming Language Design and Implementation (PLDI). ACM, New York,47-58.
    64. Rountev A, Milanova A, and Ryder B. G.2001. Points-to analysis for Java using annotated constraints. In Proceedings of the ACM Conference on Object Oriented Programming Systems, Languages and Applications (OOPSLA). ACM, New York, 43-55.
    65. Lhotak O. and Hendren L. J.2003. Scaling Java points-to analysis using SPARK. In Proceedings of the Conference on Compiler Construction (CC). Lecture Notes in Computer Science, vol.2622. Springer-Verlag, New York,153-169.
    66. Pearce D. J, Kelly P. H. J, and Hankin C.2004b. Online cycle detection and difference propagation:Applications to pointer analysis. Softw. Qual. J.12,4,309-335.
    67. Landi W. Interprocedural Aliasing in the presence of Pointers. PhD thesis, Rutgers Univsersity, New Jersey, United States,1992.
    68. Horwitz S. Precise flow-insensitive may-alias analysis is NP-Hard. ACMTransactions on Programming Languages and Systems (TOPLAS),19(1):1-6, January 1997.
    69. Guyer S. Z. Incorporating Domain-Specific Information into the Compilation Process. PhD thesis, Department of Computer Science, University of Texas at Austin,2003.
    70. Emami M, Ghiya R, and Hendren L. J.1994. Context-sensitive interprocedural points-to analysis in the presence of function pointers. In Proceedings of the ACM Conference on Programming Language Design and Implementation (PLDI). ACM, New York,242-256.
    71. Wilson R. P. and Lam M. S.1995. Efficient context-sensitive pointer analysis for C programs. In Proceedings of the ACM Conference on Programming Language Design and Implementation (PLDI). ACM, New York,1-12.
    72. Hind M. and Pioli A. An empirical comparison of interprocedural pointer alias analyses. Technical Report RC 21058, IBM T.J. Watson Research Center,1997.
    73. Sharir M. and Pnueli A.Two approaches to interprocedural data flowanalysis. In S. Muchnick and N. Jones, editors, Program Flow Analysis:Theory and Applications, pages 189-234. Prentice Hall,1981.
    74. Rountev A, Milanova A, and Ryder B. G. Points-to analysis for Java using annotated constraints. In Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 43-55, October 2001.
    75. Liang D, Pennings M, and Harrold M. J. Extending and evaluating flow-insensitive and context-insensitive points-to analyses for Java. InWorkshop on ProgramAnalysis for Software Tools and Engineering, pages 73-79, June 2001.
    76. Dean J, Grove D, and Chambers C. Optimizations of object-oriented programs using static class hierarchy analysis. In European Conference on Object-Oriented Programming, pages 77-101,1995.
    77. Diwan A, Moss J.E.B, and McKinley K. Simple and effective analysis of staticallytyped object-oriented programs. In Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 292-305,1996.
    78. Shivers O. Control-flow Analysis of Higher-Order Languages or Taming Lambda. PhD thesis, Carnegie-Mellon University School of Computer Science,1991.
    79. Palsberg J. and Schwartzbach M. Object-oriented type inference. In Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 146-161, 1991.
    80. Agesen O. The Cartesian product algorithm:Simple and precise type inference of parametric polymorphism. In European Conference on Object-Oriented Programming, pages 2-26,1995.
    81. Grove D. and Chambers C.A framework for call graph construction algorithms. ACM Transactions on Programming Languages and Systems (TOPLAS),23(6),2001.
    82. Milanova A, Rountev A, and Ryder B. Parameterized object-sensitivity for points-to and side-effect analyses for Java. In International Symposium on Software Testing and Analysis, pages 1-11,2002.
    83. Whaley J. and Lam M.S. An efficient inclusion-based points-to analysis for strictly-typed languages. In Static Analysis Symposium,2002.
    84. Whaley J, and Lam M.S. Cloning-Based context-sensitive pointer alias analysis using binary decision diagrams. In:William P, ed. Proc. of PLDI, New York:ACM Press 39(6), 2004.131-144
    85. Lattner C, Lenharth A, Adve V. Making context-sensitive points-to analysis with heap cloning practical for the real world. In:Jeanne F, ed. Proc. of PLDI, New York:ACM Press 2007,42(6):278-289.
    86. Pearce D.J, Kelly P.H.J, Hankin C. Efficient field-sensitive pointer analysis for C. In: Cormac F, ed. Proc. of PASTE, New York:ACM Press 2004.37-42.
    87. Aho A.V, Sethi R, and Ullman J. D.1986. Compilers:Principles, Techniques, and Tools. Addison-Wesley, Reading, MA.
    88. Banerjee U.1988. Dependence Analysis for Supercomputing. Kluwer Academic Publishers, Norwell, MA.
    89. Polychronopolous C. D.1988. Parallel Programming and Compilers. Kluwer Academic Publishers.
    90. Chatterjee R.1999. Modular data-flow analysis of statically typed object-oriented programming languages. Ph.D. Thesis, Department of Computer Science, Rutgers University.
    91. GallagherK. and Lyle J.1991. Using program slices in software maintenance. IEEE Transactions on Software Engineering 17,8,751-761.
    92. Gupta R. and Soffa M. L.1996. Hybrid slicing:An approach for refining static slices using dynamic information. In Proceedings of Third ACM SIGSOFT Symposium on Foundations of Software Engineering, pp.
    93. Bates S. and Horwitz S.1993. Incremental program testing using dependence graphs. In Conference Record of the Twentieth Annual ACM SIGACT/SIGPLAN Symposium on Principles of Programming Languages, pp.384-396.
    94. Ryder B.G. and Landi W. A, and Stocks P.A, and Zhang S, and Altucher R. A schema for interprocedural modification side-effect analysis with pointer aliasing, ACM Trans. Program. Lang. Syst., volume 23, issue 2,2001, pages 105--186, ACM, New York, NY, USA.
    95. Allen, F. E.1974. Interprocedural data flow analysis. In Proceedings of 1974 IFIP Congress, Amsterdam, Holland, pp.398-402. Institute of Electrical and Electronics Engineers, Inc., North Holland Publishing Company.
    96. Barth J. M.1978. A practical interprocedural data flow analysis algorithm. Communications of the ACM 21,9,724-736.
    97. Banning J.1979. An efficient way to find the side effects of procedure calls and the aliases of variables. In Conference Record of the Sixth Annual ACM SIGACT/SIGPLAN Symposium on Principles of Programming Languages.pp.29-41.
    98. Cooper K, and Kennedy K.1987. Complexity of interprocedural side-effect analysis. Computer Science Department Technical Report TR87-61, Rice University.
    99. Cooper K, and Kennedy K.1988. Interprocedural side-effect analysis in linear time. In Proceedings of the SIGPLAN'88 Conference on Programming Language Design and Implementation, pp.57-66.
    100. Choi J. D, Burke M, and Carini P.1993. Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects. In Conference Record of the Twentieth Annual ACM SIGACT/SIGPLAN Symposium on Principles of Programming Languages, pp.232-245.
    101. Rountev A. Precise identification of side-effect-free methods in Java.In ICSM'04: Proceedings of the 20th IEEE International Conference on Software Maintenance, pages 82-91, Sept.2004.
    102. Salcianu A, and Rinard M. Purity and side effect analysis for Java programs. In VMCAI'05:Proceedings of the 6th International Conference on Verification, Model Checking, and Abstract Interpretation, volume 3385 of LNCS:Lecture Notes in Computer Science, pages 199-215, Jan.2005. http://jppa.sourceforge.net.
    103. Eckel B著,侯捷译。《lava编程思想》,机械工业出版社,2002年9月第一版,新华书店北京发行所发行。
    104. Burke M, Carini P, Choi J.D, andHind M. Flow-insensitive interprocedural aliasanalysis in the presence of pointers.Proceedings from the7th Workshop on Languages and Compilers for ParallelComputing.1994.
    105. Anatole Le, Lhotak O, and Hendren L. Using inter-procedural side-effect informationin JIT optimizations. In Proceedings of the 14th International Conference on CompilerConstruction (CC). Lecture Notes in Computer Science, vol.3443. Springer Verlag. (2005).287-304.
    106. Tkachuk O, and Dwyer M. B. Adapting side effects analysis for modular programmodel checking. In ESEC/FSE, pages 188-197, Sep.2003.
    107. Rountev A, Chandra S. Off-Line variable substitution for scaling points-to analysis. In: Monica L, ed. Proc. of the 2000 Conf. on Programming Language Design and Implementation. New York:ACM Press,2000.47-56.
    108. Hardekopf B, Lin C. The ant and the grasshopper:Fast and accurate pointer analysis for millions of lines of code. In:Jeanne F, ed. Proc. of the Programming Language Design and Implementation (PLDI). New York:ACM Press,2007.290-299.
    109. Mendez-Lojo M, Mathew A, Pingali K. Parallel inclusion-based points-to analysis. In: William C, ed. Proc. of the SPLASH, New York:ACM Press,2010.428-443.