Research on Several Key Issues in Software Quality Prediction Models
Abstract
With the rapid development of the software industry, software systems keep growing in scale, their development processes become ever harder to control, and the losses caused by software failures grow increasingly severe. Software quality prediction systems emerged in response to this situation: they aim to identify fault-prone modules early in the software development life cycle, so that resources in subsequent development and system testing can be allocated to reduce the number of software failures as far as possible. Practice has shown that applying such a system can effectively shorten the development cycle, reduce maintenance costs, and at the same time improve software quality. Based on three large commercial telecommunications software products developed by Lucent Technology Optical Networks Ltd., this dissertation focuses on several key issues in software quality prediction modeling: feature selection, classifier ensembles, and rule extraction.
     The dissertation first proposes a framework that divides a software quality prediction system into three parts: a front end, a core, and a back end, and it clarifies the main task of each part. The front end performs data preprocessing and feature selection; the preprocessed, feature-selected data are then split into a training set and a testing set and passed to the core. In the core, a software quality prediction model is trained on the training set with the chosen algorithm and tested on the testing set; the core produces a description of the trained model together with its predictions on the testing set. The main tasks of the back end are to extract rules from the model description and to compare and evaluate models according to their predictions on the testing set. A minimal sketch of this three-stage pipeline is given below.
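To make the three-stage division concrete, here is a minimal Python sketch of the pipeline. The scikit-learn estimators, the toy variance-based feature filter, and the split ratio are illustrative assumptions standing in for the dissertation's own preprocessing, CFGA feature selection, training algorithms, and rule extraction.

    # A minimal sketch of the front end / core / back end pipeline (assumed
    # scikit-learn stand-ins, not the dissertation's actual implementation).
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.tree import DecisionTreeClassifier, export_text
    from sklearn.metrics import classification_report

    def front_end(X):
        # Front end: preprocess the software metrics and select features.
        X = StandardScaler().fit_transform(X)        # data preprocessing
        keep = X.var(axis=0) > 1e-8                  # toy filter standing in for CFGA
        return X[:, keep]

    def core(X, y):
        # Core: train the prediction model and test it on held-out modules.
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
        model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)
        return model, model.predict(X_te), y_te      # model description + predictions

    def back_end(model, y_pred, y_true):
        # Back end: extract rules from the model and evaluate its predictions.
        rules = export_text(model)                   # human-readable decision rules
        report = classification_report(y_true, y_pred)
        return rules, report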
     Second, for the front end, the dissertation proposes a clustering and feature selection method based on a genetic algorithm (CFGA). Clustering and feature selection are carried out within a single evolutionary process: the quality of the clustering serves as the fitness function driving the feature selection, and its parameters can be tuned to make the resulting clusters more compact or looser. The method applies both when no labeled historical data are available and when they are. In the unlabeled case, the clustering results are further analyzed by software domain experts, and the clustered, feature-reduced data set greatly lowers the experts' workload. In the labeled case, the resulting clusters are divided into three categories: outlier clusters, high-purity clusters, and low-purity clusters. The outliers …
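As a rough illustration of how clustering quality can drive the evolutionary search, the following hedged sketch shows a CFGA-style loop: each chromosome is a binary feature mask, and fitness rewards between-cluster spread while an adjustable weight alpha penalizes within-cluster scatter, so a larger alpha favors more compact clusters. KMeans, the fitness terms, and all constants here are assumptions for illustration, not the dissertation's actual formulation.

    # A hedged sketch of a CFGA-style genetic search: chromosomes are binary
    # feature masks; fitness is the clustering quality on the selected
    # features, with `alpha` trading cluster compactness against spread.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    def fitness(mask, X, k=3, alpha=1.0):
        if mask.sum() < 2:
            return -np.inf                              # need at least two features
        Xs = X[:, mask.astype(bool)]
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Xs)
        compact = km.inertia_ / len(Xs)                 # mean within-cluster scatter
        spread = km.cluster_centers_.var(axis=0).sum()  # between-cluster spread
        # Larger alpha favors tighter clusters; smaller alpha allows looser ones.
        return spread - alpha * compact - 0.01 * mask.sum()

    def evolve(X, pop_size=20, n_gen=30):
        pop = rng.integers(0, 2, size=(pop_size, X.shape[1]))
        for _ in range(n_gen):
            scores = np.array([fitness(ind, X) for ind in pop])
            parents = pop[np.argsort(scores)[-pop_size // 2:]]  # keep best half
            children = parents.copy()
            children[rng.random(children.shape) < 0.05] ^= 1    # bit-flip mutation
            pop = np.vstack([parents, children])
        return pop[np.argmax([fitness(ind, X) for ind in pop])].astype(bool)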
