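Research on Personal Credit Evaluation Based on Support Vector Machine (SVM) Theory

English title: Personal Credit Evaluation Based on Support Vector Machine (SVM)
Author: Zhang Kun (张坤)
Degree level: Master's
Discipline: Management Science and Engineering
Keywords: Data mining; Support vector machine; Rough set; Credit assessment
Degree year: 2011
Advisor: Shao Liangshan (邵良杉)
Discipline code: 1201
Degree-granting institution: Liaoning Technical University (辽宁工程技术大学)
Thesis submission date: 2010-12-01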

Abstract

The risk faced by financial institutions stems mainly from credit risk. With the number of personal loan applications growing daily, banks urgently need an effective risk prevention mechanism. Building on the integrated information platform of Qilu Commercial Bank, this thesis examines personal credit risk assessment within that platform in depth and proposes a personal credit evaluation system model based on rough sets and support vector machines.

Data mining draws on databases, artificial intelligence, and mathematical statistics, and is a technique for quickly extracting useful information from large volumes of complex data. Classification is one of its most common applications: a classifier trained on experimental data is used to predict the class of previously unseen data.

The support vector machine (SVM) is a machine learning method developed in recent years on the foundation of statistical learning theory, and it has strong generalization ability. Its core idea is to use a kernel function mapping to turn a complex classification task into the problem of constructing a linear separating hyperplane in a high-dimensional feature space. The SVM is well suited to two-class problems, and the models it learns are relatively stable.

This thesis studies the principles and algorithms of the SVM. For SVM training on large-scale data sets, the strengths and weaknesses of the available algorithms are compared, and an improved sequential minimal optimization (SMO) algorithm is selected to speed up learning in the SVM-based personal credit evaluation model. The complete SVM-based bank personal credit evaluation system model is then described. Finally, the SVM is applied to personal credit evaluation, and the experiments indicate that the resulting model performs well.
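The kernel idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's model: the degree-2 polynomial kernel, the XOR-style toy data, the hand-derived dual coefficients, and all function names are assumptions chosen only for illustration. The four points cannot be separated by a line in the original two-dimensional space, but the kernel corresponds to an explicit feature map in which a linear hyperplane separates them.

```python
import math

def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel K(x, z) = (x . z)^2 (assumed kernel)."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def phi(x):
    """Explicit feature map whose inner product equals poly2_kernel on 2-D inputs."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    """Plain inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical XOR-style toy data: not linearly separable in the input space.
X = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
y = [1, 1, -1, -1]

# Dual coefficients and bias worked out by hand for this toy set:
# every point lies on the margin, and sum(alpha_i * y_i) == 0.
ALPHA = [0.125, 0.125, 0.125, 0.125]
B = 0.0

def decision(x):
    """SVM decision value f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * yi * poly2_kernel(xi, x) for a, yi, xi in zip(ALPHA, y, X)) + B

def predict(x):
    """Class label in {+1, -1} from the sign of the decision value."""
    return 1 if decision(x) >= 0 else -1

if __name__ == "__main__":
    # The kernel value matches the inner product in the explicit feature space.
    print(poly2_kernel((2, 3), (-1, 4)), dot(phi((2, 3)), phi((-1, 4))))
    # All four training points are classified correctly.
    print([predict(x) for x in X])
```

The classifier never computes `phi` during prediction. It only evaluates the kernel against the training points, which is what makes the high-dimensional hyperplane practical to construct.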

