Research on a Mixed Model for Personal Credit Scoring
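Abstract
With the rapid development of China's economy, personal consumer credit business of all kinds has expanded quickly. However, domestic commercial banks currently manage retail risk at a low level, with relatively backward management tools and techniques, and have not built an effective, automated risk management system based on personal credit scoring models; this has seriously hindered the development of personal consumer credit. Developing a credit scoring method that can effectively reduce personal credit risk is therefore of great importance to social and economic development. The mixed personal credit scoring model built in this thesis can effectively reduce the personal credit risk of commercial banks and better serve the goal of maximizing bank profit.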
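     The thesis is organized as follows:
     Chapter 1 is the introduction. It sets out the research background and significance of the problem, discusses the importance of personal credit scoring systems in controlling consumer credit risk, outlines the development and current state of credit scoring in China and abroad, and summarizes existing theoretical results.
     Chapter 2 describes in detail three classification methods for building credit scoring models: Logistic regression, classification trees, and the random forest algorithm. Each is representative of its class: Logistic regression is the parametric statistical method most widely used by commercial banks today, the classification tree is the most widely used nonparametric method, and the random forest is one of the more successful algorithms in data mining.
     Chapter 3 studies how to validate a personal credit scoring model, that is, how to judge whether a model is effective, and lists three methods commonly used in theory and in practice.
     Chapter 4 applies the three classification methods from Chapter 2 to real credit data. The results show that all three can be used effectively for personal credit scoring.
     Chapter 5 builds the mixed personal credit scoring model. The classification tree is first used to obtain interaction terms among the feature variables; these terms are then introduced into the Logistic regression model to give a complete Logistic regression model. The random forest algorithm supplies the importance of each feature variable, which provides a basis for variable selection.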
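     The main contributions of the thesis are: (1) introducing the random forest algorithm into personal credit scoring and testing its predictive power empirically; (2) building a mixed personal credit scoring model in which the classification tree identifies interaction terms among feature variables, and these terms are introduced into the Logistic regression model to give a complete regression equation.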
With the rapid development of China's financial industry, the scale of consumer credit of all kinds is expanding quickly. However, domestic commercial banks manage retail risk at a low level, rely on relatively backward management means and methods, and lack an effective personal credit evaluation method; together these factors have severely hindered the development of personal consumer credit.
     Therefore, developing a personal credit scoring method that suits Chinese conditions and effectively lowers credit risk is very important for social and economic development. The mixed personal credit scoring model studied here serves that goal: it can effectively lower the credit risk of commercial banks and help maximize bank profits.
     Chapter 1 gives a brief introduction to credit scoring and to previous research. Chapter 2 covers three individual methods used to build the personal credit scoring model. Chapter 3 analyzes the concepts and methodologies for evaluating the predictive power of a credit scoring model. In Chapter 4, an empirical analysis of each method from Chapter 2 is conducted on real-world credit data, and the error rate of each method is calculated. Chapter 5 then considers a mixed model that combines the Logistic model with the decision tree, in which the decision tree is used to detect interactions for the Logistic model. An empirical analysis is also carried out to show that such interactions exist in the model, so the mixed model achieves its goal of detecting interactions with the decision tree.
     The main contributions of this thesis are the introduction of the random forest method to credit scoring, with good empirical results, and the construction of a mixed Logistic and decision tree model for managing credit risk. The conclusion is that the decision tree can detect interactions for the Logistic model.
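The Chapter 5 idea (a tree suggests an interaction term, which is then added to a Logistic regression) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the thesis's implementation or data: the toy data set, the depth-2 Gini tree, the plain gradient-descent fit, and all function names are hypothetical. The random forest step for variable importance is not shown.

```python
import math
from itertools import product

def make_data():
    """Hypothetical toy data: 3 binary features; default is likely only when x1 AND x2 are 1."""
    rows = []
    for x1, x2, x3 in product((0, 1), repeat=3):
        n_bad = 9 if (x1 and x2) else 1  # 9 of 10 default in the risky cell, otherwise 1 of 10
        for i in range(10):
            rows.append(((x1, x2, x3), 1 if i < n_bad else 0))
    return rows

def gini(rows):
    """Gini impurity of a set of (features, label) rows with binary labels."""
    if not rows:
        return 0.0
    p = sum(y for _, y in rows) / len(rows)
    return 2.0 * p * (1.0 - p)

def best_split(rows):
    """Return (feature index, Gini gain, left rows, right rows) for the best binary split."""
    best = (None, 0.0, rows, [])
    for j in range(len(rows[0][0])):
        left = [r for r in rows if r[0][j] == 0]
        right = [r for r in rows if r[0][j] == 1]
        if not left or not right:
            continue  # feature is constant in this node
        child = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        gain = gini(rows) - child
        if gain > best[1]:
            best = (j, gain, left, right)
    return best

def tree_interaction(rows):
    """Depth-2 tree: the root feature and the strongest second-level feature form the candidate pair."""
    root, _, left, right = best_split(rows)
    second = max((best_split(left), best_split(right)), key=lambda s: s[1])
    return root, second[0]

def predict(w, xi):
    """Logistic probability for one row; w[0] is the intercept."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=1.0, steps=2000):
    """Plain batch gradient descent on the mean log-loss (a stand-in for a proper solver)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def log_loss(w, X, y):
    """Mean negative log-likelihood of the fitted model on (X, y)."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict(w, xi)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)

rows = make_data()
a, b = tree_interaction(rows)                          # step 1: tree proposes an interaction
y = [label for _, label in rows]
X_main = [list(x) for x, _ in rows]                    # main effects only
X_mixed = [list(x) + [x[a] * x[b]] for x, _ in rows]   # step 2: add the product term
loss_main = log_loss(fit_logistic(X_main, y), X_main, y)
loss_mixed = log_loss(fit_logistic(X_mixed, y), X_mixed, y)
print("candidate interaction (feature indices):", (a, b))
print("log-loss, main effects only:", round(loss_main, 4))
print("log-loss, with tree-derived interaction:", round(loss_mixed, 4))
```

On this toy data the tree splits on the first two features, and the Logistic model that includes their product fits better than the main-effects model. In practice a library implementation of trees and regression, evaluated on held-out data, would replace these hand-written routines.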