Application of Data Mining in Bank Credit Business
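Abstract
Data mining is an emerging interdisciplinary field that draws on theory and techniques from artificial intelligence, machine learning, statistics, and other areas. It can extract information and knowledge that is hidden in large volumes of historical data yet potentially useful to an enterprise. Applied correctly, it provides strong support for business decision-making.
     Data mining technology has matured steadily and is now widely applied in finance, telecommunications, insurance, electric power, and many other fields, with fruitful results.
     This paper applies data mining technology to propose a design and implementation method for a credit analysis system based on data mining. Through concrete data mining experiments, it mines credit business data and interprets and evaluates the mining results, demonstrating the feasibility and effectiveness of the mining model.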
In the past, constrained by limited data processing methods, database capacity, and computing speed, the branches of domestic commercial banks each maintained their own customer information and credit information databases and reported raw data upward after only a simple statistical summary. Such outdated analytical methods and tools can give senior management only superficial credit business data on which to base decisions. Because managers cannot fully grasp internal and external information and information is not exchanged, they cannot correctly evaluate the risk of credit assets, which leads to wrong decisions.
     At present, China's banking industry runs a variety of information systems, most of which handle counter services: savings systems, accounting systems, credit card systems, and so on. Some banks are developing comprehensive counter systems that integrate all kinds of counter business, with a focus on improving the efficiency of business operations. A comparison of the systems across banks shows that their content, models, and basic functions are the same, and that at most the chosen hardware and software platforms differ, so much of the large capital investment made by the banks amounts to duplicated construction. No bank has outsourced its transaction processing systems or broken free of the constraints of transaction processing, and banks pay little or no attention to querying and analyzing existing customer information in order to identify potentially useful information. For a long time, banks have remained at the scale-benefit stage, pursuing returns by expanding and by seizing branch locations. Yet after setting up organizations and branches and adding staff, they have found that actual returns fall short of expectations. As the number of domestic financial institutions grows and foreign banks compete for China's market, competition is intensifying and scale expansion is no longer an effective management tool. Major banks will have to turn their attention to mining and reusing information in pursuit of depth benefits.
Banks must shift from indiscriminate hardware investment to purposeful software investment. They must focus on customer relationships, customer value, and risk management rather than on transaction volume, maintain long-term relationships with key customers, and attract and retain a specific client base. They must also focus on customer orientation and customer information analysis rather than on cast-the-net business expansion, use analytical tools and experience to select distinct customer segments accurately, and sell different banking products and services to different customers in a targeted way. Two questions have therefore become urgent in the informatization of domestic commercial banks: how to make effective use of such a large quantity of business data to give intelligent support to business decision-making, and how to build application systems, such as business analysis and forecasting, on top of the processed information so that bank staff receive accurate and efficient decision support. More and more people have recognized these problems and have devoted considerable thought and research to data mining applications in banking.
     The rise and development of data mining technology has provided a new starting point for the informatization of the banking industry. Data mining uses cluster analysis, neural networks, decision trees, and other techniques from artificial intelligence and advanced statistics to extract previously unknown, potentially useful information, patterns, and trends from large amounts of data. Management based on data mining is already widely used in many advanced enterprises. Because bank products are largely homogeneous, the difference among banks often lies in which bank controls the customer relationship, understands the unique business rules hidden in its vast business and customer information, and can therefore make decisions scientifically. This is precisely the problem that data mining technology addresses, and the combination of data warehousing and data mining has become the research focus for solving it.
     This article describes the concepts of data mining and data warehousing and the main techniques and methods they use; analyzes the current state of data management and application in China's banking industry; summarizes approaches to building a subject-oriented banking data warehouse; and, against the requirements of the China Construction Bank Hongshan branch, discusses how to build system models for customer classification, risk prediction, and performance evaluation. The data warehouse is the foundation for performance evaluation, customer classification, and risk prediction. This article uses the Microsoft SQL Server 2000 and Analysis Services data warehouse solution. In the customer classification and risk prediction system, we use the MS Decision Tree method provided by Analysis Services to generate a decision tree, from which a simple prediction strategy can be derived. Applying this rule to new customers quickly yields a rough classification, which is then used to predict risk. In addition, the article discusses decision tree generation and pruning algorithms and analyzes the advantages and disadvantages of the general decision tree method and of some improved algorithms. Multi-dimensional sequential pattern mining can analyze and forecast a customer's behavior sequence patterns from the customer's past transaction records and basic information, and thereby also serves the purpose of risk prediction. This paper presents a PFP-tree-based multi-dimensional sequential pattern mining method derived from the FP-tree algorithm, and proves the effectiveness, compactness, and completeness of the PFP-tree method. In the customer profitability system, we use an OLAP data warehouse approach organized around the subjects of customer manager and branch.
This method is based on the multi-dimensional data sets of the data warehouse, for which Analysis Services provides "drill-down", "roll-up", and a series of other data analysis tools.
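
The decision-tree step described in the abstract (generate a tree from historical customers, then apply it as a rule to a new customer) can be illustrated with a minimal sketch. This is not the MS Decision Tree algorithm of Analysis Services; it is a plain ID3-style tree that chooses each split by information gain, used here as a stand-in. The attribute names (`income`, `overdue`, `collateral`), the `risk` label, and the eight sample rows are hypothetical and invented purely for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy reduction obtained by splitting rows on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def build_tree(rows, attrs, target):
    """Recursively build an ID3-style tree as nested dicts; leaves are labels."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    node = {"attr": best, "default": majority, "children": {}}
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        node["children"][value] = build_tree(subset, rest, target)
    return node


def classify(tree, record):
    """Walk the tree; fall back to the node's majority label on unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(record.get(tree["attr"]), tree["default"])
    return tree


# Hypothetical historical customers (illustrative data only).
rows = [
    {"income": "high", "overdue": "no",  "collateral": "yes", "risk": "low"},
    {"income": "high", "overdue": "no",  "collateral": "no",  "risk": "low"},
    {"income": "low",  "overdue": "no",  "collateral": "yes", "risk": "low"},
    {"income": "low",  "overdue": "no",  "collateral": "no",  "risk": "high"},
    {"income": "high", "overdue": "yes", "collateral": "yes", "risk": "high"},
    {"income": "low",  "overdue": "yes", "collateral": "no",  "risk": "high"},
    {"income": "low",  "overdue": "yes", "collateral": "yes", "risk": "high"},
    {"income": "high", "overdue": "yes", "collateral": "no",  "risk": "high"},
]

tree = build_tree(rows, ["income", "overdue", "collateral"], "risk")

# Rough classification of a new customer using the learned rule.
new_customer = {"income": "low", "overdue": "no", "collateral": "no"}
print(tree["attr"])                  # attribute chosen at the root
print(classify(tree, new_customer))  # predicted risk class
```

On this toy data the root split falls on `overdue`, since that attribute separates the high-risk customers most cleanly. A production system would also need the pruning step that the abstract mentions, which this sketch omits.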