Application of Data Mining in Banking Business
Abstract
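This thesis is mainly concerned with the design and implementation of a bank value-added services project. By analyzing value-added service information around three subjects (customers, products, and competitors), we build a data warehouse for bank value-added services. The warehouse focuses on decision support for analysts and senior managers, so that they can grasp the enterprise's operating status accurately and promptly, understand market demand, and formulate sound business plans.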
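     This thesis proposes a new data analysis method for data mining. The method combines rough set theory with the multiple linear regression model from probability and statistics and, after certain improvements, brings out the strengths of both. Applying it to a large volume of banking transaction records, we not only discover business rules but also reprocess them with an influence operator to derive intuitive parametric functions, which in turn yield analytical information and a basis for decision-making.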
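     In the data mining stage, the improved algorithm adds a frequency attribute as a parameter and builds a decision table with a frequency attribute. The frequency attribute F records the number of times an object x occurs in the knowledge base. It takes positive integer values and belongs to neither the condition attribute set nor the decision attribute set. The algorithm performs a multi-step recursive computation over the collection of feature sets, discards the small feature sets, extracts the large feature sets [X_k] that satisfy the conditions, and adds the rule [X_k]→Y[Sup][Con] to the rule set. At the same time, the candidate feature sets that meet the support threshold are recombined to generate feature sets with more elements, thereby producing decision rules that reflect decision preference information.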
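To make the decision table with a frequency attribute more concrete, the following Python sketch collapses duplicate records into distinct rows with a count F and then scores one candidate rule X → Y. It is only an illustration under stated assumptions: Sup and Con are assumed to denote support and confidence as in association rule mining, which the abstract does not state explicitly, and the attribute names, sample records, and function names are hypothetical rather than taken from the thesis.

```python
from collections import Counter

def build_frequency_table(records, condition_attrs, decision_attr):
    """Collapse identical records into rows keyed by (condition values, decision value).

    The count stored for each distinct row plays the role of the frequency
    attribute F: a positive integer kept outside both attribute sets.
    """
    table = Counter()
    for rec in records:
        cond = tuple(rec[a] for a in condition_attrs)
        table[(cond, rec[decision_attr])] += 1
    return table

def rule_support_confidence(table, condition_attrs, pattern, decision_value):
    """Score the rule pattern -> decision_value with frequency-weighted counts.

    pattern maps a subset of the condition attributes to required values.
    Support (assumed meaning of Sup): objects matching pattern and decision / all objects.
    Confidence (assumed meaning of Con): objects matching pattern and decision / objects matching pattern.
    """
    total = sum(table.values())
    index = {a: i for i, a in enumerate(condition_attrs)}
    matched = 0
    matched_with_decision = 0
    for (cond, decision), freq in table.items():
        if all(cond[index[a]] == v for a, v in pattern.items()):
            matched += freq
            if decision == decision_value:
                matched_with_decision += freq
    support = matched_with_decision / total
    confidence = matched_with_decision / matched if matched else 0.0
    return support, confidence

# Hypothetical banking records, invented for illustration only.
records = [
    {"income": "high", "age": "young", "buys": "yes"},
    {"income": "high", "age": "young", "buys": "yes"},
    {"income": "high", "age": "old", "buys": "no"},
    {"income": "low", "age": "young", "buys": "no"},
]
conds = ["income", "age"]
table = build_frequency_table(records, conds, "buys")
sup, con = rule_support_confidence(table, conds, {"income": "high"}, "yes")
```

A rule would be kept as a large feature set only if its support reaches a chosen threshold. The recombination of surviving candidates into larger feature sets and the rule reduction step are not shown here.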
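     The new algorithm then reduces the generated decision rules with the frequency attribute to obtain the simplest decision rules. These rules are used as samples for multiple linear regression analysis and are reprocessed with the influence operator to build a corresponding multiple linear regression model. From the samples and the regression model, a locally optimal regression subset is found, and a new multiple linear regression model is built from it. Finally, the least squares method is used to estimate the unknown regression coefficients of the new model, which yields the intuitive parametric functions $A_{0i}=\beta_0+\beta_1A_1+\beta_2A_2+\dots+\beta_pA_p=f_i(A_1,\dots,A_p)$, $i=1,2,\dots,p$. Users can conveniently use this set of functions to review the basis for decisions and to obtain analytical information intuitively.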
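The least squares step can be sketched in plain Python by solving the normal equations (X^T X)β = X^T y, which is one standard way to estimate β_0, …, β_p. This is a minimal sketch, not the thesis's implementation: the sample rows stand in for numerically encoded decision rules and are invented for illustration, the function names are hypothetical, and the influence operator and the search for a locally optimal regression subset are not covered because the abstract does not define them.

```python
def fit_least_squares(samples, targets):
    """Estimate [beta_0, ..., beta_p] by solving (X^T X) beta = X^T y."""
    # Design matrix with a leading 1.0 so that beta_0 acts as the intercept.
    X = [[1.0] + [float(v) for v in row] for row in samples]
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; the regressors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

def predict(beta, row):
    """Evaluate f(A_1, ..., A_p) = beta_0 + beta_1*A_1 + ... + beta_p*A_p."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))

# Hypothetical samples (A_1, A_2) generated exactly from y = 1 + 2*A_1 + 3*A_2.
samples = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
targets = [1, 3, 4, 6, 8]
beta = fit_least_squares(samples, targets)
estimate = predict(beta, (3, 2))
```

Because the invented targets lie exactly on a plane, the fitted coefficients recover the generating values up to floating-point error. With real rule samples the fit would minimize the sum of squared residuals instead.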
