Research on the Application of Decision Tree Technology in Student Grade Analysis
Abstract
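With the rapid development of database technology and the widespread use of database management systems, the ability to collect data has improved greatly, and large volumes of data have accumulated. Behind these data lies much important and valuable information. Data mining technology arose from the need to analyze such data at a higher level and to extract this latent information to guide future work and life. Data mining is the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large volumes of incomplete, noisy, fuzzy, and random real-world application data. Over the past decade or so, data mining has been studied extensively and applied successfully in fields such as business, finance, and medicine, but its application in teaching remains comparatively rare.
     Year after year of expanded university enrollment has sharply increased the number of students on campus, creating many problems that affect teaching quality. Taking the author's own institution as an example, this thesis proposes a research plan that applies decision tree technology to mine the valuable information hidden behind student grades, with the aim of giving teachers an important basis for decisions in their future teaching.
     Decision tree technology is a principal technique for classification and prediction in data mining; it infers classification rules, expressed in the form of a decision tree, from a set of unordered, irregular instances. Compared with other classification methods, the decision tree method has the advantages of being understandable, easy to train, easy to implement, and generally applicable, so this thesis applies decision tree technology to the analysis of student grades.
     Based on the current state of data mining research, the author applies decision tree technology to the analysis of student grades in order to improve teaching quality. The thesis covers the following main areas of research: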
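     1. An in-depth study and discussion of the fundamentals of data mining. Building on an introduction to the basic concepts of data mining, the thesis classifies, organizes, and summarizes in detail the objects of data mining and the patterns it can discover, and analyzes commonly used data mining techniques.
     2. Analysis and study of decision tree technology. Chapter 3 analyzes in detail the specific steps for mining information from data with decision tree technology; summarizes, analyzes, and studies the main decision tree algorithms and their basic ideas; and compares the differences among the algorithms objectively.
     3. An analysis of the shortcomings of existing approaches to student grade analysis. In view of the importance and role of student grade analysis, the thesis sets out the significance of applying data mining technology to it.
     4. A complete account of the entire process of applying decision tree technology to mining student grade data. The content of Chapter 5 is the core of the thesis. Data were collected by questionnaires and other means and then preprocessed; the C4.5 algorithm was used to generate a decision tree model for student grade analysis, from which classification rules were derived.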
With the development of database technology and the widespread application of database management systems, the capability to collect data has improved rapidly, and large amounts of data have been accumulated. A great deal of valuable information lies hidden behind these data. Data mining technology was introduced in order to analyze these data at a higher level and obtain the latent information to guide work and life. Data mining is the process of extracting useful information and knowledge from massive, incomplete, and random data. Data mining has been widely studied in recent years and has been applied successfully in many domains, such as business, finance, and medical treatment. However, it has seldom been applied in teaching.
     Because of the sharp increase in the number of enrolled students, universities face many problems that can affect the quality of teaching. To provide reliable suggestions for teachers, this dissertation, taking the author's university as an example, puts forward a research project that uses a decision tree to extract the valuable information behind students' results.
     Decision tree technology, the main technology for classification and prediction in data mining, infers classification rules in the form of a decision tree from a group of unordered, irregular examples. Compared with other classification methods, the decision tree is easy to understand, easy to train, and easy to implement. Therefore, decision tree technology was used in this research on student result analysis.
     Based on the research background of data mining, the decision tree was applied to result analysis for the purpose of improving teaching quality. This dissertation completes the following work:
     1. Research into and discussion of the basic knowledge of data mining. Based on an introduction to the elementary concepts of data mining, the objects of data mining and the discoverable patterns are classified and summarized in detail. In addition, some common data mining technologies are analyzed.
     2. Decision tree technology is discussed in detail. The third chapter analyzes the concrete steps of mining data, discusses and summarizes in detail the main algorithms and basic ideas of the decision tree, and compares the differences between the algorithms objectively.
     3. The shortcomings of present student result analysis are pointed out. In view of the importance and function of student result analysis, the significance of applying data mining to the analysis is set out.
     4. The entire procedure of applying decision tree technology to mining student results is presented in full. The fifth chapter is the core of this thesis. Data were collected by questionnaires and other means and then preprocessed; the C4.5 algorithm was used to generate a decision tree, from which the classification rules were obtained.
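The attribute-selection step inside C4.5 can be illustrated with a minimal sketch. C4.5 chooses split attributes by gain ratio (information gain divided by split information). The records, the attribute names (`attendance`, `interest`), and the grade labels below are invented for illustration and are not the thesis's survey data; the sketch also omits the rest of the algorithm, such as continuous attributes, missing values, and pruning.

```python
from collections import Counter
from math import log2

# Hypothetical survey records, invented for illustration only.
RECORDS = [
    {"attendance": "high", "interest": "high", "grade": "good"},
    {"attendance": "high", "interest": "low", "grade": "good"},
    {"attendance": "low", "interest": "high", "grade": "poor"},
    {"attendance": "low", "interest": "low", "grade": "poor"},
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(records, attribute, target="grade"):
    """Information gain of splitting on `attribute`, divided by split information."""
    total = len(records)
    # Group the target labels by the value each record has for `attribute`.
    groups = {}
    for record in records:
        groups.setdefault(record[attribute], []).append(record[target])

    base = entropy([record[target] for record in records])
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = -sum(len(g) / total * log2(len(g) / total) for g in groups.values())

    gain = base - remainder
    # An attribute with a single value carries no split information.
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(records, attributes, target="grade"):
    """Pick the attribute with the highest gain ratio as the next split."""
    return max(attributes, key=lambda a: gain_ratio(records, a, target))


if __name__ == "__main__":
    for name in ("attendance", "interest"):
        print(name, round(gain_ratio(RECORDS, name), 3))
    print("split on:", best_attribute(RECORDS, ["attendance", "interest"]))
```

In this toy data `attendance` separates the two grade classes perfectly, so it would be chosen as the root split. Each root-to-leaf path of the finished tree then reads as one if-then classification rule.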
