Research on the Application of Data Mining Technology in Adult College Management

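English title: Research on the Application of Data Mining on Management of Adult College
Author: Song Xianghong (宋向红)
Thesis level: Master's
Discipline: Computer Application Technology
Keywords: data mining; decision tree; association rule; clustering
Degree year: 2011
Supervisor: Shi Lei (石磊)
Discipline code: 081203
Degree-granting institution: Zhengzhou University (郑州大学)
Submission date: 2011-04-01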

Abstract

As data mining techniques have matured and their application areas have expanded, researchers at a number of regular colleges have applied them to college management and reached useful conclusions. Adult colleges differ from regular colleges in student sources, teaching modes, and management methods, so results obtained at regular colleges cannot be applied to adult colleges directly. It is therefore worthwhile to use data mining to study the student sources, teaching evaluation records, and examination results of adult colleges.
     First, the decision tree classification algorithm ID3 was used to analyze the student source data of Pingdingshan Institute of Education from previous years and to generate classification rules. The results indicate that the institute's adult education students are mainly teachers who are young and have low or middling incomes, so adding new specialties and widening the range of student sources is imperative.
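To make the ID3 step concrete, the following is a minimal sketch of the algorithm (choose the attribute with the highest information gain, split, recurse). The records, the attribute names `age` and `income`, and the target `enrolled` are invented for illustration and are not the thesis data.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, target):
    """Entropy reduction obtained by splitting rows on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def id3(rows, attrs, target):
    """Return a nested-dict decision tree; leaves are class labels."""
    labels = [r[target] for r in rows]
    # Stop when the node is pure or no attributes remain (majority vote).
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {best: {v: id3([r for r in rows if r[best] == v], rest, target)
                   for v in sorted({r[best] for r in rows})}}

# Hypothetical records (assumption): (age, income, enrolled).
raw = [
    ("young", "low", "yes"), ("young", "medium", "yes"),
    ("young", "high", "no"), ("older", "low", "no"),
    ("older", "medium", "no"), ("older", "high", "no"),
    ("young", "low", "yes"), ("young", "medium", "yes"),
]
rows = [dict(zip(("age", "income", "enrolled"), r)) for r in raw]
tree = id3(rows, ["age", "income"], "enrolled")
print(tree)
```

Each root-to-leaf path of the resulting tree reads as one classification rule, for example "age = young and income = low implies enrolled = yes" on this toy data.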
     Second, the Apriori association rule algorithm was used to mine the students' evaluations of their teachers, producing strong association rules. The results indicate that adult college students are relatively mature, and that experienced senior teachers, teachers holding the title of associate professor, and knowledgeable teachers with a master's degree receive comparatively good evaluations.
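The Apriori step can be sketched in the same spirit: find all itemsets whose support meets a threshold, then keep the rules whose confidence meets a second threshold. The transactions, the item names, and both thresholds below are assumptions chosen for illustration, not values from the thesis.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([i]) for t in transactions for i in t}
    frequent, k = {}, 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join: merge frequent (k-1)-itemsets into candidate k-itemsets.
        current = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in current
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def strong_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) for every strong rule."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[lhs]  # lhs is frequent by downward closure
                if conf >= min_confidence:
                    rules.append((set(lhs), set(itemset - lhs), sup, conf))
    return rules

# Hypothetical evaluation records (assumption): teacher attributes plus rating.
transactions = [
    {"title=assoc_prof", "degree=master", "rating=good"},
    {"title=assoc_prof", "degree=bachelor", "rating=good"},
    {"title=lecturer", "degree=master", "rating=good"},
    {"title=lecturer", "degree=bachelor", "rating=fair"},
    {"title=assoc_prof", "degree=master", "rating=good"},
    {"title=lecturer", "degree=bachelor", "rating=fair"},
]
frequent = apriori(transactions, min_support=0.3)
rules = strong_rules(frequent, min_confidence=0.8)
for lhs, rhs, sup, conf in rules:
    print(sorted(lhs), "->", sorted(rhs), f"support={sup:.2f} confidence={conf:.2f}")
```

A rule is "strong" here only in the sense that it passes both thresholds; other thresholds would give a different rule set.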
     Finally, the k-means clustering algorithm was applied to the examination results, yielding the cluster centers and the number of cases in each cluster. The results indicate that when the rates of excellent, good, average, pass, and fail grades follow a normal distribution, the teaching has been effective and the students have mastered the course content well.
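For the clustering step, the sketch below runs plain k-means (Lloyd's iteration) and reports the two outputs named above: the cluster centers and the number of cases in each cluster. The scores, the choice of three clusters, and the starting centers are assumptions for illustration; the thesis data and settings may differ.

```python
def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm; points and centers are lists of equal-length tuples."""
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else c
            for members, c in zip(clusters, centers)
        ]
        if new_centers == centers:  # assignments are stable, so stop
            break
        centers = new_centers
    return centers, [len(members) for members in clusters]

# Hypothetical one-dimensional exam scores (assumption).
scores = [(95,), (91,), (88,), (76,), (74,), (72,), (70,), (58,), (52,)]
centers, counts = kmeans(scores, centers=[(90,), (75,), (55,)])
print("cluster centers:", centers)
print("cases per cluster:", counts)
```

The result depends on the starting centers, so in practice the algorithm is usually run from several initializations and the best clustering is kept.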
