Mining the key factors behind student performance and predicting students' examination scores
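  • English title: Mining the key factors behind student performance and predicting students' examination scores
  • Authors: XIE Juanying (谢娟英); ZHANG Yi (张宜); CHEN Enhong (陈恩红)
  • English authors and affiliations: XIE Juanying; ZHANG Yi; CHEN Enhong; School of Computer Science, Shaanxi Normal University; The Third Senior Middle School of Pucheng County; School of Computer Science and Technology, University of Science and Technology of China
  • Keywords: educational data mining; student performance analysis; density-based global K-means algorithm; association analysis; prediction analysis
  • English keywords: educational data mining; student performance analysis; density-based global K-means; association analysis; prediction analysis
  • Chinese journal title: NJXZ
  • English journal title: Journal of Nanjing University of Information Science & Technology (Natural Science Edition)
  • Affiliations: School of Computer Science, Shaanxi Normal University; The Third Senior Middle School of Pucheng County, Shaanxi Province; School of Computer Science and Technology, University of Science and Technology of China
  • Publication date: 2019-05-28
  • Publisher: Journal of Nanjing University of Information Science & Technology (Natural Science Edition)
  • Year: 2019
  • Issue: v.11; No.61
  • Funding: National Natural Science Foundation of China (61673251); Fundamental Research Funds for the Central Universities (GK201701006, GK201806013)
  • Language: Chinese
  • Pages: NJXZ201903011
  • Page count: 10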
  • CN:03
  • ISSN:32-1801/N
  • Classification number: 80-89
Abstract
To identify the key factors that influence academic performance, and thereby support students' learning, teachers' instruction, and school administration, the density-based global K-means algorithm is used to cluster two datasets: the student data for two Portuguese secondary schools from the UCI machine learning repository, and student data from the Third Senior Middle School of Pucheng County, Shaanxi Province. The cluster analysis is used both to mine the factors related to student performance and to predict examination scores. For the Portuguese data, student performance is strongly related to the school attended, place of residence, mother's education level, and whether the family has internet access; it is related to a lesser extent to father's education level, travel time to school, the wish to attend university, and being in a romantic relationship. For the Pucheng data, student performance is strongly related to the student's guardian, parents' age, parents' education level, learning attitude, and amount of after-class study. The clustering results for score prediction show that a student's upcoming examination scores can be predicted from the available ones and that the predicted scores agree with the actual scores. Taken together, the Portuguese and Chinese data show that student performance is closely related to parents' education level, and especially to the mother's: the higher the mother's education level, the better the student's performance. The companionship of parents as guardians while a child grows up should not be overlooked. Motivating and guiding students to set ambitious goals and to study on their own initiative is essential to their performance and development. Finally, it is imperative to narrow the education gap between urban and rural areas.
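The abstract names the density-based global K-means algorithm but does not describe it. As a rough illustration of the family it belongs to, the following is a minimal sketch of the plain global K-means procedure (grow the solution one center at a time, trying every data point as the starting position of the new center and keeping the run with the lowest sum of squared errors). It is not the authors' density-based variant. The function names and the toy data points are hypothetical and are not taken from either student dataset.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def kmeans(points, centers, max_iter=100):
    """Standard K-means iterations from the given starting centers.

    Returns (centers, labels, sse).
    """
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean(members)
    sse = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return centers, labels, sse


def global_kmeans(points, k_max):
    """Plain global K-means: add one center at a time, up to k_max centers."""
    centers = [mean(points)]  # the optimal single center is the overall mean
    for _ in range(2, k_max + 1):
        best = None
        for candidate in points:  # try each data point as the new center
            result = kmeans(points, centers + [candidate])
            if best is None or result[2] < best[2]:
                best = result
        centers = best[0]
    return kmeans(points, centers)


# Hypothetical toy data: two well-separated groups of 2-D points.
toy = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centers, labels, sse = global_kmeans(toy, 2)
print(labels)
```

Trying every point as a candidate makes the result deterministic, at the cost of one K-means run per data point for each added center. Variants such as the one named in the abstract aim to reduce or refine that candidate set, but their details are outside what this record states.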
