Data Mining Techniques for Behavioral Big Data
Abstract
In the Internet age, the rise of social networks and the emergence of portable mobile devices have made it possible to record online and offline behavior in real time. Through automated data collection, we can obtain real-time, massive, multi-form, and authentic behavioral data, a valuable resource for researchers who observe and predict the psychological and behavioral characteristics and patterns of individuals or groups. With data mining techniques and machine learning algorithms, the correlational value latent in big data can be exploited effectively, so that implicit psychological characteristics and behavior patterns can be mined from the data more comprehensively, objectively, and efficiently, broadening and deepening psychological research. From a computer science perspective, this report introduces data mining techniques that may be involved in big-data-based psychological research, including the basic data mining process and typical computational modeling methods. Feature extraction and model construction are two important steps in data mining. For feature extraction, we propose building an unsupervised feature learning model based on deep learning algorithms, which efficiently extracts, from complex and multi-form behavioral data, numerical vectors that represent behavior patterns and characteristics objectively, accurately, and comprehensively. For modeling, linear regression, SVM, and clustering models, among others, can all be used to predict psychological characteristics and have already achieved good results. The report introduces several typical classification and prediction algorithms from three aspects: concept, principle, and application. In addition, it introduces WEKA, an open data mining platform that brings together a large number of machine learning algorithms for data mining tasks, including commonly used algorithms for data preprocessing, classification, regression, and clustering. Its clear and simple interactive interface makes it convenient for researchers with a psychology background to carry out analyses and studies using big data or data mining algorithms.
