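Performance Analysis of the Mobile Internet Business

Original title: 移动互联网业务性能分析
Author: Peng Yujing (彭玉静)
Thesis level: Master's
Discipline: Computer Technology (professional degree)
Keywords: business performance; k-means; data mining; mobile Internet
Degree year: 2013
Advisor: Zhao Jingling (赵晶玲)
Discipline code: 0852
Degree-granting institution: Beijing University of Posts and Telecommunications
Submission date: 2012-12-29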

Abstract

As mobile Internet technology matures, the range of applications widens, and the user base grows rapidly, the volume of data generated is growing ever faster. This mass of data contains a great deal of information related to service performance, from which the performance differences among services can be analyzed. Data mining plays a central role in that analysis: its techniques can extract useful performance features from massive data and group services that share similar performance characteristics. The resulting analysis of mobile Internet service performance helps balance network load and improve the utilization of network resources.

This thesis designs an analysis system for studying the performance of mobile Internet services. The system consists of three modules: a data acquisition module, a data preprocessing module, and a data mining module. The data acquisition module uses the JPCAP middleware and database storage to identify, select, capture, and store the useful information contained in the service data. The data preprocessing module cleans, integrates, and transforms the stored service data. The data mining module applies the k-means clustering algorithm to the network performance attributes of the services. The traditional k-means algorithm chooses its initial centroids at random, which makes the clustering result vary from run to run and can introduce error. The thesis therefore improves the way the initial centroids are determined; experiments show that the improved algorithm is feasible and that its clustering results are valid. Finally, the improved algorithm is applied in the data mining module of the analysis system.
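
The abstract does not say how the improved algorithm picks its initial centroids, so the following is only a minimal sketch of the general idea: replace the random start of k-means with a deterministic one, so that repeated runs give the same clusters. The farthest-first rule used here is a swapped-in, well-known heuristic and is an assumption, not the thesis's method. The function names, the min-max scaling step (standing in for the "transformation" stage of preprocessing), and the sample values (delay in ms, throughput in kbps) are likewise hypothetical and chosen only for illustration.

```python
import math


def min_max_scale(rows):
    """Rescale each column to [0, 1] so no single attribute dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def farthest_first_centroids(points, k):
    """Deterministic initial centroids (assumed heuristic, not the thesis's rule)."""
    dim = len(points[0])
    mean = tuple(sum(p[d] for p in points) / len(points) for d in range(dim))
    # Start from the point closest to the overall mean.
    centroids = [min(points, key=lambda p: math.dist(p, mean))]
    # Then repeatedly add the point farthest from every centroid chosen so far.
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def k_means(points, k, max_iter=100):
    """Standard k-means iterations, started from the deterministic centroids."""
    centroids = farthest_first_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # no assignment changed, so we have converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical per-service measurements: (delay in ms, throughput in kbps).
samples = [(30, 900), (35, 950), (32, 880), (200, 120), (210, 100), (190, 150)]
scaled = min_max_scale(samples)
labels, centroids = k_means(scaled, k=2)
print(labels)
```

Because nothing in the sketch depends on a random seed, calling `k_means` twice on the same data returns identical labels, which is the property the abstract attributes to a non-random choice of initial centroids.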
