Enhanced Internet Document Classification Model Based on Big Data Analysis
  • English title: Enhanced Internet document classification model based on big data analysis
  • Author: SUN Hu-jun (孙护军)
  • Affiliation: School of Electronic Engineering, Xi'an Aeronautical University
  • Keywords: Internet document; data mining; big data; classification; fuzzy rules
  • Chinese journal title: SJSJ
  • English journal title: Computer Engineering and Design
  • Publication date: 2019-03-16
  • Publisher: Computer Engineering and Design (计算机工程与设计)
  • Year: 2019
  • Issue: v.40; No.387
  • Funding: Special Scientific Research Program Fund of the Shaanxi Provincial Department of Education (17JK0398)
  • Language: Chinese
  • Page: SJSJ201903027
  • Number of pages: 7
  • CN: 03
  • ISSN: 11-1775/TP
  • Classification number: 162-168
Abstract
Internet documents exist in massive volumes and span a wide range of topics and categories, so extracting useful information from them requires big data techniques. To address this problem, an enhanced Internet document classification model is proposed. It combines text mining with an evolutionary fuzzy algorithm in a fuzzy-rule-based classifier that assigns Internet documents to different categories (domains). The evolutionary fuzzy algorithm updates the document classification dynamically and in real time as document content changes. Comparison with other classical classification algorithms verifies that the proposed algorithm achieves good results.
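The abstract does not give the rule structure or the update procedure, so the following is only a minimal sketch of the general idea, not the author's method. It assumes one fuzzy rule per category, a term membership defined as term frequency relative to the most frequent term in the document, and a running-mean prototype update as a simple stand-in for the evolutionary fuzzy algorithm. The names `term_memberships` and `FuzzyRuleClassifier` are hypothetical.

```python
from collections import Counter, defaultdict


def term_memberships(text):
    """Map each term to a membership degree in [0, 1].

    Assumption: membership is the term count divided by the count of the
    most frequent term in the same document.
    """
    counts = Counter(text.lower().split())
    if not counts:
        return {}
    top = max(counts.values())
    return {term: n / top for term, n in counts.items()}


class FuzzyRuleClassifier:
    """Hypothetical classifier with one fuzzy rule (term prototype) per category."""

    def __init__(self):
        self.prototypes = defaultdict(dict)  # category -> {term: membership}
        self.seen = Counter()                # category -> documents absorbed so far

    def update(self, text, category):
        """Fold one labelled document into the category prototype (running mean)."""
        doc = term_memberships(text)
        proto = self.prototypes[category]
        n = self.seen[category]
        for term in set(proto) | set(doc):
            proto[term] = (proto.get(term, 0.0) * n + doc.get(term, 0.0)) / (n + 1)
        self.seen[category] = n + 1

    def firing_strength(self, text, category):
        """Fuzzy overlap (sum of minima) between document and prototype, in [0, 1]."""
        doc = term_memberships(text)
        total = sum(doc.values())
        if total == 0:
            return 0.0
        proto = self.prototypes.get(category, {})
        overlap = sum(min(mu, proto.get(term, 0.0)) for term, mu in doc.items())
        return overlap / total

    def classify(self, text):
        """Return the category whose rule fires most strongly, or None if untrained."""
        if not self.prototypes:
            return None
        return max(self.prototypes, key=lambda c: self.firing_strength(text, c))


clf = FuzzyRuleClassifier()
clf.update("stock market prices fall", "finance")
clf.update("team wins football match", "sports")
print(clf.classify("football match tonight"))
```

Because `update` can be called at any time, the category prototypes keep shifting as new documents arrive, which loosely mirrors the dynamic updating described in the abstract. A real evolving fuzzy system would also add, merge, or prune rules, which this sketch does not attempt.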
