Research Progress of Attribute Reduction Based on Rough Set in Context of Big Data (大数据背景下粗糙集属性约简研究进展)
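  • English title: Research Progress of Attribute Reduction Based on Rough Set in Context of Big Data
  • Authors: WU Yangyang (邬阳阳); TANG Jianguo (汤建国)
  • Authors (English): WU Yangyang; TANG Jianguo; School of Computer Science and Engineering, Xinjiang University of Finance and Economics
  • Keywords: big data (大数据); rough set (粗糙集); attribute reduction (属性约简); parallel computing (并行计算); incremental learning (增量学习); granular computing (粒计算)
  • Keywords (English): big data; rough set; attribute reduction; parallel computing; incremental learning; granular computing
  • Journal name (Chinese): JSGG
  • Journal name (English): Computer Engineering and Applications
  • Institution: School of Computer Science and Engineering, Xinjiang University of Finance and Economics (新疆财经大学计算机科学与工程学院)
  • Publication date: 2019-03-15
  • Publisher: Computer Engineering and Applications (计算机工程与应用)
  • Year: 2019
  • Issue: v.55; No.925
  • Funding: National Natural Science Foundation of China (No.61440047, No.61562079); Key Research Base Project for Humanities and Social Sciences of Xinjiang Uygur Autonomous Region (No.050315C01)
  • Language: Chinese
  • Page: JSGG201906006
  • Page count: 9
  • CN: 06
  • Classification number: 37-44+183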
Abstract
In the era of big data, data are not only diverse in type and complex in structure but also change dynamically, and traditional analysis tools can no longer meet the needs of big data analysis. How to obtain valuable information from large-scale data quickly and effectively has become a challenging problem. Some scholars have combined rough set attribute reduction theory with other theories so that high-dimensional, dynamic, massive data can be processed effectively. This paper classifies and summarizes attribute reduction algorithms based on parallel computing, incremental learning, and granular computing, analyzes their respective characteristics, examines the problems in current research, and outlines the key directions for future research.