Data mining (DM) is an emerging method of data analysis that helps people make full use of the information contained in data, and it has become a very active area of artificial intelligence research. Rough set theory deals with vague and uncertain knowledge, while cluster analysis discovers regularities in data when no prior knowledge is available, offering a new approach to data classification. Although many theories and methods have been developed for rough sets and cluster analysis, data objects vary endlessly, so these techniques must be continually refined, and new theory developed, to meet the needs of applications.
     This paper introduces the basic concept and properties of relevant influence in rough sets, studies the interactions between attributes, and presents an attribute reduction algorithm based on relevant influence. Using the relevant-influence matrix of the attributes as heuristic information, the algorithm removes redundant attributes and yields reducts that reflect the interactions between attributes. Experiments show that the algorithm obtains reducts composed of attributes with high relevant influence. The concept extends the range of application of rough sets and offers a new method for data mining.
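     For orientation, the sketch below shows ordinary heuristic attribute reduction on a decision table. It is not the relevant-influence algorithm described above, whose matrix is not defined in this abstract. As a stand-in heuristic it uses the standard Pawlak dependency degree (the share of objects in the positive region). The function names and the toy table are assumptions made for illustration only.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, cond, dec):
    """Pawlak dependency degree: fraction of rows in the positive region."""
    if not rows:
        return 0.0
    pos = 0
    for block in partition(rows, cond):
        # A block is consistent when all of its rows share one decision value.
        if len({rows[i][dec] for i in block}) == 1:
            pos += len(block)
    return pos / len(rows)

def greedy_reduct(rows, cond, dec):
    """Add the most helpful attribute until the full dependency is reached,
    then drop any attribute that turns out to be redundant."""
    target = dependency(rows, cond, dec)
    chosen = []
    while dependency(rows, chosen, dec) < target:
        best = max((a for a in cond if a not in chosen),
                   key=lambda a: dependency(rows, chosen + [a], dec))
        chosen.append(best)
    for a in list(chosen):
        rest = [b for b in chosen if b != a]
        if dependency(rows, rest, dec) == target:
            chosen = rest
    return chosen

# Hypothetical toy table: decision d is "a OR b"; attribute c is constant.
table = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
reduct = greedy_reduct(table, ["a", "b", "c"], "d")
```

     In this toy table the constant attribute c never enters the reduct. A relevant-influence heuristic would replace the dependency-based choice of `best` with a score that reflects how the attributes interact.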
     Building on the concept of relevant influence in rough sets, we study the concept and methods of dynamic reduction based on relevantly influenced attributes, and calculate how the activation state ρ(U) and the dormancy state σ(U) of rough set samples affect the attribute reducts. When ρ(U)→σ(U), redundant attributes are removed from the reduct; when σ(U)→ρ(U), indispensable attributes are added to it. This lets rough sets describe changes in event state more effectively, and it remedies a shortcoming of earlier methods, in which rough sets could describe only static objects.
     Intelligent monitoring systems are the core of industrial automatic control. For such systems, rough set theory provides a practicable basis for real-time decision making: attributes with weak real-time performance are removed and those with strong real-time performance are retained, so that the decision system remains real-time. The real-time attribute reduction method proposed in this paper extends the application of rough sets to real-time decision systems.
     To meet the need to classify the condition attributes of a decision table, this paper proposes a reduction algorithm based on attribute classification. The algorithm first classifies the condition attributes according to classification functions, then deletes the less important subsets, which yields a classified attribute reduct. Experiments show that the algorithm removes some of the attributes effectively while leaving the original decision ability unchanged, and that it solves the attribute classification problem.
     We applied our attribute classification algorithm to fault diagnosis of electric distribution networks and to early warning systems for electric interlock networks. After studying attribute selection and rule generation methods for distribution network fault diagnosis, we applied real-time attribute reduction and attribute classification reduction to the fault diagnosis system. For the fault diagnosis and early warning system of the electric interlock network, we calculated the attribute values on the basis of electrical theory and studied the interactions between attributes under load transfer, which gives the extent of the faults in the network. By applying the reduction algorithm based on relevantly influenced attributes, we tracked changes in the attributes and met the goal of predicting faults and repairing them in time.
     The cluster/classification congruity algorithm addresses the situation in which clustering and classification results are inconsistent. It computes a congruity matrix between the clustering result and the classification result, then reconciles the two by repeatedly modifying the matrix until the congruity is maximal. The algorithm is applied here to load forecasting, and it can be used more widely wherever clustering and classification results disagree.
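     The abstract does not define the congruity matrix, so the following is only one plausible reading, offered as a sketch. It treats the matrix as a contingency table of (cluster, class) counts and measures congruity as the share of samples that carry the majority class of their cluster. The function names and sample data are hypothetical, and the iterative adjustment step of the algorithm is not reproduced.

```python
from collections import Counter

def congruity_matrix(cluster_ids, class_labels):
    """Count how many samples fall in each (cluster, class) pair."""
    if len(cluster_ids) != len(class_labels):
        raise ValueError("inputs must have the same length")
    return Counter(zip(cluster_ids, class_labels))

def congruity(cluster_ids, class_labels):
    """Fraction of samples whose class is the majority class of their cluster."""
    if not cluster_ids:
        return 0.0
    counts = congruity_matrix(cluster_ids, class_labels)
    best = {}
    for (cluster, _label), n in counts.items():
        # Keep the largest class count seen so far for this cluster.
        best[cluster] = max(best.get(cluster, 0), n)
    return sum(best.values()) / len(cluster_ids)

# Hypothetical example: six samples, two clusters, two classes.
clusters = [0, 0, 0, 1, 1, 1]
classes = ["x", "x", "y", "y", "y", "y"]
matrix = congruity_matrix(clusters, classes)
score = congruity(clusters, classes)
```

     A score of 1.0 would mean that every cluster is pure with respect to the class labels. An adjustment procedure could then reassign the off-majority samples and recompute the matrix until the score stops improving.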
     This paper mainly studies attribute reduction algorithms in rough sets and cluster analysis, and proposes several new theories and methods, which have also been applied experimentally to electric power automation systems. The experiments show that these theories and methods are effective and practicable.
