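Research on Determining Optimization Target Values for Thermal Power Plants Based on Association Rules

English title as recorded: Research on the Thermal Power Plant Operation Optimization Value Determining Based on Association Rule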
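Author: Zheng Xixi (郑西西)
Degree level: Master's
Discipline: Thermal Engineering
Keywords: data mining; association rule; optimization target value; hash table; data reduction
Degree year: 2011
Advisor: Gu Junjie (谷俊杰)
Discipline code: 080702
Degree-granting institution: North China Electric Power University (华北电力大学)
Thesis submission date: 2010-12-01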

Abstract

The operation optimization target value of a thermal power unit is an important factor in the unit's economic performance. It provides the operating parameters and performance indices that reflect the unit's best attainable operating state under current conditions, and so forms the basis for operator guidance in optimized operation. Traditional ways of determining target values, such as design values or off-design thermodynamic calculations, often fail to reflect the unit's actual operating state. As power plant automation has advanced, data acquisition and storage technology in the production process has developed considerably, the volume and quality of accumulated operating data have improved further, and these production data contain a large amount of the state information needed for unit optimization. This thesis applies association rule mining to the determination of optimization target values for thermal power plants. Building on the classical algorithm and the characteristics of power plant production data, it proposes an improved algorithm based on hashing and data table reduction. The algorithm exploits the fast lookup of a hash table: it generates candidate itemsets by scanning the data table directly and stores each itemset with its support count in the hash table, while deleting records that are no longer needed to perform an initial reduction of the data table. After the scan, entries that do not meet the minimum support count are removed from the hash table to yield the frequent itemsets, and the data table is reduced again according to the frequent itemsets generated in that round. This reduces subsequent computation and improves the efficiency of quantitative association rule mining. The thesis also presents the overall structure of a software implementation of the improved algorithm and the main data structures it uses. Finally, historical production data from a 1000 MW unit are mined to obtain optimization target values for each parameter at 100% load, and the results are analyzed. The results show that the target values obtained by this method are consistent with mechanism analysis and can be used to guide optimized operation.
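The mining loop summarized in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `mine_frequent_itemsets`, the item labels, and the exact reduction rules are assumptions for illustration. "Records that are no longer needed" is taken to mean rows with fewer than k items, and the second reduction is taken to mean dropping items that appear in no frequent k-itemset. Both are safe because every subset of a frequent itemset is itself frequent.

```python
from itertools import combinations


def mine_frequent_itemsets(records, min_count):
    """Level-wise frequent-itemset mining with a hash table and table reduction.

    records   -- iterable of item collections (one per data-table row)
    min_count -- minimum support count an itemset needs to be frequent
    Returns a dict mapping each frequent itemset (frozenset) to its count.
    """
    table = [frozenset(r) for r in records]  # working copy of the data table
    frequent = {}                            # frequent itemsets of all sizes
    k = 1
    while table:
        counts = {}                          # hash table: candidate -> count
        kept = []
        for row in table:
            if len(row) < k:
                # First reduction: this row cannot contain any k-itemset.
                continue
            kept.append(row)
            # Candidates come straight from the row, not from a join step.
            for cand in combinations(row, k):
                key = frozenset(cand)
                counts[key] = counts.get(key, 0) + 1
        # Remove hash-table entries below the minimum support count.
        level = {s: c for s, c in counts.items() if c >= min_count}
        if not level:
            break
        frequent.update(level)
        # Second reduction: keep only items found in some frequent k-itemset.
        alive = frozenset().union(*level)
        table = [row & alive for row in kept]
        k += 1
    return frequent


# Hypothetical discretized operating records (labels are illustrative only).
records = [
    {"load=100%", "main_steam_T=high", "O2=low"},
    {"load=100%", "main_steam_T=high"},
    {"load=100%", "O2=low"},
    {"main_steam_T=high", "O2=low"},
    {"load=100%", "main_steam_T=high", "O2=low", "vacuum=low"},
]
for itemset, count in mine_frequent_itemsets(records, min_count=3).items():
    print(sorted(itemset), count)
```

Generating candidates directly from each row avoids the join-and-prune step of the classical level-wise approach, at the cost of enumerating every k-combination of a row. That cost is what the two reductions are meant to keep small as k grows.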

