Research on a fast extraction method for user load patterns based on massive electricity consumption data
  • English title: Study on fast extraction method of user load pattern based on massive data
  • Authors: LU Jinling; MA Chong; FENG Cuixiang
  • Affiliation: School of Electrical and Electronic Engineering, North China Electric Power University
  • Keywords: big data; abnormal data; data processing; clustering validity index; clustering algorithm; load pattern
  • Journal: Electric Power Science and Engineering (电力科学与工程; journal code DLQB)
  • Publication date: 2018-04-28
  • Publisher: Electric Power Science and Engineering
  • Year: 2018
  • Volume and issue: v.34; No.216; issue 04
  • Language: Chinese
  • Record ID: DLQB201804008
  • Page count: 8
  • Pages: 53-60
  • CN: 13-1328/TK
Abstract
Mining massive electricity consumption data quickly, accurately, and efficiently is an indispensable foundation for obtaining user load patterns. First, the distribution characteristics of electricity consumption data are analyzed. Drawing on the speed of the quartile method and the accuracy of the 3σ method from statistics, a "horizontal-vertical" detection method is proposed to detect and correct abnormal electricity consumption data. Second, after a comprehensive comparison of several typical dimensionality reduction methods, principal component analysis is adopted to reduce the dimensionality of the massive high-dimensional consumption data, which greatly improves the efficiency of load pattern extraction. Third, the traditional K-means algorithm is improved to obtain a Fast K-means (FK-means) algorithm: the idea of bisection is introduced to shorten clustering time, and the clustering validity indexes DBI (Davies-Bouldin index) and CHI (Calinski-Harabasz index) are combined to improve the reliability of the clustering results. Finally, measured electricity consumption data from a city in southern China verify that the algorithm extracts load patterns quickly and is robust.
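The abstract names two statistical rules for screening abnormal readings, the quartile rule and the 3σ rule, without spelling out how the "horizontal-vertical" procedure combines them. The sketch below is therefore only an illustration of the two rules on a single 24-point daily load curve, not the paper's method. The function names, the 1.5×IQR fence, the sample data, and the neighbour-averaging repair step are all assumptions made for this example.

```python
import statistics

# Hypothetical hourly load readings (kW) with one injected spike at index 12.
SAMPLE_LOAD = [
    2.0, 2.1, 1.9, 2.0, 2.2, 2.1, 2.0, 1.9, 2.1, 2.0, 2.2, 2.0,
    30.0, 2.1, 1.9, 2.0, 2.1, 2.0, 2.2, 1.9, 2.0, 2.1, 2.0, 2.1,
]


def iqr_outliers(values, k=1.5):
    """Quartile rule: flag indices outside [Q1 - k*IQR, Q3 + k*IQR].

    The fence multiplier k=1.5 is the common textbook default,
    assumed here rather than taken from the paper.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


def three_sigma_outliers(values):
    """3-sigma rule: flag indices more than 3 standard deviations from the mean."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) > 3 * sigma]


def repair(values, flagged):
    """Replace each flagged reading with the mean of its nearest
    unflagged neighbours (an assumed correction rule, for illustration)."""
    bad = set(flagged)
    fixed = list(values)
    for i in sorted(bad):
        left = next((values[j] for j in range(i - 1, -1, -1) if j not in bad), None)
        right = next((values[j] for j in range(i + 1, len(values)) if j not in bad), None)
        neighbours = [v for v in (left, right) if v is not None]
        if neighbours:
            fixed[i] = sum(neighbours) / len(neighbours)
    return fixed


if __name__ == "__main__":
    # Assumed combination: keep only readings flagged by both rules.
    candidates = iqr_outliers(SAMPLE_LOAD)
    confirmed = sorted(set(candidates) & set(three_sigma_outliers(SAMPLE_LOAD)))
    print("flagged indices:", confirmed)
    print("repaired curve:", repair(SAMPLE_LOAD, confirmed))
```

One design point worth noting: with population standard deviation, a single point in a series of n values can never exceed a z-score of sqrt(n-1), so the 3σ rule cannot flag anything in series of 10 or fewer points. The quartile rule has no such limit, which is one reason the two rules are often used together.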