Anomaly detection of distribution network voltage data based on improved K-means clustering k-value selection algorithm

CLC number: TM76

Funding: Science and Technology Project of China Southern Power Grid Co., Ltd. (YNKJXM20191369)

Abstract:

The K-means clustering algorithm is applied to anomaly detection in large-scale distribution network data because it is fast and accurate. However, an unsuitable number of clusters can lead to poor clustering results. To address this, this paper proposes IES, a cluster-number selection algorithm based on an improved elbow method and the silhouette coefficient. First, the algorithm uses the clustering evaluation index of the elbow method and the upper limit of the cluster number to determine a threshold that adapts to the data set, and this adaptive threshold is used to solve for the lower limit of the cluster number. Second, silhouette coefficients are calculated between the lower and upper limits, and a “one maximum” rule is proposed to avoid calculating all silhouette coefficients and so speed up the algorithm. Finally, the silhouette coefficients are used to select a suitable cluster number, and the recall rate is used to evaluate the anomaly detection results, illustrating how much the choice of cluster number matters for K-means anomaly detection. Case study results show that IES adaptively obtains the optimal cluster number while greatly reducing computation time, improving the accuracy and efficiency of K-means in online monitoring.
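The abstract names the steps of IES but does not give its formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. Three parts are assumptions made for illustration: the threshold is taken to be the average SSE drop per added cluster, the lower limit is the first k whose next SSE drop falls below that threshold, and the “one maximum” rule is read as "stop scanning at the first silhouette peak". The function names (`kmeans`, `silhouette`, `select_k`, `recall`) and the toy readings are hypothetical.

```python
import math

def kmeans(points, k, iters=100):
    """Lloyd's K-means with deterministic farthest-point seeding.
    Returns (labels, sse); sse is the elbow-method evaluation index."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            # An empty cluster keeps its previous centre.
            new_centers.append(tuple(sum(col) / len(members) for col in zip(*members))
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))
    return labels, sse

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        denom = max(a, b)
        total += (b - a) / denom if denom else 0.0
    return total / len(points)

def select_k(points, k_max):
    """Return (best_k, k_min, ks_whose_silhouette_was_computed)."""
    runs = {k: kmeans(points, k) for k in range(1, k_max + 1)}
    sse = {k: run[1] for k, run in runs.items()}
    # ASSUMED threshold: average SSE drop per added cluster up to k_max.
    threshold = (sse[1] - sse[k_max]) / (k_max - 1)
    # ASSUMED lower limit: first k >= 2 whose next SSE drop is below the threshold.
    k_min = next((k for k in range(2, k_max) if sse[k] - sse[k + 1] < threshold), k_max)
    # ASSUMED "one maximum" rule: scan upward, stop at the first silhouette peak.
    best_k, best_s = k_min, silhouette(points, runs[k_min][0])
    evaluated = [k_min]
    for k in range(k_min + 1, k_max + 1):
        s = silhouette(points, runs[k][0])
        evaluated.append(k)
        if s < best_s:
            break
        best_k, best_s = k, s
    return best_k, k_min, evaluated

def recall(true_anomalies, flagged):
    """Recall = TP / (TP + FN): share of true anomalies that were flagged."""
    true_anomalies, flagged = set(true_anomalies), set(flagged)
    return len(true_anomalies & flagged) / len(true_anomalies)

# Toy 1-D readings forming three tight groups (made-up values, not real data).
points = [(v,) for v in (0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2)]
best_k, k_min, evaluated = select_k(points, k_max=6)
print(best_k, k_min, evaluated)
```

On this toy data the sketch computes the silhouette coefficient for only two of the five candidate values of k, which is the kind of saving the abstract attributes to bounding the search and stopping early. How anomalies are flagged from the resulting clusters is not described in the abstract, so `recall` is shown only as the standard definition of the evaluation metric.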

Cite this article

LIU Mingqun, HE Xin, QIN Risheng, et al. Anomaly detection of distribution network voltage data based on improved K-means clustering k-value selection algorithm[J]. Journal of Electric Power Science and Technology, 2022, 37(6): 91-99.

History
  • Published online: 2023-01-16