Intelligent frequency control strategy based on multi‑objective reinforcement learning of cooperative reward function

Affiliation:

(1. State Grid Henan Electric Power Company, Zhengzhou 450052, China; 2. China Electric Power Research Institute, Beijing 100192, China; 3. Songyang Power Supply Company, State Grid Zhejiang Electric Power Co., Ltd., Songyang 323400, China; 4. School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China)

Corresponding author:

FU Xiyue (born 1999), female, master's student, mainly engaged in research on power system operation and control; E-mail: fuxiyue@whu.edu.cn

CLC number:

TM933

Funding:

National Key Research and Development Program of China (2017YFB0902600)

    Abstract:

    In intelligent frequency control strategies for power systems with large-scale grid-connected wind power, considering only the CPS control criterion can easily cause concentrated short-term frequency limit violations, which seriously degrades the performance of intelligent automatic generation control (AGC) strategies. This paper proposes an intelligent frequency control strategy based on multi-objective reinforcement learning with a cooperative reward function (TOPQ-MORL). The strategy constructs a cooperative reward function that accounts for multi-dimensional frequency control performance evaluation criteria, so that these criteria are evaluated in coordination across time scales. The TOPQ learning strategy is used to search the agent's action space globally, which effectively addresses the poor computational efficiency of multi-objective reinforcement learning algorithms that linearly weight Q functions under the traditional greedy strategy. Simulation results on the AGC model of a standard two-area interconnected power grid show that the proposed intelligent AGC strategy effectively improves frequency control performance and significantly improves the frequency quality of the system over the full time scale.
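
For orientation, the baseline that the abstract contrasts against (linear weighting of Q functions with greedy action selection) can be sketched roughly as follows. This is a minimal illustration only, not the paper's TOPQ-MORL method: the function names, table shapes, weights, learning parameters, and toy reward values are all assumptions made for this example.

```python
import numpy as np

def scalarized_greedy_action(q_values, weights):
    """Pick the action maximizing the weighted sum of per-objective Q-values.

    q_values: array of shape (n_actions, n_objectives) for one state.
    weights:  array of shape (n_objectives,), assumed non-negative.
    """
    scores = q_values @ weights  # one scalar score per action
    return int(np.argmax(scores))

def q_update(q_table, state, action, reward_vec, next_state, weights,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step applied to every objective at once.

    q_table has shape (n_states, n_actions, n_objectives). The bootstrap
    action is the scalarized-greedy action in next_state.
    """
    next_action = scalarized_greedy_action(q_table[next_state], weights)
    target = reward_vec + gamma * q_table[next_state, next_action]
    q_table[state, action] += alpha * (target - q_table[state, action])
    return q_table

# Toy example: 2 states, 3 actions, 2 objectives (all values are made up).
q = np.zeros((2, 3, 2))
w = np.array([0.5, 0.5])
q = q_update(q, state=0, action=1, reward_vec=np.array([1.0, -0.5]),
             next_state=1, weights=w)
best = scalarized_greedy_action(q[0], w)
```

In this sketch the trade-off between objectives is fixed entirely by the hypothetical weight vector `w`, so every action must be scored for every decision; the abstract names the efficiency of this style of scheme as the problem the TOPQ learning strategy is meant to address.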

Cite this article

HAN Baojun, GAO Qiang, DAI Fei, et al. Intelligent frequency control strategy based on multi‑objective reinforcement learning of cooperative reward function[J]. Journal of Electric Power Science and Technology, 2023, 38(2): 18-29.

History
  • Online publication date: 2023-06-29