Intelligent frequency control strategy based on multi-objective reinforcement learning of cooperative reward function
Authors: HAN Baojun, GAO Qiang, DAI Fei, et al.
Affiliations:

(1. State Grid Henan Electric Power Company, Zhengzhou 450052, China; 2. China Electric Power Research Institute, Beijing 100192, China; 3. Songyang Power Supply Company, State Grid Zhejiang Electric Power Co., Ltd., Songyang 323400, China; 4. School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China)

Corresponding author:

FU Xiyue (born 1999), female, master's student, mainly engaged in research on power system operation and control; E-mail: fuxiyue@whu.edu.cn

CLC number:

TM933

Fund project:

National Key Research and Development Program of China (2017YFB0902600)

    Abstract:

    In intelligent frequency control for power systems with large-scale grid-connected wind power, considering only the CPS control criterion tends to cause concentrated short-term frequency limit violations, which seriously degrades the performance of intelligent automatic generation control (AGC) strategies. This paper proposes an intelligent frequency control strategy based on multi-objective reinforcement learning with a cooperative reward function (TOPQ-MORL). The strategy constructs a cooperative reward function that accounts for multi-dimensional frequency control performance evaluation criteria, so that these criteria are evaluated in coordination across time scales. A TOPQ learning strategy performs a global search over the agent's action space, which effectively addresses the poor computational efficiency of Q-function linearly weighted multi-objective reinforcement learning under the traditional greedy strategy. Simulation results on a standard two-area interconnected power grid AGC model show that the proposed intelligent AGC strategy effectively improves frequency control performance and significantly improves the system's frequency quality over the full time scale.
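The abstract contrasts the proposed method with Q-function linearly weighted multi-objective reinforcement learning under a greedy strategy, but gives no formulas. As a rough illustration of that baseline only (not of TOPQ-MORL, whose details are not stated here), the sketch below keeps one Q-value per objective, scalarizes them with fixed weights, and picks the greedy action. The two objectives, the weights, the table sizes, and all function names are hypothetical choices made for this example.

```python
# Minimal sketch of linearly weighted multi-objective Q-learning (baseline only).
# Assumptions (not from the paper): 2 objectives, 2 states, 2 actions,
# equal weights, and the learning rate / discount factor below.

def scalarize(q_vec, weights):
    """Collapse a vector of per-objective Q-values into one weighted score."""
    return sum(w * q for w, q in zip(weights, q_vec))

def greedy_action(Q, state, weights):
    """Return the action whose weighted Q-value is largest in this state."""
    n_actions = len(Q[state])
    return max(range(n_actions), key=lambda a: scalarize(Q[state][a], weights))

def q_update(Q, s, a, rewards, s_next, weights, alpha=0.1, gamma=0.9):
    """Update every objective's Q-value using the greedy action at s_next."""
    a_star = greedy_action(Q, s_next, weights)
    for k, r in enumerate(rewards):
        target = r + gamma * Q[s_next][a_star][k]
        Q[s][a][k] += alpha * (target - Q[s][a][k])

# Hypothetical setup: objective 0 could stand for a long-horizon score,
# objective 1 for a short-term frequency-deviation penalty.
n_states, n_actions, n_obj = 2, 2, 2
weights = (0.5, 0.5)
Q = [[[0.0] * n_obj for _ in range(n_actions)] for _ in range(n_states)]

# One illustrative transition: state 0, action 1, vector reward, next state 1.
q_update(Q, s=0, a=1, rewards=(1.0, -0.5), s_next=1, weights=weights)
best = greedy_action(Q, 0, weights)
print(Q[0][1], best)
```

With fixed weights, every objective is traded off in the same proportion at every step, which is the kind of scheme the abstract says the cooperative reward function and TOPQ learning strategy are meant to improve on.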

Cite this article

HAN Baojun, GAO Qiang, DAI Fei, et al. Intelligent frequency control strategy based on multi-objective reinforcement learning of cooperative reward function[J]. Journal of Electric Power Science and Technology, 2023, 38(2): 18-29.

History
  • Published online: 2023-06-29