Abstract: In the optimal dispatch of integrated energy systems, source-load uncertainty remains a difficult problem. To address source-load uncertainty fluctuations, a proximal policy optimization (PPO) dispatching method based on data-driven deep reinforcement learning is proposed. While ensuring that user requirements are satisfied, the method minimizes the overall cost of the integrated energy system and reduces total carbon emissions under a tiered carbon trading framework. First, taking the overall system cost, including carbon trading fees under the tiered carbon trading framework, as the objective, a comprehensive demand response model covering multiple types of flexible loads is established to enhance demand-side responsiveness and scheduling flexibility. Then, within the deep reinforcement learning framework, the model is formulated as a Markov decision process (MDP). Finally, to handle the data variations caused by these uncertainties, the PPO algorithm is employed to solve the model; mini-batch updates and importance sampling are introduced to limit the magnitude of each policy update to a certain range, ensuring the accuracy of the policy update at every iteration. Simulation results demonstrate that, compared with the deep deterministic policy gradient (DDPG) algorithm, the proposed method effectively mitigates the impact of source-load uncertainty while significantly reducing the system's total carbon emissions and average daily operating cost.
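
For reference, the update-limiting mechanism described above corresponds to the standard PPO clipped surrogate objective (Schulman et al., 2017); the clipping threshold $\epsilon$ and the advantage estimate $\hat{A}_t$ below are the generic PPO quantities, not values reported in this paper:

$$
L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\!\left[\min\!\left(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\!\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
$$

Here $r_t(\theta)$ is the importance-sampling ratio between the new and old policies; clipping it to $[1-\epsilon,\,1+\epsilon]$ is what keeps each mini-batch policy update within a bounded range.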