Citation: LI Fangfei, WANG Hailong, LU Zixiong, WANG Zhong. Real-Time Optimal Scheduling of AC / DC Hybrid Microgrid Based on Artificial Auxiliary Deep Reinforcement Learning[J]. Modern Electric Power, 2023, 40(4): 577-586. DOI: 10.19725/j.cnki.1007-2322.2022.0032

Real-Time Optimal Scheduling of AC / DC Hybrid Microgrid Based on Artificial Auxiliary Deep Reinforcement Learning

  • Abstract: To address the difficulty of modeling uncertainty and of efficiently solving complex systems in the optimal scheduling of AC/DC hybrid microgrids, this paper proposes a human-assisted ("artificial auxiliary") deep reinforcement learning algorithm that improves the agent's learning efficiency through guidance from human-provided strategies. First, a cost-minimizing optimal scheduling model is built that accounts for the demand-side response characteristics of the hybrid microgrid in grid-connected operation. The scheduling process is formulated as a Markov decision process, and the reward function is designed from the scheduling model. The model is then solved with a human-assisted deep deterministic policy gradient (DDPG) algorithm: through continuous interaction between the agent and the environment, the neural network parameters are updated repeatedly until the optimal decision is obtained. Finally, case-study simulations verify that the proposed algorithm effectively improves the agent's learning efficiency, reducing model training time while also lowering the system's operating cost.

     
