To improve the ability of optical networks to allocate resources for large-scale, differentiated power services, and to reduce the algorithm training time required for large-scale services, a resource allocation scheme for smart grid optical networks based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm is proposed. Taking large-scale and differentiated power services into account, the scheme builds a slicing model of the smart grid optical core network and formulates an optimization problem whose objective is to maximize the revenue of the power grid company. A conditional judgment mapping is proposed to simplify the optimization problem. In addition, the MADDPG algorithm is improved by deploying different services to different agents for computation, which reduces the training time and meets the real-time requirements of the network. Finally, simulation results show that the proposed algorithm achieves a higher reward, lower cost and delay, and a shorter training time.