Transmission network topology control offers low-cost flexibility to system operators for mitigating grid congestion. However, finding the optimal sequence of topology actions remains difficult because of the large number of possible actions. Although reinforcement learning (RL) approaches have attracted interest for long-term planning in large combinatorial action spaces, they suffer from unstable training, poor sample efficiency, and unforeseen consequences of RL actions. To address these issues, this paper proposes a hybrid curriculum-trained RL and Monte Carlo tree search (MCTS) approach to determine sequential topological actions for mitigating grid congestion. The curriculum-based approach stabilizes training by first pre-training a policy network through supervised imitation learning, followed by RL training. The policy network guides the MCTS to simulate promising future trajectories, mitigating unforeseen consequences and identifying long-term strategies to improve grid security. Moreover, the MCTS-verified actions are used for RL training, improving sample efficiency and shortening training time. A distance factor is added to the MCTS, which improves convergence by prioritizing actions closer to the congestion. Numerical results on the IEEE 118-bus system show that the proposed hybrid approach increases the timesteps survived by 30% compared to a standard RL approach, and by 5% compared to a brute-force baseline. Additionally, including the distance factor increases the timesteps survived by 15%. These results highlight the potential of the proposed method for real-world use of sequential topological actions to relieve grid congestion.
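To illustrate how a policy prior and a distance factor might jointly steer the tree search, the following is a minimal sketch and not the paper's implementation. The abstract does not give the selection rule, so a PUCT-style score is assumed here, and the geometric decay form, the `hops` field, and the names `Child`, `distance_factor`, `selection_score`, and `select` are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Child:
    """Search statistics for one candidate topology action (hypothetical structure)."""
    prior: float            # probability assigned by the policy network
    hops: int               # assumed: graph distance from the action to the congested line
    visits: int = 0         # number of simulations that passed through this action
    value_sum: float = 0.0  # accumulated simulation returns


def distance_factor(hops: int, decay: float = 0.5) -> float:
    """Assumed form: the weight shrinks geometrically with distance to the congestion."""
    return decay ** hops


def selection_score(child: Child, parent_visits: int, c_puct: float = 1.5) -> float:
    """PUCT-style score in which the policy prior is scaled by the distance factor."""
    q = child.value_sum / child.visits if child.visits else 0.0
    exploration = (
        c_puct
        * child.prior
        * distance_factor(child.hops)
        * math.sqrt(parent_visits)
        / (1 + child.visits)
    )
    return q + exploration


def select(children: Dict[str, Child], parent_visits: int) -> str:
    """Return the name of the action with the highest selection score."""
    return max(children, key=lambda name: selection_score(children[name], parent_visits))


if __name__ == "__main__":
    candidates = {
        "near": Child(prior=0.3, hops=0),  # acts at the congested substation
        "far": Child(prior=0.4, hops=3),   # higher prior, but three hops away
    }
    print(select(candidates, parent_visits=1))
```

Under these assumptions, an action adjacent to the congestion can be explored before a more distant action that has a slightly higher policy prior, which is one plausible way such a factor could speed up convergence.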