Relationships between DP, RL, Prioritized Sweeping, Prioritized Experience Replay, etc

In the last weekend, I’ve struggled with many concepts in Reinforcement Learning (RL) and Dynamic Programming (DP). In this post, I am collecting some of my thoughts about DP, RL, Prioritized Sweeping and Prioritized Experience Replay. Please also refer to a previous post written when I first learned RL. Let’s first introduce a Markov Decision …