Notes on “Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor”

I am reading this paper (https://arxiv.org/abs/1801.01290) and wanted to take down some notes about it. Introduction Soft Actor-Critic is a special version of Actor-Critic algorithms. Actor-Critic algorithms are one kind of policy gradient methods. Policy gradient methods are different than value-based methods (like Q-learning), where you learn Q-values and then infer the best action to …

Euler’s Formula and Fourier Transform

Euler’s formula states that . When , the formula becomes known as Euler’s identity. An easy derivation of Euler’s formula is given in [3] and [5]. According to Maclaurin series (a special case of taylor expansion when ), Therefore, replacing with , we have   By Maclaurin series, we also have   Therefore, we can …