I am reading this paper (https://arxiv.org/abs/1801.01290) and want to take some notes on it. Introduction Soft Actor-Critic is a special version of Actor-Critic algorithms. Actor-Critic algorithms are one kind of policy gradient method. Policy gradient methods differ from value-based methods (like Q-learning), where you learn Q-values and then infer the best action to …
Monthly Archives: October 2018
Notes on Glicko paper
This weekend I reread the Glicko skill rating paper [1] and found some parts of it unclear. I’d like to make some notes, some of them based on my own guesses. I hope to sort them out completely in the future. First, Glicko models game outcomes by the Bradley-Terry model, meaning that the win …
Euler’s Formula and Fourier Transform
Euler’s formula states that $latex e^{ix} =\cos{x}+ i \sin{x}$. When $latex x = \pi$, the formula becomes $latex e^{i\pi} = -1$, known as Euler’s identity. An easy derivation of Euler’s formula is given in [3] and [5]. According to the Maclaurin series (a special case of the Taylor expansion $latex f(x)=f(a)+f'(a)(x-a)+\frac{f''(a)}{2!}(x-a)^2+\cdots$ when $latex a=0$), $latex e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\frac{x^4}{4!}+\cdots &s=2$ …
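As a quick numerical sanity check of the formula and the series above, here is a minimal Python sketch. The function name `exp_maclaurin` and the choice of 30 terms are my own assumptions for illustration; they do not come from [3] or [5]. It sums the Maclaurin series with a complex argument and compares the result with $latex \cos{x} + i \sin{x}$.

```python
import cmath
import math


def exp_maclaurin(z, terms=30):
    """Partial sum of the Maclaurin series 1 + z + z^2/2! + z^3/3! + ...

    `terms` is an arbitrary cutoff chosen for this sketch.
    """
    total = 0j
    term = 1 + 0j  # the n = 0 term, z^0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # turn z^n / n! into z^(n+1) / (n+1)!
    return total


x = 1.234  # any real number works here
series_value = exp_maclaurin(1j * x)
trig_value = complex(math.cos(x), math.sin(x))

# The truncated series, cos(x) + i sin(x), and the library exponential
# should agree up to floating-point error.
assert abs(series_value - trig_value) < 1e-12
assert abs(series_value - cmath.exp(1j * x)) < 1e-12

# Euler's identity: e^{i*pi} should be -1 up to rounding.
identity_value = exp_maclaurin(1j * math.pi)
assert abs(identity_value - (-1)) < 1e-12
```

The check only shows that the two sides match numerically for the tested values; it is not a substitute for the derivation.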