czxttkl
Category Archives: Algorithm
Revisit Gaussian kernel
Optimization with discrete random variables
Control Variate
Personalized Re-ranking
Practical considerations of off-policy policy gradient
Constrained RL / Multi-Objective RL
Hash table revisited
TRPO, PPO, Graph NN + RL
Notes on “Recommending What Video to Watch Next: A Multitask Ranking System”
Convergence of Q-learning and SARSA