czxttkl – Page 9

My understanding in 401K

Here is my reasoning about 401K. First, I’ll start with two definitions: (1) taxable income, meaning the gross income you receive on which your tax will be calculate; (2) tax deduction, meaning any deduction from your taxable income. Tax deduction lowers your taxable income thus lowers your tax in general. 401K has three categories: Pre-tax: contribute …

Continue reading “My understanding in 401K”

DPG and DDPG

In this post, I am sharing my understanding regarding Deterministic Policy Gradient Algorithm (DPG) [1] and its deep-learning version (DDPG) [2]. We have introduced policy gradient theorem in [3, 4]. Here, we briefly recap. The objective function of policy gradient methods is: $latex J(\theta)=\sum\limits_{s \in S} d^\pi(s) V^\pi(s)=\sum\limits_{s \in S} d^\pi(s) \sum\limits_{a \in A} \pi(a|s) Q^\pi(s,a), &s=2$ where …

Continue reading “DPG and DDPG”

LSTM + DQN

Sequential decision problems can usually be formatted as Markov Decision Problems (MDPs), where you define states, actions, rewards and transitions. In some practical problems, states can just be described by action histories. For example, we’d like to decide notification delivery sequences for a group of similar users to maximize their accumulated clicks. We define two …

Continue reading “LSTM + DQN”

DQN + Double Q-Learning + OpenAI Gym

Here I am providing a script to quickly experiment with the openai gym environment: https://github.com/czxttkl/Tutorials/tree/master/experiments/lunarlander. The script has the features of both Deep Q-Learning and Double Q-Learning. I ran my script to benchmark one open ai environment LunarLander-v2. The most stable version of the algorithm has following hyperparameters: no double q-learning (just use one q-network), gamma=0.99, batch size=64, learning …

Continue reading “DQN + Double Q-Learning + OpenAI Gym”

Notes on “Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor”

I am reading this paper (https://arxiv.org/abs/1801.01290) and wanted to take down some notes about it. Introduction Soft Actor-Critic is a special version of Actor-Critic algorithms. Actor-Critic algorithms are one kind of policy gradient methods. Policy gradient methods are different than value-based methods (like Q-learning), where you learn Q-values and then infer the best action to …

Continue reading “Notes on “Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor””

Notes on Glicko paper

This weekend I just read again the Glicko skill rating paper [1] but I found something not very clear in the paper. I’d like to make some notes, some based on my guesses. Hope I’d sort them out completely in the future. First, Glicko models game outcomes by the Bradley-Terry model, meaning that the win …

Continue reading “Notes on Glicko paper”

Euler’s Formula and Fourier Transform

Euler’s formula states that . When , the formula becomes known as Euler’s identity. An easy derivation of Euler’s formula is given in [3] and [5]. According to Maclaurin series (a special case of taylor expansion when ), Therefore, replacing with , we have By Maclaurin series, we also have Therefore, we can …

Continue reading “Euler’s Formula and Fourier Transform”

Download and process Chinese songs from Youtube

This posts introduces the way to download Chinese songs from a playlist on youtube and process titles of songs. I use youtube-dl to download all songs from a playlist (replace the youtube link with your own, make sure the playlist is public): youtube-dl -i –yes-playlist -x –audio-format mp3 -o “%(title)s.%(ext)s” –audio-quality 0 “https://www.youtube.com/watch?v=4V3hxNyiwaA&index=1&list=PL-VzXmWCFX7iz_hxy6Xb-JXZFs4GGKMdG” Update 2024-1-26: …

Continue reading “Download and process Chinese songs from Youtube”

Install Google Pinyin on Ubuntu

Just want to document the procedure to install Google Pinyin on Ubuntu (tested on 16.04): Command line: sudo apt-get install fcitx-googlepinyin2. System settings -> Language support -> Keyboard input method system, change to fcitx.3. Log out log in4. At top right, click the penguin icon -> Text entry setting5. Click +6. Search ‘Google’, find ‘Google …

Continue reading “Install Google Pinyin on Ubuntu”

How to conduct grid search

I have always had some doubts on grid search. I am not sure how I should conduct grid search for hyperparameter tuning for a model and report the model’s generalization performance for a scientific paper. There are three possible ways: 1) Split data into 10 folds. Repeat 10 times of the following: pick 9 folds as training data, …

Continue reading “How to conduct grid search”