Tricks in deep learning neural networks

In this post, I am going to talk about my understanding of tricks for training deep neural networks. ResNet [1] Why does ResNet work? https://www.quora.com/How-does-deep-residual-learning-work Here is my answer: It is hard to know the desired depth of a deep network. If there are too many layers, errors are hard to propagate back correctly. If layers are …
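As a rough illustration of the residual idea behind ResNet (not the code from the post), here is a minimal pure-Python sketch of a block that computes x + F(x). The helper names `relu`, `linear`, and `residual_block`, and the two-layer form of F, are assumptions made for this example. When the weights of F are all zero, the block reduces to the identity, so an extra block does not have to change the signal passing through it.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]


def linear(weights, v):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def residual_block(x, w1, w2):
    """Return x + F(x), where F(x) = w2 * relu(w1 * x)."""
    residual = linear(w2, relu(linear(w1, x)))
    # The identity shortcut adds the input back onto the learned residual.
    return [xi + ri for xi, ri in zip(x, residual)]


x = [1.0, -2.0, 3.0]
zero_weights = [[0.0] * 3 for _ in range(3)]
out = residual_block(x, zero_weights, zero_weights)
```

With zero weights, `out` equals `x`: the block passes its input through unchanged.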

A3C code walkthrough

In this post, I give a brief walkthrough of the code written in https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-8-asynchronous-actor-critic-agents-a3c-c88f72a5e9f2 The code implements the A3C algorithm (Asynchronous Methods for Deep Reinforcement Learning). It follows the pseudocode given in the supplemental part of the paper: [figure: pseudocode] The structure of this model is: [figure: model structure] For details of the LSTM structure, refer to http://colah.github.io/posts/2015-08-Understanding-LSTMs/. I am using the same …

Policy Gradient

Reinforcement learning algorithms can be divided into many families. In model-free temporal difference methods like Q-learning/SARSA, we try to learn the action value Q(s, a) for any state-action pair, either by recording (“memorizing”) exact values in a table or by learning a function to approximate them. Under ε-greedy, the action to be selected at a state s will therefore be argmax_a Q(s, a), but there …
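The value-based selection rule described above can be sketched as follows. This is a minimal example, assuming the action values for the current state are already available as a plain list. The function name `epsilon_greedy` and its parameters are chosen for illustration and do not come from the post.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of action values for one state.

    With probability epsilon a uniformly random action is explored;
    otherwise the action with the highest value is exploited.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest action value (first one on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
explored_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=1.0)
```

With `epsilon=0.0` the call is purely greedy and returns the index of the largest value. With `epsilon=1.0` every call is a uniformly random action.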

Upgrade Cuda from 7.x to 8.0 on Ubuntu

1. Remove the CUDA 7.x installation (x depends on what you installed; prefix with sudo if required): rm -rf /usr/local/cuda-7.x
2. Make sure PATH and LD_LIBRARY_PATH no longer contain “/usr/local/cuda-7.x”. Possible places to look are /etc/environment, ~/.profile, /etc/bash.bashrc, /etc/profile, and ~/.bashrc. If you really don’t know where the CUDA path is added to PATH or LD_LIBRARY_PATH, check here: https://unix.stackexchange.com/questions/813/how-to-determine-where-an-environment-variable-came-from
3. cuda 8.0 …
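One way to check step 2 is a small script that lists any entries in PATH and LD_LIBRARY_PATH still pointing at the old install. This is a minimal sketch: the function name `stale_cuda_entries` and the default marker `/usr/local/cuda-7` are assumptions for illustration, and it only inspects the current environment, not the startup files where the variables are set.

```python
import os


def stale_cuda_entries(env=None,
                       variables=("PATH", "LD_LIBRARY_PATH"),
                       marker="/usr/local/cuda-7"):
    """Return {variable: [entries]} for entries that still contain marker."""
    env = os.environ if env is None else env
    stale = {}
    for name in variables:
        # Both variables are colon-separated lists on Ubuntu.
        hits = [p for p in env.get(name, "").split(":") if marker in p]
        if hits:
            stale[name] = hits
    return stale


# Example with a hypothetical environment; call stale_cuda_entries()
# with no arguments to inspect the real one.
example_env = {
    "PATH": "/usr/bin:/usr/local/cuda-7.5/bin",
    "LD_LIBRARY_PATH": "/usr/local/cuda-8.0/lib64",
}
leftovers = stale_cuda_entries(example_env)
```

An empty result means neither variable references the old directory in the current shell.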

English Grammar

“A” or “an” before an acronym or abbreviation, e.g., a FAQ or an FAQ? https://english.stackexchange.com/questions/1016/do-you-use-a-or-an-before-acronyms
When should I add “the”, and before what kind of noun? http://www.englishteachermelanie.com/grammar-when-not-to-use-the-definite-article/
Should “the” be repeated in “noun and noun” phrases? http://english.stackexchange.com/questions/9487/is-it-necessary-to-use-the-multiple-times
“noun and noun” phrase: is the following verb plural or singular? http://www.mhhe.com/mayfieldpub/tsw/nounsagr.htm
adj before “noun …

Inverse Reinforcement Learning

In my rough understanding, inverse reinforcement learning is a branch of RL research in which people try to perform state-action sequences resembling given tutor sequences. There are two famous works on inverse reinforcement learning. One is Apprenticeship Learning via Inverse Reinforcement Learning [1], and the other is Maximum Margin Planning [2]. Maximum Margin Planning In …