czxttkl – Page 12

Inverse Reinforcement Learning

In my rough understanding, inverse reinforcement learning (IRL) is a branch of RL research in which an agent tries to produce state-action sequences that resemble given expert (tutor) demonstrations. Two well-known works on inverse reinforcement learning are Apprenticeship Learning via Inverse Reinforcement Learning [1] and Maximum Margin Planning [2]. Maximum Margin Planning In …

Reinforcement learning overview

Here are some materials I found useful for learning Reinforcement Learning (RL). Let’s first look at the Markov Decision Process (MDP), in which you know a transition function $latex T(s,a,s')$ and a reward function $latex R(s,a,s')$. In the diagram below, the green state is called a “q-state”. Some notations that need to be clarified: Dynamic programming …
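
As a rough illustration of how a known $latex T(s,a,s')$ and $latex R(s,a,s')$ can be used, here is a minimal value iteration sketch. The two-state MDP, the state and action names, and the discount factor 0.5 are all made up for this example and do not come from the post; `q_value` computes the value of a q-state (a state-action pair).

```python
def q_value(T, R, V, s, a, gamma):
    """Value of the q-state (s, a): expected reward plus discounted next-state value."""
    return sum(p * (R[(s, a, s2)] + gamma * V[s2]) for s2, p in T[(s, a)].items())


def value_iteration(states, actions, T, R, gamma=0.5, tol=1e-10):
    """Repeat the Bellman optimality update until the values stop changing."""
    V = {s: 0.0 for s in states}
    while True:
        new_V = {s: max(q_value(T, R, V, s, a, gamma) for a in actions) for s in states}
        if max(abs(new_V[s] - V[s]) for s in states) < tol:
            return new_V
        V = new_V


# Hypothetical toy MDP (an assumption for illustration only).
states = ["A", "B"]
actions = ["stay", "go"]
# T[(s, a)] maps each next state s' to its probability T(s, a, s').
T = {
    ("A", "stay"): {"A": 1.0},
    ("A", "go"): {"B": 1.0},
    ("B", "stay"): {"B": 1.0},
    ("B", "go"): {"A": 1.0},
}
# R[(s, a, s')] is the reward R(s, a, s').
R = {
    ("A", "stay", "A"): 0.0,
    ("A", "go", "B"): 1.0,
    ("B", "stay", "B"): 2.0,
    ("B", "go", "A"): 0.0,
}

V = value_iteration(states, actions, T, R)
print(V)  # approximately {'A': 3.0, 'B': 4.0}
```

In this toy problem the best policy is to go from A to B and then stay in B, which gives V(B) = 2 / (1 - 0.5) = 4 and V(A) = 1 + 0.5 * 4 = 3.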

Abstract Algebra

This post introduces some basic definitions of abstract algebra: structures such as monoids, groups, rings, fields and vector spaces, as well as homomorphisms and isomorphisms. I found clear definitions of these structures in [1]: Also, the tables below show a clear comparison between several structures [2,3]: All these structures are defined with both a set and operation(s). Based on [4], …
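
To make the "set plus operation" idea concrete, here is a small brute-force sketch that checks the standard monoid and group axioms on a finite set. The helper names (`find_identity`, `is_monoid`, `is_group`) are hypothetical and not taken from the references above.

```python
from itertools import product


def find_identity(elements, op):
    """Return an element e with e*x == x*e == x for all x, or None if there is none."""
    for e in elements:
        if all(op(e, x) == x and op(x, e) == x for x in elements):
            return e
    return None


def is_monoid(elements, op):
    """Monoid: closed under op, associative, and has an identity element."""
    closed = all(op(a, b) in elements for a, b in product(elements, repeat=2))
    associative = all(
        op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(elements, repeat=3)
    )
    return closed and associative and find_identity(elements, op) is not None


def is_group(elements, op):
    """Group: a monoid in which every element has a two-sided inverse."""
    if not is_monoid(elements, op):
        return False
    e = find_identity(elements, op)
    return all(
        any(op(a, b) == e and op(b, a) == e for b in elements) for a in elements
    )


z4 = {0, 1, 2, 3}
add_mod4 = lambda a, b: (a + b) % 4
mul_mod4 = lambda a, b: (a * b) % 4

print(is_group(z4, add_mod4))   # True: integers mod 4 under addition
print(is_monoid(z4, mul_mod4))  # True: identity is 1
print(is_group(z4, mul_mod4))   # False: 0 and 2 have no multiplicative inverse
```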

My Pycharm keymap setting backup

This is a backup of my keymap settings, which I find super convenient for my usage style: https://www.dropbox.com/s/m14ozngs59jcd18/pycharm_keymap_settings.jar?dl=0

When A* algorithm returns optimal solution

Dijkstra’s algorithm is a well-known algorithm for finding the exact shortest distance from a source to a destination. To speed up path finding, the A* algorithm combines a heuristic with known distances to find the heuristically best path towards a goal. A common A* implementation maintains an open set for discovered but not yet evaluated nodes and a closed …
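
The open-set / closed-set structure can be sketched as below. This is a minimal version written for illustration, not necessarily the implementation the post discusses; the toy graph and heuristic tables are made up. It relies on the standard result that, with a closed set that never reopens nodes, A* is guaranteed to return an optimal path when the heuristic is consistent, and may return a worse path when the heuristic overestimates.

```python
import heapq


def a_star(graph, heuristic, start, goal):
    """graph: {node: {neighbor: edge_cost}}; heuristic: {node: estimated cost to goal}.

    Returns (cost, path) for the first time the goal is popped, or None if unreachable.
    """
    # Open set as a priority queue ordered by f = g + h.
    open_heap = [(heuristic[start], 0, start, [start])]
    closed = set()  # nodes that have already been expanded
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return g, path
        if node in closed:
            continue
        closed.add(node)
        for neighbor, cost in graph[node].items():
            if neighbor not in closed:
                g2 = g + cost
                heapq.heappush(
                    open_heap, (g2 + heuristic[neighbor], g2, neighbor, path + [neighbor])
                )
    return None


# Hypothetical toy graph: the cheapest route is S -> A -> G with cost 2.
graph = {"S": {"A": 1, "B": 2}, "A": {"G": 1}, "B": {"G": 5}, "G": {}}

# A zero heuristic is trivially consistent (A* then behaves like Dijkstra).
zero_h = {"S": 0, "A": 0, "B": 0, "G": 0}
# This heuristic overestimates the remaining cost from A (true cost is 1).
bad_h = {"S": 0, "A": 10, "B": 0, "G": 0}

print(a_star(graph, zero_h, "S", "G"))  # (2, ['S', 'A', 'G'])
print(a_star(graph, bad_h, "S", "G"))   # (7, ['S', 'B', 'G'])
```

With `bad_h`, node A looks so expensive that the goal is reached through B first, which is why the returned path is not the cheapest one.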

Install Tensorflow 0.12 with GPU support on AWS p2 instance

    # for connection and file transfer
    ssh -i ~/Dropbox/research/aws_noisemodel_keypair.pem ubuntu@ec2-54-164-130-227.compute-1.amazonaws.com
    rsync --progress --delete -rave "ssh -i /home/czxttkl/Dropbox/research/aws_noisemodel_keypair.pem" /home/czxttkl/workspace/mymachinelearning/Python/LoLSynergyCounter ubuntu@ec2-54-164-130-227.compute-1.amazonaws.com:~/
    sudo apt-get install python-pip python-dev
    pip install tensorflow-gpu

Download and transfer the CUDA toolkit, then install:

    sudo dpkg -i cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64.deb
    sudo apt-get update
    sudo apt-get install cuda

Download and transfer cuDNN, then install:

    tar xvzf cudnn-<your-version>.tgz
    sudo …

gmail filter OR condition

To create a filter rule with an OR operation, connect the conditions with the keyword “OR” (it must be in upper case). For example, a hypothetical rule such as from:alice@example.com OR from:bob@example.com matches mail from either sender.

Install Python Package for User

If you do not have root privileges and want to install a Python module, you can try the following approach:

    python setup.py install --user

This will install packages into subdirectories of site.USER_BASE. To check the value of site.USER_BASE, use:

    import site
    print(site.USER_BASE)

Reference: https://docs.python.org/2/install/

Update 2018/01/06: using pip to …
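
As a small complement, the sketch below prints both the user base and the user site-packages directory using the `site` module's accessor functions, which compute the values even when the attributes have not been initialized yet. The variable names are only for illustration.

```python
import site

# getuserbase() returns the value of site.USER_BASE, computing it if needed.
user_base = site.getuserbase()
# getusersitepackages() returns the per-user site-packages directory (site.USER_SITE).
user_site = site.getusersitepackages()

print("user base:", user_base)
print("user site-packages:", user_site)
```

The exact paths depend on the platform and on the PYTHONUSERBASE environment variable, so no output is shown here.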

Embedding and Heterogeneous Network Papers

Embedding methods have been widely used in graph and network analysis, NLP, and recommendation systems. In short, embedding methods vectorize the entities under study by mapping them into a shared latent space. Once vectorized representations of entities are learned (in either a supervised or an unsupervised fashion), a lot of knowledge discovery work can be done: clustering based on entity …

The expected times of tosses until you see first HTH or HTT

The problem comes from a very famous TED Talk: You are flipping a fair coin. What is the expected number of tosses until “HTH” first appears? What about until “HTT” first appears? Suppose $latex N_1$ is the random variable which counts the number of flips till we get first …
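
One way to check the answers numerically is sketched below. It is not necessarily the derivation used in the post: it models the number of matched pattern characters as a Markov chain state and solves the linear system E[i] = 1 + ½E[next(i, H)] + ½E[next(i, T)] exactly with fractions. The function names are made up for this example.

```python
from fractions import Fraction


def next_state(pattern, matched, flip):
    """Length of the longest prefix of `pattern` that is a suffix of the flips so far."""
    seen = pattern[:matched] + flip
    for k in range(min(len(seen), len(pattern)), 0, -1):
        if seen.endswith(pattern[:k]):
            return k
    return 0


def expected_flips(pattern):
    """Expected number of fair-coin flips until `pattern` first appears."""
    n = len(pattern)
    # Augmented matrix for unknowns E[0..n-1]; E[n] = 0 (pattern complete).
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for i in range(n):
        A[i][i] += 1
        A[i][n] = Fraction(1)  # the "+1" for the flip we are about to make
        for flip in "HT":
            j = next_state(pattern, i, flip)
            if j < n:
                A[i][j] -= Fraction(1, 2)
    # Gauss-Jordan elimination with exact arithmetic.
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [x / pivot_value for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return A[0][n]


print(expected_flips("HTH"))  # 10
print(expected_flips("HTT"))  # 8
```

The difference comes from overlap: after “HT”, a wrong flip for HTH (a T) throws away all progress, whereas a wrong flip for HTT (an H) still counts as the first character of a new attempt.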
