Reinforcement learning overview

Here are some materials I found useful for learning Reinforcement Learning (RL). Let’s first look at the Markov Decision Process (MDP), in which you know a transition function $latex T(s,a,s')$ and a reward function $latex R(s,a,s')$. In the diagram below, the green state is called a “q state”. Some notation needs to be clarified: Dynamic programming …
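When the transition function T and reward function R are both known, one standard dynamic programming approach is value iteration. The sketch below is only an illustration under assumptions that are not in the excerpt: the discount factor `gamma`, the tolerance `tol`, and the two-state toy MDP (`toy_T`, `toy_R`) are all made up for the example. The inner sum for each action corresponds to the value of a q state.

```python
def value_iteration(states, actions, T, R, gamma=0.9, tol=1e-8):
    """Repeat the Bellman optimality backup until values stop changing.

    T(s, a, s2) is the transition probability, R(s, a, s2) the reward.
    gamma and tol are illustrative choices, not values from the post.
    """
    V = {s: 0.0 for s in states}
    while True:
        new_V = {}
        delta = 0.0
        for s in states:
            # One expected value per action: the value of the q state (s, a).
            q_values = [
                sum(T(s, a, s2) * (R(s, a, s2) + gamma * V[s2]) for s2 in states)
                for a in actions
            ]
            new_V[s] = max(q_values)
            delta = max(delta, abs(new_V[s] - V[s]))
        V = new_V
        if delta < tol:
            return V


# Hypothetical toy MDP: "stay" keeps the state, "go" switches it,
# and landing in state "B" pays a reward of 1.
def toy_T(s, a, s2):
    target = s if a == "stay" else ("B" if s == "A" else "A")
    return 1.0 if s2 == target else 0.0


def toy_R(s, a, s2):
    return 1.0 if s2 == "B" else 0.0


toy_values = value_iteration(["A", "B"], ["stay", "go"], toy_T, toy_R)
```

In this toy problem the best policy is to reach B and stay there, so both states end up with a value close to 1 / (1 - 0.9) = 10.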

Abstract Algebra

Here I introduce some basic definitions of abstract algebra: structures such as monoids, groups, rings, fields and vector spaces, as well as homomorphisms and isomorphisms. I found clear definitions of these structures in [1]: Also, the tables below show a clear comparison between several structures [2,3]:   All these structures are defined by both a set and one or more operations. Based on [4], …

When A* algorithm returns optimal solution

Dijkstra's algorithm is a well-known algorithm for finding the exact shortest distance from a source to a destination. To speed up path finding, the A* algorithm combines heuristics and known distances to find the heuristically best path towards a goal. A common A* implementation maintains an open set for discovered but not yet evaluated nodes and a closed …
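The open set / closed set structure described above can be sketched roughly as follows. This is a minimal illustration, not the post's implementation: the graph format (a dict of adjacency lists), the function name `a_star`, and the toy graph are assumptions. Because this version never reopens a node once it is in the closed set, it relies on the heuristic being consistent to return the optimal cost; with a zero heuristic it behaves like Dijkstra's algorithm.

```python
import heapq


def a_star(graph, start, goal, heuristic):
    """Return the cost of the cheapest path from start to goal, or None.

    graph: dict mapping node -> list of (neighbor, edge_cost) pairs.
    heuristic: function node -> estimated remaining cost to goal.
    """
    open_heap = [(heuristic(start), 0, start)]  # entries are (f = g + h, g, node)
    best_g = {start: 0}                         # cheapest known cost to each node
    closed = set()                              # nodes already evaluated
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g
        if node in closed:
            continue  # stale heap entry for an already evaluated node
        closed.add(node)
        for neighbor, cost in graph.get(node, []):
            new_g = g + cost
            if neighbor not in closed and new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                heapq.heappush(open_heap, (new_g + heuristic(neighbor), new_g, neighbor))
    return None


# Hypothetical toy graph; the cheapest A -> D path is A-B-C-D with cost 4.
toy_graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
toy_h = {"A": 3, "B": 3, "C": 1, "D": 0}  # a consistent heuristic for this graph

cost_with_heuristic = a_star(toy_graph, "A", "D", lambda n: toy_h[n])
cost_dijkstra_like = a_star(toy_graph, "A", "D", lambda n: 0)
```

Both calls should agree on the same cost here, since the toy heuristic never overestimates and satisfies h(u) <= cost(u, v) + h(v) on every edge.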

Install Tensorflow 0.12 with GPU support on AWS p2 instance

# for connection and file transfer
ssh -i ~/Dropbox/research/aws_noisemodel_keypair.pem ubuntu@ec2-54-164-130-227.compute-1.amazonaws.com
rsync --progress --delete -rave "ssh -i /home/czxttkl/Dropbox/research/aws_noisemodel_keypair.pem" /home/czxttkl/workspace/mymachinelearning/Python/LoLSynergyCounter ubuntu@ec2-54-164-130-227.compute-1.amazonaws.com:~/
sudo apt-get install python-pip python-dev
pip install tensorflow-gpu
# download and transfer cuda toolkit, then install
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda
# download and transfer cudnn, then install:
tar xvzf cudnn-<your-version>.tgz
sudo …

Install Python Package for User

If you do not have root privileges and want to install a Python module, you can try the following approach:
python setup.py install --user
This will install packages into subdirectories of site.USER_BASE. To check the value of site.USER_BASE, use:
import site
print(site.USER_BASE)
Reference: https://docs.python.org/2/install/
Update 2018/01/06: using pip to …
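As a small complement, the standard `site` module also exposes helper functions that report where a per-user install ends up. The variable names below are only illustrative.

```python
import site

# Base directory for per-user installs (the value behind site.USER_BASE).
user_base = site.getuserbase()

# The user-specific site-packages directory where modules are placed.
user_site = site.getusersitepackages()

print(user_base)
print(user_site)
```

If an import fails after a user install, comparing `user_site` against `sys.path` is a quick way to check whether the user site directory is enabled for the interpreter in use.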

Embedding and Heterogeneous Network Papers

Embedding methods have been widely used in graphs, networks, NLP and recommendation systems. In short, embedding methods vectorize the entities under study by mapping them into a shared latent space. Once vectorized representations of entities are learned (in either a supervised or an unsupervised fashion), a lot of knowledge discovery work can be done: clustering based on entity …

The expected times of tosses until you see first HTH or HTT

The problem comes from a very famous TED Talk:  You are flipping a fair coin. What is the expected number of tosses until “HTH” first appears? What is it for the first appearance of “HTT”? Suppose $latex N_1$ is the random variable which counts the number of flips till we get first …
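For a quick numerical check of this puzzle, the sketch below computes the expected waiting time in two ways. Neither is necessarily the derivation used in the post: `expected_tosses` uses the known pattern-overlap result for a fair coin (add 2^k for every k where the length-k prefix of the pattern equals its length-k suffix), and `simulate` is a plain Monte Carlo estimate. The function names, trial count and seed are assumptions chosen for illustration.

```python
import random


def expected_tosses(pattern):
    """Expected flips of a fair coin until `pattern` first appears.

    Uses the overlap formula: sum 2**k over every k for which the
    first k characters of the pattern equal its last k characters.
    """
    n = len(pattern)
    return sum(2 ** k for k in range(1, n + 1) if pattern[:k] == pattern[n - k:])


def simulate(pattern, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity (seeded for repeatability)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        recent, count = "", 0
        while recent != pattern:
            # Keep only the most recent len(pattern) flips.
            recent = (recent + rng.choice("HT"))[-len(pattern):]
            count += 1
        total += count
    return total / trials
```

For "HTH" the overlaps are k = 1 ("H") and k = 3, giving 2 + 8 = 10 tosses; for "HTT" only k = 3 matches, giving 8. The asymmetry comes from the fact that a failed attempt at "HTH" can leave you with a useful leading "H", which makes occurrences cluster and pushes the first one later on average.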

Leetcode 31: Next Permutation

31. Next Permutation. Total Accepted: 87393. Total Submissions: 313398. Difficulty: Medium. Contributors: Admin. Implement next permutation, which rearranges numbers into the lexicographically next greater permutation of numbers. If such an arrangement is not possible, it must rearrange it as the lowest possible order (i.e., sorted in ascending order). The replacement must be in-place, do not allocate …
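One common in-place approach to this problem is sketched below. It is not necessarily the solution given in the post, and the function name `next_permutation` is just an illustrative choice. The idea: scan from the right for the first element that is smaller than its successor, swap it with the smallest larger element to its right, then reverse the suffix.

```python
def next_permutation(nums):
    """Rearrange nums in place into the next lexicographic permutation."""
    # 1. Find the rightmost index i with nums[i] < nums[i + 1].
    i = len(nums) - 2
    while i >= 0 and nums[i] >= nums[i + 1]:
        i -= 1

    if i >= 0:
        # 2. Find the rightmost element strictly greater than nums[i] and swap.
        j = len(nums) - 1
        while nums[j] <= nums[i]:
            j -= 1
        nums[i], nums[j] = nums[j], nums[i]

    # 3. Reverse the suffix after i. If no such i exists (i == -1), the whole
    #    list is reversed, which yields the lowest (ascending) order.
    lo, hi = i + 1, len(nums) - 1
    while lo < hi:
        nums[lo], nums[hi] = nums[hi], nums[lo]
        lo += 1
        hi -= 1
```

The suffix after position i is always in non-increasing order, so reversing it gives the smallest possible tail without an extra sort and without allocating additional memory.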