My understanding in Bayesian Optimization

In this post, I am going to share my understanding of Bayesian Optimization. Bayesian Optimization is closely related to the Gaussian Process, a probabilistic model explained in detail in the textbook “Gaussian Processes for Machine Learning” [1]. In the following, “p. ” will refer to a specific page in [1] where the content comes from. First, let’s dive into the …

Combinatorial Optimization using Pointer Network (Code Walkthrough)

In this post, I am going to walk through a piece of online code [7] which implements the idea of [1]: using a pointer network [2] to solve the travelling salesman problem. Pointer networks, in my understanding, are neural network architectures for problems where the output sequences are permutations of the input sequences. Some background posts …

GLIBC_2.xx not found while installing tensorflow

The error “GLIBC_2.xx not found” is usually seen when someone installs tensorflow on a relatively outdated machine.
Solution: install Conda.
Details:
1. Download MiniConda: https://conda.io/miniconda.html
2. Install it: chmod +x Miniconda3-latest-Linux-x86_64.sh, then ./Miniconda3-latest-Linux-x86_64.sh
3. Create a new environment: conda create -n myenv python=3.6
4. Activate it: source activate myenv
5. Install tensorflow using conda (within the environment myenv) …

Revisit Support Vector Machine

This post reviews the key concepts behind the support vector machine (SVM). The post is largely based on a series of lectures [1]. Suppose we have a dataset in which each point is a multi-dimensional vector associated with a label. The most basic SVM seeks a plane such that data …

Revisit Traveling Salesman Problem

The travelling salesman problem (TSP) goes as follows [1]: Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city and returns to the origin city? This problem statement is actually the TSP-OPT problem, where “OPT” stands for optimization. This is not a decision problem …
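
The optimization version stated above can be sketched with a brute-force search. This is only a minimal illustration, not the method of the post or of [1]: the function names `tour_length` and `brute_force_tsp` and the 4-city distance matrix are hypothetical, and enumerating every permutation is only feasible for a handful of cities.

```python
from itertools import permutations


def tour_length(tour, dist):
    """Total length of a closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def brute_force_tsp(dist):
    """Try every tour that starts at city 0 and keep the shortest one.

    dist is a square matrix where dist[i][j] is the distance from city i to j.
    Fixing city 0 as the origin avoids counting rotations of the same tour.
    """
    n = len(dist)
    best_tour, best_len = None, float("inf")
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        length = tour_length(tour, dist)
        if length < best_len:
            best_tour, best_len = tour, length
    return best_tour, best_len


# Hypothetical symmetric distance matrix for 4 cities (illustration only).
dist = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

best_tour, best_len = brute_force_tsp(dist)
print(best_tour, best_len)
```

The search visits (n-1)! tours, which is why this exhaustive approach stops being practical beyond roughly ten cities.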

shadowsocks + SwitchyOmega

We’ve introduced one way to proxy internet traffic: https://czxttkl.com/?p=1265
Now we introduce another way to create a proxy, which uses shadowsocks + SwitchyOmega (a Chrome extension).
Ubuntu: in a terminal:
sudo apt-get install shadowsocks-qt5
sudo add-apt-repository ppa:hzwhuang/ss-qt5
sudo apt-get update
sudo apt-get install shadowsocks-qt5
Open the installed shadowsocks and configure a new connection.
Install the Chrome extension SwitchyOmega: https://www.dropbox.com/s/i5xmrh4wv1fivg7/SwitchyOmega_Chromium.crx?dl=0
config …

Reinforcement Learning in Web Products

Reinforcement learning (RL) is an area of machine learning concerned with optimizing a notion of cumulative reward. Although it has been applied in video game AI, robotics and control optimization for years, we have seen less of it in web products. In this post, I am going to introduce some works that apply RL in …

Questions on Guided Policy Search

I’ve been reading Prof. Sergey Levine’s paper on Guided Policy Search (GPS) [2]. I do not fully understand it yet, but I want to keep a record of my questions so that I can look back and resolve them in the future. Based on my understanding, traditional policy search (e.g., REINFORCE) maximizes the likelihood ratio of rewards. This …

Relationships between DP, RL, Prioritized Sweeping, Prioritized Experience Replay, etc

Over the last weekend, I struggled with many concepts in Reinforcement Learning (RL) and Dynamic Programming (DP). In this post, I am collecting some of my thoughts about DP, RL, Prioritized Sweeping and Prioritized Experience Replay. Please also refer to a previous post written when I first learned RL. Let’s first introduce a Markov Decision …