Download and process Chinese songs from YouTube

This post introduces a way to download Chinese songs from a YouTube playlist and process the song titles. I use youtube-dl to download all songs from a playlist (replace the YouTube link with your own, and make sure the playlist is public): youtube-dl -i --yes-playlist -x --audio-format mp3 -o "%(title)s.%(ext)s" --audio-quality 0 "https://www.youtube.com/watch?v=4V3hxNyiwaA&index=1&list=PL-VzXmWCFX7iz_hxy6Xb-JXZFs4GGKMdG" Update 2024-1-26: …

Install Google Pinyin on Ubuntu

I just want to document the procedure for installing Google Pinyin on Ubuntu (tested on 16.04): 1. Command line: sudo apt-get install fcitx-googlepinyin 2. System settings -> Language support -> Keyboard input method system, change to fcitx. 3. Log out and log back in. 4. At top right, click the penguin icon -> Text entry setting. 5. Click +. 6. Search ‘Google’, find ‘Google …

How to conduct grid search

I have always had some doubts about grid search. I am not sure how I should conduct grid search for hyperparameter tuning and how I should report the model’s generalization performance in a scientific paper. There are three possible ways: 1) Split the data into 10 folds. Repeat the following 10 times: pick 9 folds as training data, …
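
The fold bookkeeping described in the excerpt can be sketched as below. This is a minimal illustration, not the procedure from the full post: the helper names `make_folds` and `train_heldout_rounds`, the sample count of 100, and the fixed seed are all assumptions made for the example, and what is done with the tenth fold in each round is left open because the excerpt is cut off before it says.

```python
import random


def make_folds(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and deal them into n_folds near-equal folds.

    Hypothetical helper for illustration only.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by n_folds gives folds whose sizes differ by at most one.
    return [indices[i::n_folds] for i in range(n_folds)]


def train_heldout_rounds(folds):
    """Yield (train_indices, left_out_indices) once per fold.

    In each round, all folds except one form the training data.
    """
    for k, left_out in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, left_out


folds = make_folds(n_samples=100, n_folds=10)
rounds = list(train_heldout_rounds(folds))

for round_id, (train, left_out) in enumerate(rounds):
    print(round_id, len(train), len(left_out))
```

Keeping the split separate from the round loop makes it easy to reuse the same folds for every hyperparameter setting, so that settings are compared on identical data.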

Monte Carlo Tree Search Overview

Monte Carlo Tree Search (MCTS) has been successfully applied in complex games such as Go [1]. In this post, I am going to introduce some basic concepts of MCTS and its application. MCTS is a method for finding optimal decisions in a given domain by taking random samples in the decision space and building a …

My understanding of Cross Entropy Method

The Cross Entropy (CE) method is a general Monte Carlo method originally proposed to estimate rare-event probabilities and later naturally extended to solve optimization problems. It is relevant to several of my previous posts. For example, both Bayesian Optimization [5] and the CE method can be used to solve black-box optimization problems, although Bayesian Optimization mostly works on continuous input …
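
To make the optimization use of the CE method concrete, here is a minimal sketch for a one-dimensional black-box objective. It assumes a Gaussian sampling distribution that is refit to the best ("elite") samples in each iteration; the function name `cross_entropy_minimize`, the quadratic test objective, and all parameter values are hypothetical choices for illustration, not taken from the post.

```python
import random
import statistics


def cross_entropy_minimize(objective, mean=0.0, std=5.0,
                           n_samples=50, n_elite=10, n_iters=30, seed=0):
    """Minimize a 1-D black-box objective with a basic CE loop.

    Each iteration: sample from N(mean, std), keep the n_elite samples
    with the lowest objective values, and refit mean and std to them.
    """
    rng = random.Random(seed)
    for _ in range(n_iters):
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        elite = sorted(samples, key=objective)[:n_elite]
        mean = statistics.mean(elite)
        # Small constant keeps the sampling distribution from collapsing to std 0.
        std = statistics.pstdev(elite) + 1e-6
    return mean


# Assumed toy objective with its minimum at x = 3.
best_x = cross_entropy_minimize(lambda x: (x - 3.0) ** 2)
print(best_x)
```

The objective is only ever evaluated, never differentiated, which is what makes the method applicable to black-box problems.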

My understanding of Bayesian Optimization

In this post, I am going to share my understanding of Bayesian Optimization. Bayesian Optimization is closely related to Gaussian Processes, a probabilistic model explained in detail in the textbook “Gaussian Processes for Machine Learning” [1]. In the following, a “p.” reference points to the specific page in [1] where the content comes from. First, let’s dive into the …

Combinatorial Optimization using Pointer Network (Code Walkthrough)

In this post, I am going to walk through a piece of code available online [7] which implements the idea of [1]: using a pointer network [2] to solve the travelling salesman problem. Pointer networks, in my understanding, are neural network architectures for problems where the output sequence is a permutation of the input sequence. Some background posts …

GLIBC_2.xx not found while installing tensorflow

The error “GLIBC_2.xx not found” is usually seen when someone installs tensorflow on a relatively outdated machine. Solution: install Conda. Details: 1. Download Miniconda: https://conda.io/miniconda.html 2. Install it: chmod +x Miniconda3-latest-Linux-x86_64.sh, then ./Miniconda3-latest-Linux-x86_64.sh 3. Create a new environment: conda create -n myenv python=3.6 4. Activate it: source activate myenv 5. Install tensorflow using conda (within the environment myenv) …

Revisit Support Vector Machine

This post reviews the key knowledge about support vector machines (SVM). The post is largely based on a series of lectures [1]. Suppose we have a dataset in which each point is a fixed-dimensional vector associated with a label. The most basic SVM aims to find a plane such that data …

Revisit Traveling Salesman Problem

The travelling salesman problem (TSP) goes as follows [1]: Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city and returns to the origin city? This problem statement is actually a TSP-OPT problem, where “OPT” stands for optimization. This is not a decision problem …
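
The optimization version stated above can be illustrated with a brute-force sketch that tries every route over a small distance matrix. The function names `tour_length` and `solve_tsp_opt` and the four-city matrix are made up for the example; exhaustive search is used here only because it mirrors the problem statement directly, and it is only practical for a handful of cities.

```python
from itertools import permutations


def tour_length(tour, dist):
    """Total length of a closed tour, including the return to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def solve_tsp_opt(dist):
    """Return (shortest_tour, its_length) by checking every tour from city 0."""
    n = len(dist)
    best_tour, best_len = None, float("inf")
    # Fixing city 0 as the origin avoids counting rotations of the same tour.
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        length = tour_length(tour, dist)
        if length < best_len:
            best_tour, best_len = tour, length
    return best_tour, best_len


# Assumed symmetric distances between four hypothetical cities.
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
best_tour, best_len = solve_tsp_opt(dist)
print(best_tour, best_len)  # (0, 1, 3, 2) 18
```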