Use `rsync` with `-n` (dry run) so that it lists the files that differ without actually syncing them: `rsync -n -avcr --delete local_folder/ username@domain:remote_folder/`
Author Archives: czxttkl
Overview for Sequential Data Learning
Hidden Markov Model: Keep in mind the three questions people usually ask about a Hidden Markov Model: 1. What is the probability of an observed sequence? 2. What is the most likely series of states given a specific observed sequence? 3. Given a set of observations, what are the values of the state …
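The first question (the probability of an observed sequence) is usually answered with the forward algorithm. Below is a minimal sketch, not taken from the post itself: the function name `forward_probability` and the toy two-state model are illustrative assumptions, and observations are assumed to be integer symbol indices in a non-empty sequence.

```python
def forward_probability(observations, start_prob, trans_prob, emit_prob):
    """Return P(observations) under a discrete HMM via the forward algorithm.

    start_prob[s]      : probability of starting in state s
    trans_prob[p][s]   : probability of moving from state p to state s
    emit_prob[s][o]    : probability of emitting symbol o in state s
    Assumes `observations` is a non-empty list of symbol indices.
    """
    n_states = len(start_prob)
    # alpha[s] = P(o_1..o_t, state_t = s); initialise with the first symbol.
    alpha = [start_prob[s] * emit_prob[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        # Sum over all predecessor states, then weight by the emission.
        alpha = [
            emit_prob[s][obs] * sum(alpha[p] * trans_prob[p][s] for p in range(n_states))
            for s in range(n_states)
        ]
    # Marginalise over the final state.
    return sum(alpha)


# Toy model: 2 hidden states, 3 observable symbols (numbers are made up).
start_prob = [0.6, 0.4]
trans_prob = [[0.7, 0.3], [0.4, 0.6]]
emit_prob = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]

if __name__ == "__main__":
    print(forward_probability([0, 1, 2], start_prob, trans_prob, emit_prob))
```

The recursion costs O(T·S²) for T observations and S states, instead of enumerating all S^T state paths.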
Right way to put test codes in a Python project
I’ve struggled for a long time with where to put test files in a Python project. Ideally, I think it is cleanest to create a folder called “test” with all test files in it. However, the test files nested in the test folder need to import modules from the parent folder. It is troublesome to import Python module …
Jupyter Parallelism Tutorial
In this post, I introduce my favorite way to run cells in a Jupyter notebook in parallel. 1. Initialize a cluster from the command line or with Python's `Popen` (in the example below, I create a cluster with 2 workers): `from subprocess import Popen` `p = Popen(['ipcluster', 'start', '-n', '2'])` 2. Then, programmatically set …
Convert Latex To Word
The best way to convert a tex file to docx is to use Microsoft Office Word 2013 or later to open the PDF file generated from the .tex file. Word will convert the PDF to docx with reasonably good formatting.
Add permanent key bindings for Jupyter Notebook
This post shows how to add customized permanent key bindings for Jupyter Notebook. 1. Check the location of your Jupyter config folder using the command: `sudo ~/.local/bin/jupyter --config-dir` I am running Ubuntu. By default, the config folder is `/home/your_user_name/.jupyter` 2. Create a folder `custom` in the config folder. 3. Create a file `custom.js` in the …
Adaptive Regularization of Weight Vectors — Mathematical derivation
In this post I present my understanding of the paper Adaptive Regularization of Weight Vectors: http://papers.nips.cc/paper/3848-adaptive-regularization-of-weight-vectors.pdf The paper aims to address the drawbacks of a previous algorithm called confidence weighted (CW) learning by introducing the algorithm Adaptive Regularization Of Weights (AROW). CW and AROW are both online learning algorithms, meaning updates happen after observing each …
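To make the "update after each example" idea concrete, here is a minimal sketch of a single Adaptive Regularization of Weights step for binary labels, following the update rule as I understand it from the linked paper. The function name `arow_update`, the list-based matrix representation, and the default `r=1.0` are illustrative assumptions, not the post's own code.

```python
def arow_update(w, sigma, x, y, r=1.0):
    """One online update for example (x, y) with y in {-1, +1}.

    w     : mean weight vector (list of floats)
    sigma : symmetric covariance matrix (list of lists)
    r     : regularisation hyperparameter (assumed default, r > 0)
    Returns the new (w, sigma); inputs are not modified.
    """
    n = len(w)
    # sigma_x = Sigma @ x
    sigma_x = [sum(sigma[i][j] * x[j] for j in range(n)) for i in range(n)]
    margin = sum(w[i] * x[i] for i in range(n))            # w . x
    confidence = sum(x[i] * sigma_x[i] for i in range(n))  # x^T Sigma x
    if margin * y >= 1.0:
        return w, sigma  # hinge loss is zero: leave the model unchanged
    beta = 1.0 / (confidence + r)
    alpha = (1.0 - y * margin) * beta
    # Mean moves along Sigma x; covariance shrinks along the same direction.
    new_w = [w[i] + alpha * y * sigma_x[i] for i in range(n)]
    new_sigma = [
        [sigma[i][j] - beta * sigma_x[i] * sigma_x[j] for j in range(n)]
        for i in range(n)
    ]
    return new_w, new_sigma


if __name__ == "__main__":
    w0 = [0.0, 0.0]
    sigma0 = [[1.0, 0.0], [0.0, 1.0]]
    print(arow_update(w0, sigma0, [1.0, 0.0], +1))
```

Because the covariance is symmetric, the rank-one term Σ x xᵀ Σ is computed as the outer product of `sigma_x` with itself.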
Math Derivation for Bayesian Probabilistic Matrix Factorization Model
In this post I try to derive the mathematical formulas that appear in the paper Bayesian Probabilistic Matrix Factorization using Markov Chain Monte Carlo. We will use exactly the same notation as in the paper. The part that I am interested in is the inference part (Section 3.3). Sample $U_i$: In Gibbs sampling, …
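For orientation, the conditional distribution from which the Gibbs sampler draws each user vector can be restated as follows. This is my reading of the paper's result, in its notation: $R$ is the rating matrix, $V_j$ the movie feature vectors, $\Theta_U = \{\mu_U, \Lambda_U\}$ the user hyperparameters, $\alpha$ the observation precision, and $I_{ij}$ the indicator that user $i$ rated movie $j$.

```latex
p(U_i \mid R, V, \Theta_U, \alpha)
  = \mathcal{N}\!\left(U_i \,\middle|\, \mu_i^{*}, \left[\Lambda_i^{*}\right]^{-1}\right),
\qquad
\Lambda_i^{*} = \Lambda_U + \alpha \sum_{j=1}^{M} I_{ij}\, V_j V_j^{\top},
\qquad
\mu_i^{*} = \left[\Lambda_i^{*}\right]^{-1}
  \left( \alpha \sum_{j=1}^{M} I_{ij}\, V_j R_{ij} + \Lambda_U \mu_U \right).
```

The Gaussian form follows because the likelihood is Gaussian in $U_i$ for fixed $V$ and the prior on $U_i$ is Gaussian, so the posterior precision is the sum of the two precisions.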
Estimate best K for K-Means in parallel
The gap statistic is often used to determine the best number of clusters. A local implementation of the gap statistic is available here: https://github.com/echen/gap-statistic. It is often desirable to parallelize such a tedious job to speed it up. I implemented a parallelized version based on that source code: library(plyr) library(ggplot2) # Calculate log(sum_i(within-cluster_i sum of squares around cluster_i …
How to make multi-lined cells in table in Latex?
Use package `makecell`. http://tex.stackexchange.com/a/176780/76654