Algorithm – Page 11

Graph Model Recap

I wrote a post before talking about sum-product algorithm: http://maider.blog.sohu.com/307377602.html At that time, I had fresh memory on how graph model works. However, today when I read the post again, I am lost again. Therefore, I want to recap the basic motivations of graph models. This post can serve as an intro to the old post. …

Continue reading “Graph Model Recap”

Find Kth Smallest Element in Two Sorted Arrays

Problem Given K and two sorted arrays with length M and N, find the K-th smallest element among all elements in the two arrays. Code import sys class Solution(object): def findKthSmallest(self, nums1, nums2, k): “”” :type nums1: sorted List[int] :type nusm2: sorted List[int] :type k: int :rtype: int “”” if not (0 <k <= …

Continue reading “Find Kth Smallest Element in Two Sorted Arrays”

Solve Travelling Salesman Problem using DP

Claim In this post, I am using the same notation as in the video below and the wiki page (https://en.wikipedia.org/wiki/Held%E2%80%93Karp_algorithm) Our problem is given a matrix costs in which costs[i][j] is the traveling cost going from node i to node j, we are asked to return the minimum cost for traveling from node 0 and visit …

Continue reading “Solve Travelling Salesman Problem using DP”

Heapsort basic idea

Just recap the basic idea of heapsort (return an ascending order): first you build max heap right on the original array. You don’t need to create an additional O(N) space. Instead, just make sure child node index i is always less than parent node (i – 1)/2 by necessary swaps between index i and index (i-1)/2. …

Continue reading “Heapsort basic idea”

You are almost always getting significant p-value in large dataset

Recently, an exploding reproducibility research paper [1] shows that more than half of past published papers in the main psychological journals can’t be replicated to generate significance findings. This makes me dig a little further on how to conduct statistical tests. Let me show you my little experiments in `R` first. # remove(list=ls()) set.seed(19910715) n1 …

Continue reading “You are almost always getting significant p-value in large dataset”

Minimum Description Length (MDL): a scoring function to learn Bayesian Network Structure

Bayesian Network Augmented Naive-Bayes (BAN) is an improved version of Naive-Bayes, in which every attribute may have at most one other attribute as its parent other than the class variable . For example (figure from [1]): Though BAN provides a more flexible structure compared to Naive Bayes, the structure (i.e., dependence/independence relationships between attributes) needs …

Continue reading “Minimum Description Length (MDL): a scoring function to learn Bayesian Network Structure”

LDA Recap

It’s been half year since last time I studied LDA (Latent Dirichlet Allocation) model. I found learning something without writing them down can be frustrating when you start to review them after some while. Sometimes it is a bit embarrassing that you feel you didn’t learn that at all. Back to the topic, this post …

Continue reading “LDA Recap”

Collection of statistical hypothesis tests

This post is a collection of hypothesis test methodologies. The full collection is listed here: http://www.biostathandbook.com/testchoice.html. My post just goes over several hypothesis tests that are relevant to my research. One-way ANOVA: http://www.biostathandbook.com/onewayanova.html If you have one measurement variable and one nominal variable, and the nominal variable separates subjects into multiple groups, you want to test …

Continue reading “Collection of statistical hypothesis tests”

Factor Analysis and PCA

I’ve been scratching my head for a day (06/24/2015) but it ended up that I am still baffled by the concept of Factor Analysis. So first of all, let me list what I’ve learned so far today. PCA and Factor Analysis can be deemed similar from the perspective that they are both dimension reduction technologies. …

Continue reading “Factor Analysis and PCA”

Learn LSTM in RNN

Long Short Term Memory is claimed to be capable of predicting time series when there are long time lags of unknown sizes between important events. However, as to 2015.6, not many clear tutorials have been found on the Internet. I am going to list a collection of materials I came across. Probably I will write …

Continue reading “Learn LSTM in RNN”