BFGS and L-BFGS materials

A good explanation of BFGS and L-BFGS can be found in this textbook: https://www.dropbox.com/s/qavnl2hr170njbd/NumericalOptimization2ndedJNocedalSWright%282006%29.pdf?dl=0 In my own words: gradient descent and its many variants have proven to learn parameters well in practical unconstrained optimization problems. However, it may converge too slowly or too erratically depending on how you set the learning rate. Remember the optimum in unconstrained …
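As a quick sketch of that trade-off (my own illustration, not taken from the post), the snippet below runs fixed-step gradient descent next to SciPy's L-BFGS-B on the Rosenbrock test function; the learning rate 1e-3 and the iteration count are arbitrary choices made only to show the sensitivity.

```python
# Minimal sketch (my own illustration): fixed-step gradient descent
# vs. SciPy's L-BFGS-B on the Rosenbrock function.
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

x0 = np.array([-1.2, 1.0])

# Plain gradient descent: progress depends heavily on the learning rate.
x = x0.copy()
lr = 1e-3          # too large -> oscillation/divergence; too small -> crawling
for _ in range(5000):
    x = x - lr * rosen_der(x)
print("gradient descent:", x, rosen(x))

# L-BFGS builds a curvature approximation from recent gradients,
# so there is no manual learning rate to tune.
res = minimize(rosen, x0, jac=rosen_der, method="L-BFGS-B")
print("L-BFGS-B:", res.x, res.fun)
```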

Recommendation System Review

In traditional collaborative filtering, each user is represented by an O(N)-length vector, where N is the number of items. The user-item matrix therefore has size O(MN), where M is the number of users. For a user we want to recommend items to, we scan through the user-item matrix and find the k most similar users. Based on the item vectors …
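To make that scan concrete, here is a minimal sketch (my own toy example, not from the post) of user-based collaborative filtering with cosine similarity; the rating matrix, k, and top_n are made up for illustration.

```python
# Minimal sketch (my own illustration): user-based collaborative filtering.
# Rows = M users, columns = N items; entries are ratings (0 = unrated).
import numpy as np

ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 4, 4],
], dtype=float)

def recommend(user, k=2, top_n=2):
    # Cosine similarity between the target user's vector and every other user.
    norms = np.linalg.norm(ratings, axis=1) * np.linalg.norm(ratings[user])
    sims = ratings @ ratings[user] / np.maximum(norms, 1e-12)
    sims[user] = -np.inf                  # exclude the user themselves
    neighbors = np.argsort(sims)[-k:]     # k most similar users
    # Score items by the neighbors' average rating; hide already-rated items.
    scores = ratings[neighbors].mean(axis=0)
    scores[ratings[user] > 0] = -np.inf
    return np.argsort(scores)[::-1][:top_n]

print(recommend(user=0))   # item indices recommended for user 0
```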

My Lempel-Ziv Compressor

The Lempel-Ziv algorithm is a widely known compression algorithm. Its compression rate provably approaches the per-symbol entropy of the sequence being compressed, asymptotically, as the sequence grows long enough, i.e., $latex \frac{\text{compressed bits}}{n} \to H(X)$ as $latex n \to \infty$ for a sequence $latex X_1 \cdots X_n$, where $latex H(X)$ is the per-symbol …
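Below is a minimal LZ78-style sketch (my own toy version, not the compressor described in the post): it parses the input into phrases and emits (dictionary index, next symbol) pairs, and it is the slow growth in the number of phrases per symbol that drives the rate toward the entropy.

```python
# Minimal LZ78-style sketch (my own toy version, not the post's compressor).
# Each output pair is (index of longest previously seen prefix, next symbol).

def lz78_compress(data: str):
    dictionary = {"": 0}        # phrase -> index; index 0 is the empty phrase
    phrase, output = "", []
    for symbol in data:
        if phrase + symbol in dictionary:
            phrase += symbol            # keep extending the current phrase
        else:
            output.append((dictionary[phrase], symbol))
            dictionary[phrase + symbol] = len(dictionary)
            phrase = ""
    if phrase:                          # flush a trailing, already-seen phrase
        output.append((dictionary[phrase[:-1]], phrase[-1]))
    return output

def lz78_decompress(pairs):
    phrases, out = [""], []
    for idx, symbol in pairs:
        phrase = phrases[idx] + symbol
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)

pairs = lz78_compress("abababababa")
print(pairs)
print(lz78_decompress(pairs) == "abababababa")
```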