R sapply

Background

As I remembered it, `sapply` is a function that consumes a vector and returns another vector as the result. Today I am sharing a “bizarre” behavior of it. Later I will explain the reason behind this behavior.

 

Details

Let’s first look at the following line, which left me scratching my head:

sapply(1:30, function(x) { if (x > 15) 1 })


How to delete large files from Git history?

Background

If large, useless files are accidentally added to the Git history, your repository can become very large. GitHub and Bitbucket each impose limits on repository size (see https://help.github.com/articles/what-is-my-disk-quota/ and https://confluence.atlassian.com/pages/viewpage.action;jsessionid=735235A3CE151FB6D4C518F3971FD524.node2?pageId=273877699). So how do you find the large files in the Git history and delete them?
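As a rough illustration of the "find" half of that question, here is a minimal sketch that lists the largest blobs reachable from any ref. It is not necessarily the approach the full post takes. The function name `list_large_blobs` and its default of 10 entries are assumptions made for this example; the git commands themselves (`git rev-list --objects --all` and `git cat-file --batch-check`) are standard.

```shell
#!/bin/sh
# Sketch: list the largest blobs stored anywhere in a repository's history.
# list_large_blobs is a hypothetical helper name, not a git command.
# Run it from inside a git working tree.
list_large_blobs() {
    count="${1:-10}"   # how many entries to show (assumed default: 10)

    # 1. Enumerate every object reachable from any ref, along with its path.
    # 2. Ask git for each object's type and size; %(rest) carries the path.
    # 3. Keep blobs only, then sort numerically by size, largest first.
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        sed -n 's/^blob //p' |
        sort -rn |
        head -n "$count"
}
```

Each output line has the form `size-in-bytes SHA-1 path`. Calling `list_large_blobs 5` inside a repository shows the five largest blobs, including files that were deleted in later commits but still live in the history. Removing them then requires rewriting history, for example with `git filter-branch` or the separate `git filter-repo` tool.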


Everything about Git

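Note: in this article, parameters in square brackets are optional, and parameters in angle brackets are required. A SHA-1 value can be the SHA-1 of a commit, HEAD[~|^], or a branch name, because git records the SHA-1 values behind both HEAD and branch names.

General rules

Download a project from a remote repository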

git clone

 

Create your own branch based on the master branch and switch to it:

git checkout -b [origin/master]
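The two commands above appear to have lost their angle-bracket parameters, so here is a minimal sketch of how they are typically combined. The helper name `clone_and_branch` and its argument names are hypothetical, chosen only for this example. The start point defaults to `origin/master`, matching the optional parameter shown above.

```shell
#!/bin/sh
# Sketch: clone a repository, then create and switch to a new branch.
# clone_and_branch and its argument names are hypothetical, for illustration.
clone_and_branch() {
    repo_url="$1"                      # remote (or local path) to clone from
    target_dir="$2"                    # directory to clone into
    new_branch="$3"                    # name of the branch to create
    start_point="${4:-origin/master}"  # optional, as in the command above

    git clone --quiet "$repo_url" "$target_dir" &&
        git -C "$target_dir" checkout --quiet -b "$new_branch" "$start_point"
}
```

For example, `clone_and_branch https://example.com/team/project.git project my-feature` (a made-up URL) would leave you in `project` on a new branch `my-feature` that starts at `origin/master`.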


How is Logistic Regression designed?

Background

Logistic Regression is one of the most familiar models in ML. Based on the data X, it gives the probability of generating a binary label:

image1

(In the formula above, X has only one feature.)
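The formula itself was an image and is not reproduced here. For reference, the single-feature logistic regression model is commonly written as below. The symbols \beta_0 (intercept) and \beta_1 (slope) are assumed notation and may differ from the original image.

```latex
P(Y = 1 \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}},
\qquad
P(Y = 0 \mid X) = 1 - P(Y = 1 \mid X)
```

Equivalently, the log-odds are linear in X: \log\frac{P(Y = 1 \mid X)}{1 - P(Y = 1 \mid X)} = \beta_0 + \beta_1 X.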

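Suppose you were born before Logistic Regression was invented but after Normal Linear Regression was invented. You are now asked to design a model that predicts a binary label Y, so that the model can derive Y from the observed data X. How would you design it?

(Note: from here on, we assume X has only one feature.)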
