R sapply

Background

As I remembered it, `sapply` is a function that takes a vector, applies a function to each element, and returns another vector as the result. Today I am sharing a “bizarre” behavior of it, followed by the reason behind that behavior.

 

Details

Let’s first look at the following line, which made me scratch my head:

sapply(1:30, function(x) {if(x>15) 1})  

The result is as follows:

[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
NULL

[[6]]
NULL

[[7]]
NULL

[[8]]
NULL

[[9]]
NULL

[[10]]
NULL

[[11]]
NULL

[[12]]
NULL

[[13]]
NULL

[[14]]
NULL

[[15]]
NULL

[[16]]
[1] 1

[[17]]
[1] 1

[[18]]
[1] 1

[[19]]
[1] 1

[[20]]
[1] 1

[[21]]
[1] 1

[[22]]
[1] 1

[[23]]
[1] 1

[[24]]
[1] 1

[[25]]
[1] 1

[[26]]
[1] 1

[[27]]
[1] 1

[[28]]
[1] 1

[[29]]
[1] 1

[[30]]
[1] 1

The result is a list, rather than a vector. 

Then I tried a tidier way to write that anonymous function:

sapply(1:30, function(x) {ifelse(x>15, 1, NA)})

 This time the result looks as expected:

 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

My explanation for the discrepancy is that an `if` statement without an `else` branch evaluates to `NULL` when its condition is false. So in the first snippet the anonymous function returns `NULL` for the first 15 elements and `1` for the remaining 15. `sapply` first collects all 30 results in a list and then tries to simplify that list to a vector, which it does only when every result has the same length. Here the lengths are a mix of 0 (for `NULL`) and 1, so the simplification is skipped and the list is returned as is. In the second snippet, `NA` is a vector of length one, so `sapply` collects 30 length-one results, 15 `NA`s and 15 `1`s, and simplifies them to a vector as I expected.
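To check this explanation, here is a minimal sketch that inspects the two calls above. The names `f`, `res_if` and `res_ifelse` are mine, introduced only for illustration.

```r
# A one-armed `if` evaluates to NULL when its condition is FALSE.
f <- function(x) { if (x > 15) 1 }
is.null(f(1))    # TRUE
f(16)            # 1

# sapply() simplifies only when all results have the same length.
res_if <- sapply(1:30, f)
is.list(res_if)            # TRUE
unique(lengths(res_if))    # 0 1 (NULL has length 0, the value 1 has length 1)

# With NA as the fallback, every result has length 1, so simplification succeeds.
res_ifelse <- sapply(1:30, function(x) ifelse(x > 15, 1, NA))
is.list(res_ifelse)        # FALSE
is.numeric(res_ifelse)     # TRUE
```

Calling `lengths()` on the result is a quick way to see why `sapply` did or did not simplify.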

What happens if I replace `NA` with `NULL` in the second snippet?

 sapply(1:30, function(x) {ifelse(x>15, 1, NULL)})

It turns out that the call stops with an error in RStudio:

Error in ifelse(x > 15, 1, NULL) : replacement has length zero
In addition: Warning message:
In rep(no, length.out = length(ans)) :
  'x' is NULL so the result will be NULL

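My explanation is that `NA` is a length-one placeholder while `NULL` has length zero. `ifelse` fills each position of its result from `yes` or `no`, recycling them to the length of the test. A zero-length `NULL` cannot be recycled (hence the warning from `rep`), which leaves nothing to assign to that position and triggers the error.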
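The difference between the two values can be seen directly. This is a small sketch; the names `stretched` and `msg` are mine, and the warning is suppressed only to keep the output short.

```r
# NULL has no elements, while NA is a single (logical) missing value.
length(NULL)    # 0
length(NA)      # 1

# rep() cannot stretch a zero-length object to length 1; it warns and returns NULL.
stretched <- suppressWarnings(rep(NULL, length.out = 1))
is.null(stretched)    # TRUE

# NA is coerced to the type of the vector it is combined with.
class(c(1, NA))       # "numeric"
class(c("a", NA))     # "character"

# Capture the error message from ifelse() instead of stopping the script.
msg <- tryCatch(
  suppressWarnings(ifelse(FALSE, 1, NULL)),
  error = function(e) conditionMessage(e)
)
msg
```

If a missing value is what you want in the `else` case, `NA` is the value to return; `NULL` means "no value at all", which a vector position cannot hold.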

 

 
