Follow-Up
A curious colleague pointed out the strongly asserted internet claim that R For Loops and R Apply functions ought to be close peers when it comes to performance. Some example sources: here and here. However, my previous post constructed an experiment which suggested a huge performance gain using an apply()
function over for(i in …)
. I noticed my R For Loop was constructed to mimic the C++ function I had written, and not the R apply()
function. I re-ran the test as before, but this time including two additional alternatives. The new code and results are given below.
New Code
Refer to the initial post for the missing details. Below is a new R For Loop which more closely mimics the behavior of our apply()
procedure.
na_or_0_row_ind = function(r){all(is.na(r)) | all(r == 0)}
stm <- Sys.time()
forloop_rows_v2 <- vector(mode = "logical", length = nrow(df))
for(rr in 1:nrow(df)){forloop_rows_v2[rr] <- na_or_0_row_ind(df[rr,])}
forloop_rows_v2 <- which(forloop_rows_v2)
forloop_time_v2 <- as.numeric(difftime(Sys.time(),stm, units = "secs"))
Below we use vapply()
which requires a transpose operation in our underlying use case. If we could avoid the transpose, we observed slightly improved results over apply()
. Yet with the transpose we had slower performance.
stm <- Sys.time()
df2 <- data.frame(t(df))
vapply_rows <- which(vapply(df2, na_or_0_row_ind, logical(1)))
vapply_time <- as.numeric(difftime(Sys.time(),stm, units = "secs"))
New Results
The below times and time ratios were obtained using Microsoft R Open 3.4.3 (which is the R version tied to the use case). Once again, I only did one run per treatment because….
Rows | Cpp Time | Apply Time | Vapply Time | R Loop (Apply) Time | R Loop (C++) Time |
---|---|---|---|---|---|
6,000 | 0.004 | 0.068 | 0.097 | 2.067 | 3.519 |
60,000 | 0.011 | 0.502 | 0.761 | 27.264 | 36.536 |
600,000 | 0.088 | 3.317 | 5.953 | 913.684 | 1516.065 |
6,000,000 | 0.897 | 38.057 | 60.063 | DNR | DNR |
60,000,000 | 9.946 | 434.300 | 836.376 | DNR | DNR |
Rows | Cpp Ratio | Apply Ratio | Vapply Ratio | R Loop (Apply) Ratio | R Loop (C++) Ratio |
---|---|---|---|---|---|
6,000 | 1.0 | 17.0 | 24.2 | 516.0 | 878.2 |
60,000 | 1.0 | 45.6 | 69.2 | 2476.5 | 3318.8 |
600,000 | 1.0 | 37.6 | 67.6 | 10371.4 | 17209.2 |
6,000,000 | 1.0 | 42.4 | 67.0 | DNR | DNR |
60,000,000 | 1.0 | 43.7 | 84.1 | DNR | DNR |
Summary
The R For Loop did see a performance improvement when coded more in-line with the apply()
operation, yet it was still noticeably slower. Interestingly, our C++ and R Apply operations tend to scale almost linearly, but the For Loops became unstable after 60K rows. I’m curious if this generalizes to other people’s hardware?