Hi there, I noticed that the performance of collapse::fsd() seems to have regressed, on a little benchmark I've been running, between versions 2.0.15 and 2.0.16. I tried a few combinations of R and collapse going back to 4.2.1 and 1.8.9 respectively, and I think the issue stems from collapse. This is the script I've been testing with:
install.packages(c("bench", "waldo", "remotes"))
remotes::install_version("data.table", "1.15.4")
#remotes::install_version("collapse", "2.0.15")
remotes::install_version("collapse", "2.0.16")
library(data.table); setDTthreads(1)
library(collapse); set_collapse(nthreads = 1)
n = 3e5
set.seed(123)
val_dt = data.table(g = rep(1:n, each = 6),
                    x = rt(6 * n, 3))
val_dt
dt_f1 = \(val_dt) val_dt[,.(x = sd(x)), by = g]
cl_f1 = \(val_dt) val_dt |> gby(g) |> fsd()
cl_f2 = \(val_dt) val_dt |> gby(g) |> smr(x = fsd(x))
check_fun = \(x,y) length(waldo::compare(x,y, tolerance = 1e-8)) == 0
res = bench::mark(data.table = dt_f1(val_dt),
                  collapse = cl_f1(val_dt),
                  collapse2 = cl_f2(val_dt),
                  check = check_fun)
sessionInfo()
res |>
  slt(expression:mem_alloc, n_itr)
On the rocker/r-ver:4.4.2 image with [email protected], that script produces:
# A tibble: 3 × 6
  expression      min   median `itr/sec` mem_alloc n_itr
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt> <int>
1 data.table  34.84ms   37.3ms      22.6    53.9MB    12
2 collapse     9.01ms   12.4ms      81.3    24.3MB    41
3 collapse2     8.9ms   12.1ms      75.0    24.2MB    38
But the same script on the same image with [email protected] yields:
# A tibble: 3 × 6
  expression      min   median `itr/sec` mem_alloc n_itr
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt> <int>
1 data.table   35.1ms   36.6ms      22.9    53.9MB    12
2 collapse     23.8ms   27.4ms      38.0    24.3MB    19
3 collapse2    44.4ms   48.2ms      21.0    24.2MB    11
So the plain fsd() call's median went from roughly 12ms to 27ms, and the smr(x = fsd(x)) version from roughly 12ms to 48ms. A separate test with fmean() showed no difference there, but I haven't tried any of the other fast statistical functions beyond those two.
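In case it helps narrow things down, here's a minimal sketch (run after the script above) of how I'd compare a few of the other fast functions on the same grouped data. It assumes fsd() shares its grouped code path with fvar() (my guess, not verified) and uses a precomputed GRP() object so only the statistic itself is being timed:

g_obj = GRP(val_dt, ~ g)  # precomputed grouping object, reused across calls
bench::mark(fmean   = fmean(val_dt$x, g = g_obj),
            fmedian = fmedian(val_dt$x, g = g_obj),
            fvar    = fvar(val_dt$x, g = g_obj),
            fsd     = fsd(val_dt$x, g = g_obj),
            check = FALSE)  # the statistics differ by design, so skip equality checks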
Just thought I'd flag the issue since it doesn't seem to have been noticed yet, and I didn't see anything that looked related in the NEWS for 2.0.16. I wish I could offer more to pin down the root cause, but I think it's beyond my capabilities.