4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: collapse
Title: Advanced and Fast Data Transformation
-Version: 2.1.5.9000
-Date: 2025-12-02
+Version: 2.1.6
+Date: 2025-12-21
Authors@R: c(
person("Sebastian", "Krantz", role = c("aut", "cre"),
email = "[email protected]",
6 changes: 5 additions & 1 deletion NEWS.md
@@ -1,8 +1,10 @@
-# collapse 2.1.5.9000
+# collapse 2.1.6

* The repo has moved to `fastverse/collapse` and the website to [fastverse.org/collapse](https://fastverse.org/collapse/)---for better visibility and maintenance. Appropriate redirects from the old repo/site have been implemented.
Selected people now have access to the repo through the organization account and may respond to issues or submit fixes.

* Added new AI-generated interactive/chattable [DeepWiki documentation](https://deepwiki.com/fastverse/collapse).

* *collapse* now treats `-0` and `0` as the same value in hash functions (`funique()`, `group()`, `fmatch()`, `fndistinct()`, `fmode()`, and all higher-level derivatives). This is implemented by adding a value of `0.0` to double values before hashing them, and has a small (~3%) performance penalty when hashing doubles. It is implemented in sync with an [equivalent change in *Rcpp*](https://github.com/RcppCore/Rcpp/issues/1340). Thanks @mayer79 for reporting and helping with benchmarking the performance implications (#648).

* Fixed a bug in `pivot(..., how = "wider", FUN = "sum")` (using internal sum function) when columns to aggregate were integer typed. Thanks @ummel (#803).
@@ -11,6 +13,8 @@

* Faster installation from source thanks to the `#include <Rcpp/Lighter>` option in *Rcpp* which loads only part of the header files. Thanks @eddelbuettel for the hint.

* Consistency with internal updates to *data.table*. Thanks @aitap (https://github.com/fastverse/collapse/pull/809, https://github.com/Rdatatable/data.table/issues/7497).
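
The signed-zero change described in the NEWS entry above can be illustrated with a minimal C sketch. It assumes IEEE 754 doubles, the default round-to-nearest mode, and no `-ffast-math`-style optimizations. The names `double_bits`, `toy_hash_raw`, and `toy_hash_normalized` are hypothetical, and the multiplicative hash is a toy stand-in, not the hash function *collapse* actually uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bytes of a double as a 64-bit integer (memcpy avoids aliasing issues). */
static uint64_t double_bits(double x) {
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Toy multiplicative hash over the raw bit pattern (hypothetical, for illustration only). */
static uint64_t toy_hash_raw(double x) {
    return (double_bits(x) * UINT64_C(0x9E3779B97F4A7C15)) >> 32;
}

/* Same toy hash, but 0.0 is added first: under IEEE 754 round-to-nearest,
   -0.0 + 0.0 yields +0.0, so both zeros share one bit pattern. */
static uint64_t toy_hash_normalized(double x) {
    return toy_hash_raw(x + 0.0);
}

/* Small self-check of the behaviour sketched above. */
void signed_zero_demo(void) {
    assert(double_bits(-0.0) != double_bits(0.0));                  /* sign bit differs */
    assert(toy_hash_raw(-0.0) != toy_hash_raw(0.0));                /* raw hashes differ */
    assert(toy_hash_normalized(-0.0) == toy_hash_normalized(0.0));  /* normalized hashes agree */
    assert(toy_hash_normalized(1.5) == toy_hash_raw(1.5));          /* other values are unchanged */
}
```

Adding `0.0` leaves every nonzero finite value unchanged, so only the two zeros are merged; the single extra addition per element is consistent with the small overhead mentioned in the entry.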

# collapse 2.1.5

* Fixed small bugs/strange behavior in `collap()` when `g` was passed externally (as columns or *GRP* object). E.g., in `collap(x, g, w = ~ col)`, where `g` is a *GRP* object, the weights were aggregated twice: once using `FUN` (incorrect) and once using `wFUN`.
1 change: 1 addition & 0 deletions README.md
@@ -14,6 +14,7 @@
[![dependencies](https://tinyverse.netlify.app/badge/collapse)](https://CRAN.R-project.org/package=collapse)
[![DOI](https://zenodo.org/badge/172910283.svg)](https://zenodo.org/badge/latestdoi/172910283)
[![arXiv](https://img.shields.io/badge/arXiv-2403.05038-0969DA.svg)](https://arxiv.org/abs/2403.05038)
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/fastverse/collapse)
<!-- badges: end -->

*collapse* is a large C/C++-based package for data transformation and statistical computing in R. It aims to:
4 changes: 2 additions & 2 deletions src/data.table_subset.c
@@ -104,10 +104,10 @@ static SEXP shallow(SEXP dt, SEXP cols, R_len_t n)
// otherwise (if the SET were first) the 100 tl is assigned to length.
SET_LEN(newnames,l);
SET_TRULEN(newnames,n);
-SET_GROWABLE_BIT(newnames);
+SET_GROWBL_BIT(newnames);
SET_LEN(newdt,l);
SET_TRULEN(newdt,n);
-SET_GROWABLE_BIT(newdt);
+SET_GROWBL_BIT(newdt);
setselfref(newdt);
UNPROTECT(protecti);
return(newdt);
2 changes: 1 addition & 1 deletion src/data.table_utils.c
@@ -38,7 +38,7 @@ SEXP setnames(SEXP x, SEXP nam) {
memcpy(SEXPPTR(newnam), SEXPPTR_RO(nam), l*sizeof(SEXP));
SET_LEN(newnam, l);
SET_TRULEN(newnam, n);
-SET_GROWABLE_BIT(newnam);
+SET_GROWBL_BIT(newnam);
setAttrib(x, R_NamesSymbol, newnam);
setselfref(x);
UNPROTECT(1);
6 changes: 6 additions & 0 deletions src/internal/R_defn.h
@@ -195,4 +195,10 @@ typedef union { VECTOR_SEXPREC s; double align; } SEXPREC_ALIGN;
// return DATAPTR(x);
// }

/* Growable vector support */
#undef GROW_MSK
#define GROW_MSK ((unsigned short)(1<<5))
#undef SET_GROWBL_BIT
#define SET_GROWBL_BIT(x) (((x)->sxpinfo.gp) |= GROW_MSK)

#endif // End of R_DEFINITIONS_H guard
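
The `GROW_MSK` and `SET_GROWBL_BIT` definitions in `src/internal/R_defn.h` set bit 5 of the `gp` field in an object's header, after the callers have set the visible length and the allocated length. The following is a minimal standalone sketch of that pattern. `mock_vec` is a hypothetical, simplified stand-in (not R's real header layout), and `IS_GROWBL` and `mark_overallocated` are illustrative helpers that do not appear in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a vector header; R's real layout differs. */
typedef struct {
    struct { unsigned int gp : 16; } sxpinfo;  /* general-purpose flag bits */
    ptrdiff_t length;                          /* number of elements currently visible */
    ptrdiff_t truelength;                      /* number of elements actually allocated */
} mock_vec;

/* Same definitions as in the diff. */
#define GROW_MSK ((unsigned short)(1<<5))
#define SET_GROWBL_BIT(x) (((x)->sxpinfo.gp) |= GROW_MSK)

/* Hypothetical read-side helper, added for illustration only. */
#define IS_GROWBL(x) ((((x)->sxpinfo.gp) & GROW_MSK) != 0)

/* Mirrors the call order in shallow() and setnames(): length, true length, then the flag. */
static void mark_overallocated(mock_vec *v, ptrdiff_t l, ptrdiff_t n) {
    v->length = l;
    v->truelength = n;
    SET_GROWBL_BIT(v);
}

/* Small self-check of the behaviour sketched above. */
void growable_demo(void) {
    mock_vec v = { {0}, 0, 0 };
    assert(!IS_GROWBL(&v));          /* flag starts cleared */
    mark_overallocated(&v, 3, 100);
    assert(IS_GROWBL(&v));           /* bit 5 is now set */
    assert(v.sxpinfo.gp == 32);      /* 1 << 5 */
    assert(v.length == 3 && v.truelength == 100);
}
```

The `#undef` lines before each definition in the header guard against a redefinition warning in case another included header already defines the same names.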
18 changes: 14 additions & 4 deletions vignettes/collapse_documentation.Rmd
@@ -16,7 +16,7 @@ vignette: >
1. To facilitate complex data transformation, exploration and computing tasks in R.
2. To help make R code fast, flexible, parsimonious and programmer friendly.

-Documentation comes in 6 different forms:
+Documentation comes in 7 different forms:

## Built-In Structured Documentation

@@ -30,17 +30,27 @@ The package page under `help("collapse-package")` provides some general informat

Reading `help("collapse-package")` and `help("collapse-documentation")` is the most comprehensive way to get acquainted with the package. `help("collapse-documentation")` is always the most up-to-date resource.

## DeepWiki

[DeepWiki](https://deepwiki.com/) is an AI-powered platform designed to automatically generate structured, interactive documentation for software repositories, primarily on GitHub. Developed by [Cognition AI](https://cognition.ai/)—the same laboratory behind the autonomous AI engineer [Devin](https://devin.ai/)—it serves as a dynamic, "Wikipedia-like" encyclopedia for codebases.

While not more comprehensive or accurate than the structured documentation, it is a good way to learn about the internal structure of *collapse* and to ask a chatbot (Devin) questions about *collapse* or have it write code using the package.

You can access the *collapse* DeepWiki [here](https://deepwiki.com/fastverse/collapse).

## Cheatsheet

An up-to-date (v2.0) [cheatsheet](<https://raw.githubusercontent.com/fastverse/collapse/master/misc/collapse%20cheat%20sheet/collapse_cheat_sheet.pdf>) compactly summarizes the package.

## Article on arXiv

-An [article](https://arxiv.org/abs/2403.05038) on *collapse* is forthcoming at [Journal of Statistical Software](https://www.jstatsoft.org/).
+An [article](https://arxiv.org/abs/2403.05038) on *collapse* is forthcoming at [Journal of Statistical Software](https://www.jstatsoft.org/) in early 2026.

## Presentations and Slides

## useR 2022 Presentation and Slides
- I have presented *collapse* (v1.8) in some level of detail at useR 2022. A 2h video recording that provides a quite comprehensive introduction is available [here](<https://www.youtube.com/watch?v=OwWT1-dSEts>). The corresponding slides are available [here](<https://raw.githubusercontent.com/fastverse/collapse/master/misc/useR2022%20presentation/collapse_useR2022_final.pdf>).

I have presented collapse (v1.8) in some level of detail at useR 2022. A 2h video recording that provides a quite comprehensive introduction is available [here](<https://www.youtube.com/watch?v=OwWT1-dSEts>). The corresponding slides are available [here](<https://raw.githubusercontent.com/fastverse/collapse/master/misc/useR2022%20presentation/collapse_useR2022_final.pdf>).
- I have recently presented *collapse* (v2.1) and the *fastverse* at a workshop on "[Speeding Up Empirical Research: Tools and Techniques for Fast Computing](https://www.bportugal.pt/en/evento/workshop-speeding-empirical-research-tools-and-techniques-fast-computing-bplim)" organized by the Bank of Portugal in December 2025. My 45-minute talk focused on two advanced applications in international trade and spatial network analysis/package development. You can find the materials (slides and recording) [here](https://github.com/BPLIM/Workshops/tree/master/BPLIM2025).

## Vignettes
