1 change: 0 additions & 1 deletion .gitignore
@@ -38,7 +38,6 @@ vignettes/*.pdf
# virtual environments
renv*
rv*
rproject.toml

# ignore duckdb database files and parquet
*db
84 changes: 84 additions & 0 deletions AGENTS.md
@@ -0,0 +1,84 @@
# Development instructions

If you are a human, take these instructions as guidance.
If you are an agent, follow these instructions.
Ask for clarifications where specifications are unclear or edge cases crop up.

This is an R package that is used for land use / land cover change analyses and simulations.
It builds on the approaches of [lulcc](https://github.com/simonmoulds/lulcc), [clumpy](https://github.com/mmyrte/clumpy), and [Dinamica EGO](https://dinamicaego.com/).

## Project layout

- `R/parquet_db.R`, `R/parquet_db_utils.R`: generic parquet-backed DB layer
- `R/evoland_db.R`, `R/evoland_db_util.R`, `R/evoland_db_views.R`: domain DB and views (inherits parquet_db); register new methods using `create_method_binding` in `evoland_db` constructor
- `R/*_t.R`: table type definitions (`as_*_t` constructors)
- `R/util*.R`: internal helpers
- `R/*.R` (remaining): domain logic (allocation, filtering, modelling)
- `src/`: Rcpp / C++ code
- `inst/tinytest/`: tests
- `inst/dinamica_models/`: Dinamica EGO models
- `inst/*.sql`: SQL views; use glue for string interpolation
- `data-raw/`: raw data and test fixture generation scripts
- `man/`: documentation generated by roxygen2
- `vignettes/*.qmd`: vignettes and how-tos

## Environment setup

Use [rv](https://a2-ai.github.io/rv-docs/). Run `rv sync` to install all dependencies (see `rproject.toml`).
Recompile and attach the package using `pkgload::load_all()`.

## Conventions and Style

Prefer base R solutions over external dependencies where they are equally readable.
Avoid new dependencies.
Avoid packages from the tidyverse, as their APIs tend to be unstable.
Avoid niche packages that are seldom maintained; if the functionality is simple, implement it as a non-exported utility function instead.

Prefer exact double bracket `[["name"]]` list subsetting over `$name`, which does partial matching.
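
To illustrate why exact subsetting is preferred, here is a minimal base R sketch. The list `settings` and its element name are hypothetical and not taken from the package.

```r
# Hypothetical settings list, for illustration only.
settings <- list(transition_rate = 0.25)

# `$` partially matches list names: "trans" silently resolves to
# "transition_rate".
partial_hit <- settings$trans

# `[[` matches exactly by default, so the abbreviated name yields NULL.
exact_miss <- settings[["trans"]]

# The full name works with exact matching.
exact_hit <- settings[["transition_rate"]]
```

With `$`, a mistyped or abbreviated name can return a value without any warning, whereas `[[` makes the mismatch visible.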

Use `air` as autoformatter (`./air.toml`).
Use `lintr` for linting (`./.lintr`).
If running in an IDE, rely on language server notes.
Prefer explanatory names over comments in your code.

Let errors propagate naturally; only add early guards when the default error message would confuse the user.
Use the `stopifnot("error message" = condition)` pattern for assertions.
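
A short sketch of the named `stopifnot()` pattern (available from R 4.0.0). The function `scale_rates` and its arguments are hypothetical, chosen only to show how the name of a failing condition becomes the error message.

```r
# Hypothetical helper, for illustration only.
scale_rates <- function(rates, factor) {
  stopifnot(
    "`rates` must be numeric" = is.numeric(rates),
    "`factor` must be a single positive number" =
      length(factor) == 1L && is.numeric(factor) && factor > 0
  )
  rates * factor
}

scaled <- scale_rates(c(0.1, 0.2), 2)

# The name of the first failing condition is used as the error message.
msg <- tryCatch(scale_rates("a", 2), error = conditionMessage)
```

In real code, guards like these are only worth adding where the error that R would raise anyway is less helpful than the custom message.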

## Testing

Use tinytest, see `inst/tinytest`.
Non-exported functions need to be tested as `evoland:::private_function`, because tests may also run against the installed package, where internal functions are not directly accessible.
Be parsimonious when constructing tests.

- Test the full package using `R -e "tinytest::build_install_test()"`.
- Test individual files using `R -e "pkgload::load_all(); tinytest::run_test_file('inst/tinytest/somefile.R')"`.

## Documentation

Use markdown roxygen for function and class documentation.
Use `R -e "roxygen2::roxygenize()"` to synchronize roxygen comments and `.Rd` files.
Write tutorials and how-tos as Quarto files in `vignettes/*.qmd`.

## Rcpp components

C++ code is in the `src/` folder.
This code interfaces with R using Rcpp.
Build binaries using `pkgbuild::build()`.
Clean build objects using `pkgbuild::clean_dll()`.
Prefer `Rcpp` types.
Only write Rcpp-free headers when the code should also compile as a standalone program.

## Database

- Storage is done in parquet files written and read via an in-memory DuckDB instance.
- `R/parquet_db.R` specifies an R6 class that provides database operations to write and retrieve `data.table` objects to parquet files.
- `parquet_db_t` is a subclass of the `data.table` S3 class; see `R/parquet_db_utils.R`.
- `parquet_db_t` objects can hold attributes used to define:
  - key columns, i.e. uniqueness columns
  - hive partitioning columns
  - map columns, i.e. R list columns of named lists translated to DuckDB MAP columns
- Domain specific database elements are in `R/evoland_db.R`; `evoland_db` inherits from `parquet_db`.
- The schema for this database is (for now) distributed across the class definitions: all `R/*_t.R` files contain `as_*_t` class constructors using `as_parquet_db_t`.
- Because the parquet files may be written to from external tools, they should be considered part of the API. Schema changes should be avoided as much as possible.
- Ad-hoc views are suffixed `_v` and generally exposed as active bindings, or as methods if they are parameterized.
- Every new method must be added in `R/evoland_db.R` and _not_ with `$set`. This is because the roxygen documentation routine for R6 objects relies on all documentation being available in a single file.
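
The storage approach described above (parquet files written and read through an in-memory DuckDB instance, with hive partitioning) can be sketched roughly as follows. This is not the `parquet_db` implementation: the table, its column names, and the target directory are hypothetical; a plain `data.frame` stands in for `data.table`; and `sprintf` stands in for glue to keep the sketch dependency-light. It assumes the `DBI` and `duckdb` packages are installed.

```r
library(DBI)

# Hypothetical toy table; the column names are illustrative only.
toy <- data.frame(
  id_coord = 1:4,
  id_period = c(1L, 1L, 2L, 2L),
  id_lulc = c(3L, 5L, 3L, 4L)
)

con <- dbConnect(duckdb::duckdb()) # in-memory DuckDB instance
duckdb::duckdb_register(con, "toy", toy)

# Fresh target directory; forward slashes keep the SQL string valid on Windows.
target_dir <- gsub("\\\\", "/", tempfile("parquet_demo_"))

# Write one subdirectory per value of the partitioning column.
dbExecute(con, sprintf(
  "COPY toy TO '%s' (FORMAT PARQUET, PARTITION_BY (id_period))",
  target_dir
))

# Read all partitions back; the partition column is recovered from the paths.
round_trip <- dbGetQuery(con, sprintf(
  "SELECT id_coord, id_period, id_lulc
   FROM read_parquet('%s/*/*.parquet', hive_partitioning = true)
   ORDER BY id_coord",
  target_dir
))

dbDisconnect(con, shutdown = TRUE)
```

Because the partition value lives in the directory name (for example `id_period=1`), external tools that understand hive partitioning can read or write the same files, which is one reason to treat the on-disk layout as part of the API.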
6 changes: 3 additions & 3 deletions DESCRIPTION
@@ -19,13 +19,16 @@ Imports:
DBI,
duckdb (>= 1.5.2),
glue,
mlr3,
mlr3filters,
qs2,
R6,
Rcpp,
terra
Suggests:
base64enc,
butcher,
mlr3viz,
pROC,
processx,
quarto,
@@ -41,7 +44,6 @@ Collate:
'trans_models_t.R'
'alloc_dinamica.R'
'coords_t.R'
'covariance_filter.R'
'parquet_db_utils.R'
'parquet_db.R'
'evoland_db.R'
@@ -61,8 +63,6 @@ Collate:
'reporting_t.R'
'runs_t.R'
'trans_meta_t.R'
'trans_models_glm.R'
'trans_models_rf.R'
'trans_pot_t.R'
'trans_preds_t.R'
'trans_rates_t.R'
5 changes: 0 additions & 5 deletions NAMESPACE
@@ -57,7 +57,6 @@ export(as_trans_preds_t)
export(as_trans_rates_t)
export(calc_fuzzy_similarity)
export(calc_transition_similarity)
export(covariance_filter)
export(create_change_map)
export(create_coords_t_square)
export(create_intrv_meta_t)
@@ -71,10 +70,6 @@ export(evoland_db)
export(exec_dinamica)
export(extract_using_coords_t)
export(extrapolate_trans_rates)
export(fit_glm)
export(fit_ranger)
export(gof_glm)
export(gof_ranger)
export(grrf_filter)
export(parquet_db)
export(print_rowwise_yaml)
178 changes: 0 additions & 178 deletions R/covariance_filter.R

This file was deleted.
