A hands-on guide to FAIR and structured ecological data

a product of the 'FAIR Data for Digital Twins' (FDFDT) project

Guide

Link

You can find the guide here: https://lter-life.github.io/FDFDT-Manual/

Description

This repository contains the R code and the enhanced datasets of the practical work in the FAIR Data for Digital Twins project that underlies "A hands-on guide to FAIR and structured ecological data".

For the code that builds the website of the guide, go to the FDFDT-Manual repository.

Authors

Cherine Jantzen / NIOO-KNAW (ORCID: 0009-0006-0723-2682)
Stefan Vriend / NIOO-KNAW (ORCID: 0000-0002-9006-5988)

Feedback

We are happy to receive your feedback, suggestions or questions on the guide or the code, scripts and datasets! For that, please open a new issue here.

Example datasets and code

The R codes include general functions used for all example datasets and dataset-specific scripts.

Note: Not all scripts are needed to reproduce the examples in the guide.

General functions

Folder: R

Files:

assign_uuid.R
generates a universally unique identifier (UUID) for a dataset that can be used as packageID in the EML file; saves the UUID and dataset name in a look-up table and checks whether the entered dataset has a UUID already before generating a new one
create-meta-xml-of-DwCA.R
creates the meta.xml file of a Darwin Core Archive by looking up the IRI for each Darwin Core term (column name) in each file (core & extensions) of the Darwin Core Archive and saving them in the required format as a XML file
retrieveData-API-Dataverse.R
retrieves data through the API of DataverseNL, DataverseNL (demo version) and the DANS Data Station; requires the DOI of the target dataset and a personal API token (see below)

Generate an API token for DataverseNL/DANS Data Station

To generate your personal API token for DataverseNL, visit the Log In page. If you are affiliated with a partner institution, you can directly log in with your institution credentials to create an account. Otherwise click on "Sign up for a Dataverse account." and fill in the form. After you logged in, click on your user profile name (top right) and select "API Token" in the drop-down menu and create your personal API token by clicking on "Create API token".

To generate your personal API token for the DANS Data Station, visit this Log In page and choose a log in option (GitHub, Google, Institutional Account). After you logged in, click on your user profile name (top right) and select "API Token" in the drop-down menu and create your personal API token by clicking on "Create API token".

Bud burst data

Description: Long-term data on the leaf phenology of different tree species across the Netherlands collected by the NIOO-KNAW.

Folder: R/budburst

Files:

01_budburst_retrieveData-SQLServer.R
for NIOO-KNAW internal use only; retrieves the raw data files from the local database and creates README for the dataset to prepare upload to DataverseNL
02_budburst_map-to-DarwinCore.R
maps the bud burst data retrieved from Dataverse to Darwin Core and creates core file, extension files and meta.xml file of the Darwin Core Archive (requires DataverseNL API token to run)
03_budburst_create-EML-xml.R
creates the EML file for the bud burst data
04_budburst_zip-files-to-DwC-Archive.R
collects all files of the Darwin Core Archive and saves them in a ZIP folder

Cricket data

Description: Experimental data on a feeding experiment on the European field cricket (Gryllus campestris) in the Netherlands (Vogels et al., 2021).

Folder: R/crickets

Files:

crickets-map-to-DarwinCore.R
maps the cricket data retrieved from the DANS Data Station to Darwin Core and creates core file, extension files and meta.xml file of the Darwin Core Archive (requires DANS Data Station API token to run)
crickets_create_EML-xml.R
creates the EML file for cricket data

CLUE data

Description: Vegetation data on a long-term grassland biodiversity field experiment in the Netherlands. This data is not (yet) available online and only accessible internally.

Note: Consists of two datasets for two different experiments (i.e., exp1 & exp2). As the datasets are very similar, they are treated as one in the guide.

Folder: R/CLUE

Files:

CLUE-exp1_map-to-DwC.R
maps the CLUE data for experiment 1 to Darwin Core and creates core file, extension files and meta.xml file of the Darwin Core Archive (and removes these files again, as the data is not yet published)
CLUE-exp1_create-EML.R
creates the EML file for the CLUE data of experiment 1
CLUE-exp2_map-to-DwC.R
maps the CLUE data for experiment 2 to Darwin Core and creates core file, extension files and meta.xml file of the Darwin Core Archive (and removes these files again, as the data is not yet published)
CLUE-exp2_create-EML.R
creates the EML file for the CLUE data of experiment 2
retrieve-taxonInformation-from-GBIF.R
retrieves taxonomic information from GBIF; both datasets include a large number of plant species with several misspellings in the scientific names or synonyms are used, which makes the automatic retrieval of taxonomic information difficult. This function corrects for these difficulties and checks in the Global Name Resolver for the most commonly used author information.

Name		Name	Last commit message	Last commit date
Latest commit History 77 Commits
R		R
data		data
project		project
readme-templates		readme-templates
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

A hands-on guide to FAIR and structured ecological data

Table of Contents

Guide

Link

Description

Authors

Feedback

Example datasets and code

General functions

Generate an API token for DataverseNL/DANS Data Station

Bud burst data

Cricket data

CLUE data

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

A hands-on guide to FAIR and structured ecological data

Table of Contents

Guide

Link

Description

Authors

Feedback

Example datasets and code

General functions

Generate an API token for DataverseNL/DANS Data Station

Bud burst data

Cricket data

CLUE data

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages