LTER-LIFE/FDFDT

A hands-on guide to FAIR and structured ecological data

a product of the 'FAIR Data for Digital Twins' (FDFDT) project

Guide

Link

You can find the guide here: https://lter-life.github.io/FDFDT-Manual/

Description

This repository contains the R code and the enhanced datasets of the practical work in the FAIR Data for Digital Twins project that underlies "A hands-on guide to FAIR and structured ecological data".

For the code that builds the website of the guide, go to the FDFDT-Manual repository.

Authors

Feedback

We welcome feedback, suggestions and questions about the guide, code, scripts and datasets. Please open a new issue in this repository.

Example datasets and code

The R code consists of general functions used for all example datasets and of dataset-specific scripts.

Note: Not all scripts are needed to reproduce the examples in the guide.

General functions

Folder: R

Files:

  • assign_uuid.R
    generates a universally unique identifier (UUID) for a dataset that can be used as packageID in the EML file; saves the UUID and dataset name in a look-up table and checks whether the entered dataset has a UUID already before generating a new one
  • create-meta-xml-of-DwCA.R
    creates the meta.xml file of a Darwin Core Archive by looking up the IRI of each Darwin Core term (column name) in each file (core and extensions) of the archive and saving them in the required format as an XML file
  • retrieveData-API-Dataverse.R
    retrieves data through the API of DataverseNL, DataverseNL (demo version) and the DANS Data Station; requires the DOI of the target dataset and a personal API token (see below)
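
The look-up behaviour described for assign_uuid.R can be illustrated with a minimal base-R sketch. It is not the repository's implementation: the function names, the CSV layout of the look-up table and the base-R UUID generator are all assumptions made for illustration (a dedicated package such as `uuid` would normally be preferable for generating identifiers).

```r
# Hypothetical helper: build a random (version 4) UUID with base R only.
generate_uuid_v4 <- function() {
  hex <- sample(c(0:9, letters[1:6]), 32, replace = TRUE)
  hex[13] <- "4"                               # version nibble
  hex[17] <- sample(c("8", "9", "a", "b"), 1)  # variant nibble
  paste(
    paste(hex[1:8], collapse = ""),
    paste(hex[9:12], collapse = ""),
    paste(hex[13:16], collapse = ""),
    paste(hex[17:20], collapse = ""),
    paste(hex[21:32], collapse = ""),
    sep = "-"
  )
}

# Hypothetical sketch: return the stored UUID of a dataset, or create,
# store and return a new one if the dataset is not in the look-up table yet.
assign_uuid_sketch <- function(dataset_name, lookup_file) {
  if (file.exists(lookup_file)) {
    lookup <- read.csv(lookup_file, colClasses = "character")
  } else {
    lookup <- data.frame(dataset = character(), uuid = character(),
                         stringsAsFactors = FALSE)
  }
  existing <- lookup$uuid[lookup$dataset == dataset_name]
  if (length(existing) > 0) {
    return(existing[1])  # dataset already has a UUID: reuse it
  }
  new_uuid <- generate_uuid_v4()
  lookup <- rbind(lookup,
                  data.frame(dataset = dataset_name, uuid = new_uuid,
                             stringsAsFactors = FALSE))
  write.csv(lookup, lookup_file, row.names = FALSE)
  new_uuid
}
```

Calling `assign_uuid_sketch("budburst", "uuid-lookup.csv")` twice returns the same identifier both times, so the packageID in the EML file stays stable across runs. The file name here is only an example.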

Generate an API token for DataverseNL/DANS Data Station

To generate your personal API token for DataverseNL, visit the Log In page. If you are affiliated with a partner institution, you can log in directly with your institutional credentials to create an account. Otherwise, click "Sign up for a Dataverse account." and fill in the form. After logging in, click your user profile name (top right), select "API Token" in the drop-down menu, and click "Create API token" to create your personal API token.

To generate your personal API token for the DANS Data Station, visit this Log In page and choose a login option (GitHub, Google, Institutional Account). After logging in, click your user profile name (top right), select "API Token" in the drop-down menu, and click "Create API token" to create your personal API token.
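
Once a token exists, it can be passed to a Dataverse installation in the `X-Dataverse-key` request header. The sketch below shows one way to request a dataset's metadata by DOI with base R. It is not the code in retrieveData-API-Dataverse.R: the function names, the environment variable `DATAVERSE_API_TOKEN` and the base URL in the usage comment are assumptions, and the `headers` argument of `download.file()` requires R 3.6.0 or later.

```r
# Hypothetical helper: build the native-API URL for a dataset from its DOI.
build_dataset_url <- function(base_url, doi) {
  base_url <- sub("/+$", "", base_url)  # drop trailing slashes
  doi <- sub("^doi:", "", doi)          # accept DOIs with or without prefix
  paste0(base_url, "/api/datasets/:persistentId/?persistentId=doi:", doi)
}

# Hypothetical sketch: download the dataset metadata (JSON) to a file.
# The token is read from an environment variable so that it never has to
# be written into a script or committed to version control.
fetch_dataset_metadata <- function(base_url, doi,
                                   token = Sys.getenv("DATAVERSE_API_TOKEN"),
                                   destfile = tempfile(fileext = ".json")) {
  if (!nzchar(token)) {
    stop("No API token found; set DATAVERSE_API_TOKEN first.")
  }
  utils::download.file(build_dataset_url(base_url, doi), destfile,
                       headers = c("X-Dataverse-key" = token), quiet = TRUE)
  destfile
}

# Usage (not run; the DOI is a placeholder, the base URL an assumption):
# json_file <- fetch_dataset_metadata("https://dataverse.nl", "10.xxxx/XXXXXX")
```

Keeping the token in an environment variable (for example via `.Renviron`) avoids sharing it by accident together with the scripts.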

Bud burst data

Description: Long-term data on the leaf phenology of different tree species across the Netherlands collected by the NIOO-KNAW.

Folder: R/budburst

Files:

  • 01_budburst_retrieveData-SQLServer.R
    for NIOO-KNAW internal use only; retrieves the raw data files from the local database and creates a README for the dataset in preparation for upload to DataverseNL
  • 02_budburst_map-to-DarwinCore.R
    maps the bud burst data retrieved from Dataverse to Darwin Core and creates the core file, extension files and meta.xml file of the Darwin Core Archive (requires a DataverseNL API token to run)
  • 03_budburst_create-EML-xml.R
    creates the EML file for the bud burst data
  • 04_budburst_zip-files-to-DwC-Archive.R
    collects all files of the Darwin Core Archive and saves them in a ZIP archive
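
The meta.xml file mentioned above lists, for every data file in the archive, which column holds which Darwin Core term. A minimal sketch of how such a descriptor could be assembled for a core file is shown below. It is a simplification and not the code in create-meta-xml-of-DwCA.R: the function names are hypothetical, the metadata file is assumed to be called eml.xml, extensions are left out, and every column is assumed to be a term in the `http://rs.tdwg.org/dwc/terms/` namespace, whereas the actual function looks up the IRI of each term.

```r
# Hypothetical helper: one <field> element per column, with zero-based index.
# Assumes every column name is a term in the given namespace.
dwc_field_lines <- function(columns,
                            namespace = "http://rs.tdwg.org/dwc/terms/") {
  sprintf('    <field index="%d" term="%s%s"/>',
          seq_along(columns) - 1L, namespace, columns)
}

# Hypothetical sketch: assemble the lines of a meta.xml with a core file only.
# The first column (index 0) is assumed to hold the record identifier.
build_meta_xml <- function(core_file, core_columns, row_type) {
  c(
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<archive xmlns="http://rs.tdwg.org/dwc/text/" metadata="eml.xml">',
    paste0('  <core encoding="UTF-8" fieldsTerminatedBy="," ',
           'linesTerminatedBy="\\n" fieldsEnclosedBy="&quot;" ',
           'ignoreHeaderLines="1" rowType="', row_type, '">'),
    '    <files>',
    paste0('      <location>', core_file, '</location>'),
    '    </files>',
    '    <id index="0"/>',
    dwc_field_lines(core_columns),
    '  </core>',
    '</archive>'
  )
}

# Usage: write the descriptor for an event core with two columns.
meta <- build_meta_xml("event.csv", c("eventID", "eventDate"),
                       "http://rs.tdwg.org/dwc/terms/Event")
writeLines(meta, file.path(tempdir(), "meta.xml"))
```

Extension files would be described in the same way, in `<extension>` elements that use `<coreid>` instead of `<id>` to point back to the core records.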

Cricket data

Description: Data from a feeding experiment on the European field cricket (Gryllus campestris) in the Netherlands (Vogels et al., 2021).

Folder: R/crickets

Files:

  • crickets-map-to-DarwinCore.R
    maps the cricket data retrieved from the DANS Data Station to Darwin Core and creates the core file, extension files and meta.xml file of the Darwin Core Archive (requires a DANS Data Station API token to run)
  • crickets_create_EML-xml.R
    creates the EML file for the cricket data

CLUE data

Description: Vegetation data from a long-term grassland biodiversity field experiment in the Netherlands. This data is not (yet) available online and is only accessible internally.

Note: The data consists of two datasets from two different experiments (i.e., exp1 & exp2). As the datasets are very similar, they are treated as one in the guide.

Folder: R/CLUE

Files:

  • CLUE-exp1_map-to-DwC.R
    maps the CLUE data for experiment 1 to Darwin Core and creates the core file, extension files and meta.xml file of the Darwin Core Archive (and removes these files again, as the data is not yet published)
  • CLUE-exp1_create-EML.R
    creates the EML file for the CLUE data of experiment 1
  • CLUE-exp2_map-to-DwC.R
    maps the CLUE data for experiment 2 to Darwin Core and creates the core file, extension files and meta.xml file of the Darwin Core Archive (and removes these files again, as the data is not yet published)
  • CLUE-exp2_create-EML.R
    creates the EML file for the CLUE data of experiment 2
  • retrieve-taxonInformation-from-GBIF.R
    retrieves taxonomic information from GBIF. Both datasets contain a large number of plant species, and several scientific names are misspelled or given as synonyms, which makes the automatic retrieval of taxonomic information difficult. This function corrects for these difficulties and checks the Global Name Resolver for the most commonly used author information.
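
Name matching against GBIF can be illustrated with a short sketch that queries the species match endpoint of the public GBIF API. It is not the code in retrieve-taxonInformation-from-GBIF.R: the function names are hypothetical, the `jsonlite` package is assumed to be installed, only whitespace is normalised before matching, and the Global Name Resolver lookup for author information is not covered.

```r
# Hypothetical helper: trim and collapse whitespace in a scientific name.
normalise_name <- function(name) {
  gsub("\\s+", " ", trimws(name))
}

# Hypothetical helper: build the request URL for the GBIF species match API.
gbif_match_url <- function(name) {
  paste0("https://api.gbif.org/v1/species/match?name=",
         utils::URLencode(normalise_name(name), reserved = TRUE))
}

# Return NA instead of NULL when a field is missing from the response.
or_na <- function(x) if (is.null(x)) NA_character_ else as.character(x)

# Hypothetical sketch: match one name and keep a few fields of the response.
# Requires an internet connection and the 'jsonlite' package.
match_taxon <- function(name) {
  if (!requireNamespace("jsonlite", quietly = TRUE)) {
    stop("Package 'jsonlite' is required.")
  }
  res <- jsonlite::fromJSON(gbif_match_url(name))
  data.frame(
    verbatimName   = name,
    matchType      = or_na(res$matchType),      # e.g. exact or fuzzy match
    scientificName = or_na(res$scientificName), # matched name with authorship
    status         = or_na(res$status),         # e.g. accepted or synonym
    stringsAsFactors = FALSE
  )
}

# Usage (not run; needs network access):
# match_taxon("Achillea millefolium")
```

Inspecting the match type and status for every name is a simple way to flag misspellings and synonyms for manual review before the taxonomic information is written to the Darwin Core files.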
