Bold_Join

These scripts retrieve specimen data from BOLD through its public API.

Data is extracted only from individual BOLD records that meet the Minimum Data Threshold. Threshold requirements:

Taxonomic information available (at least Phylum, Class and Order)
Specified BIN (Barcode Index Number)
Country of collection and collection site coordinates.
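
The threshold check above can be sketched roughly as follows. This is a minimal illustration, not the code used in the modules: the function name `meets_threshold` and the record keys (`phylum`, `class`, `order`, `bin_uri`, `country`, `lat`, `lon`) are assumptions chosen for the example.

```python
# Hypothetical key names; the actual scripts may store records differently.
REQUIRED_RANKS = ("phylum", "class", "order")


def _present(value):
    """Treat None and empty strings as missing values."""
    return value is not None and value != ""


def meets_threshold(record):
    """Return True if a record satisfies all three threshold requirements."""
    has_taxonomy = all(_present(record.get(rank)) for rank in REQUIRED_RANKS)
    has_bin = _present(record.get("bin_uri"))
    has_location = all(
        _present(record.get(key)) for key in ("country", "lat", "lon")
    )
    return has_taxonomy and has_bin and has_location
```

A record that lacks any one of the three groups (taxonomy, BIN, or location) is rejected by this sketch.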

Retrieved data is curated:

duplicates are removed;
for a given BIN, if a specimen's taxonomy is missing, the results are compared with each other to establish a taxonomic template;
process_id values (unique to BOLD) and collectors' names of duplicates are appended to the first result and combined into a single result;
only results with geolocation information are saved;
the total number of results from BOLD is reported;
the total number of results that pass the thresholds is reported;
taxonomic information of all specimens is compared with the NCBI Taxonomy Browser, and any detected conflict is marked as "To review".
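
The duplicate-merging step could look roughly like the sketch below. The README does not state which fields define a duplicate, so the key used here (BIN, country and coordinates) is an assumption, as are the function name `merge_duplicates` and the record keys.

```python
# Assumed definition of a duplicate: same BIN, country and coordinates.
DUPLICATE_KEY = ("bin_uri", "country", "lat", "lon")


def merge_duplicates(records):
    """Fold duplicates into the first matching record.

    The process_id values and collector names of later duplicates are
    appended to the first result, which is kept as the single result.
    """
    merged = {}
    for rec in records:
        key = tuple(rec.get(field) for field in DUPLICATE_KEY)
        if key not in merged:
            first = dict(rec)  # copy so the input records stay untouched
            first["process_id"] = [rec["process_id"]]
            first["collectors"] = [rec["collectors"]] if rec.get("collectors") else []
            merged[key] = first
        else:
            first = merged[key]
            first["process_id"].append(rec["process_id"])
            name = rec.get("collectors")
            if name and name not in first["collectors"]:
                first["collectors"].append(name)
    # Dictionaries preserve insertion order, so the original order is kept.
    return list(merged.values())
```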

The output file contains the following information:

Process_id (unique BOLD reference number)
Institution storing
BIN
Taxonomic information
Collected by
Identification provided by
Country
Latitude and Longitude
Date
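
As an illustration of the simplified .csv layout, the fields listed above could be written with Python's built-in `csv` module. The column names and the helper `write_rows` are assumptions made for this sketch; the headers produced by module_3 may differ.

```python
import csv

# Hypothetical column names mirroring the fields listed above.
COLUMNS = [
    "process_id", "institution_storing", "bin", "taxonomy",
    "collected_by", "identified_by", "country",
    "latitude", "longitude", "date",
]


def write_rows(rows, handle):
    """Write a header line followed by one CSV line per record."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)  # missing fields are written as empty strings
```

When writing to a real file, open it with `newline=""`, as the `csv` documentation recommends.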

Usage Information:

Please run the scripts in the following order:

For simplified .csv output: module_1 -> module_2 -> module_3
For .json output: module_1 -> module_2 -> module_4_parsing_json

Module 1 requires user input: one or more taxa of interest, separated by '|' (for example, Clitellata|Collembola). All other modules use the output files of the previous modules as their input.
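
A minimal sketch of how the '|'-separated input might be split and turned into a request URL is shown below. The helper names are hypothetical, and the endpoint is an assumption based on the BOLD public API's specimen query; check it against BOLD's own documentation before relying on it.

```python
from urllib.parse import urlencode

# Assumed BOLD public API endpoint; not taken from the module source.
BOLD_SPECIMEN_URL = "http://www.boldsystems.org/index.php/API_Public/specimen"


def parse_taxa(user_input):
    """Split 'Clitellata|Collembola' into a list of clean taxon names."""
    return [name.strip() for name in user_input.split("|") if name.strip()]


def build_query_url(taxa, fmt="json"):
    """Build a specimen query URL for one or more taxa."""
    query = urlencode({"taxon": "|".join(taxa), "format": fmt})
    return f"{BOLD_SPECIMEN_URL}?{query}"
```

The sketch only builds the URL; it does not send a request.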

Additional Information:

The scripts were run and tested with Python 3.9.6 and rely almost entirely on built-in packages.

Required package to install:

BeautifulSoup4; installation instructions can be found here: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-beautiful-soup

Project information

This project was carried out as part of the EUdaphobase COST Action (CA18237), www.EUdaphobase.org.
