
Commit 2a78603

committed: add content, unfinished
1 parent a78dd15 commit 2a78603

File tree: 3 files changed, +168 −3 lines

sphinx/software_desc/glossary.rst

Lines changed: 73 additions & 0 deletions

Glossary
#############

Specify-defined terms


.. glossary::

   Algorithm
      An algorithm is a procedure or formula for solving a problem. There are
      multiple algorithms for computing Species Distribution Models (SDMs), which
      define the relationship between a set of points and the environmental
      values at those points.

   Container
      A :term:`Docker` instance which runs as an application on a
      :term:`Host machine`. The Docker container contains all software
      dependencies required by the programs it will run.

   CSV
      CSV (Comma-Separated Values) is a file format for records in which fields
      are separated by a delimiter. Commas and tabs are common, but other
      characters may be used as delimiters.
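
      As a minimal illustration, the standard Python ``csv`` module can read a
      delimited file; the file name and column names below are hypothetical:

      .. code-block:: python

         import csv

         # DictReader keys each row by the header line, so fields can be
         # addressed by column name rather than position.
         with open("occurrences.csv", newline="", encoding="utf-8") as f:
             for row in csv.DictReader(f, delimiter=","):
                 print(row["catalogNumber"], row["locality"])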

   Data Catalog
      The Specify database of tables and fields containing all data related to
      one or more Collections.

   Data Validation
      Testing processes run on data values to ensure that they meet the
      conditions defined for the field, table, and database. These can include
      checks on data type, formatting, and content restrictions such as a
      controlled vocabulary, a numeric range, or the existence of a database
      identifier.
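
      A minimal sketch of such checks in Python, assuming a hypothetical
      controlled vocabulary and field set:

      .. code-block:: python

         # Hypothetical controlled vocabulary for a "preparationType" field.
         PREP_TYPES = {"skin", "skeleton", "ethanol"}

         def validate(record: dict) -> list[str]:
             """Return a list of validation errors for one record."""
             errors = []
             if record.get("preparationType") not in PREP_TYPES:
                 errors.append("preparationType not in controlled vocabulary")
             try:
                 if not 1700 <= int(record["year"]) <= 2100:
                     errors.append("year out of range")
             except (KeyError, ValueError):
                 errors.append("year missing or not numeric")
             return errors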

   Docker
      Docker is an application which can run on Linux, macOS, or Windows. With a
      Docker-ized application, such as this tutorial, a user can run the
      application on their local machine in a controlled and sequestered
      environment, with a set of dependencies that may not be easy to install,
      permitted, or even available on their local machine.

   Docker image
      A Docker-ized application, built into a single package with all required
      software dependencies and files.

   DwCA
      DwCA (Darwin Core Archive) is a packaged dataset of occurrence records in
      `Darwin Core standard <https://www.tdwg.org/standards/dwc/>`_ format,
      along with metadata about the contents.
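
      A Darwin Core Archive is distributed as a ZIP file whose ``meta.xml``
      descriptor maps the data files' columns to Darwin Core terms. A minimal
      sketch of inspecting one with the Python standard library (the archive
      name is hypothetical):

      .. code-block:: python

         import zipfile

         with zipfile.ZipFile("export.dwca.zip") as archive:
             # Typical members: meta.xml, eml.xml, and a data file such as
             # occurrence.txt.
             print(archive.namelist())
             print(archive.read("meta.xml").decode("utf-8")[:300])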

   Host machine
      A physical or virtual machine on which Docker can be run.

   Mapped Spreadsheet
      A spreadsheet that has a mapping document matching its columns to Specify
      database fields.

   Mapping Template
      A document that matches terms in a :term:`Mapped Spreadsheet` to fields in
      the Specify database.

   Occurrence
      A record of a specimen occurrence, including metadata about the specimen
      and the spatial location where it was found.

   Occurrence Data
      Point data representing specimens collected for a single species or taxon.
      Each data point contains a location, x and y, in some known geographic
      spatial reference system.

   Tree
      A Tree is a set of hierarchical data. Several tables in Specify are
      defined as trees: Taxonomy, Geography, and Storage Location.

sphinx/software_desc/migration.rst

Lines changed: 20 additions & 3 deletions
…via Specify’s Workbench tool, Specify’s API, or with SQL scripts directly in the
backend Specify database (MariaDB). SCC staff would work with the Migration team to
choose specific processing options and methods for data migrations.
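
As an illustration of the direct-SQL option, a one-off cleanup might be run from a
short Python script; the connection details, table, and column names below are
assumptions for illustration only, and any change to the backend database should be
planned with SCC staff:

.. code-block:: python

   import pymysql

   # Connect to the backend MariaDB database (hypothetical credentials).
   connection = pymysql.connect(host="localhost", user="specify",
                                password="secret", database="specify")
   try:
       with connection.cursor() as cursor:
           # Example cleanup: strip stray whitespace from catalog numbers.
           cursor.execute(
               "UPDATE collectionobject SET CatalogNumber = TRIM(CatalogNumber)")
       connection.commit()
   finally:
       connection.close()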

SCC technical staff will train a Point Person and Migration team involved in database
setup and data migration to understand Specify at the Support level. These individuals
should plan to allocate a week to visit SCC headquarters in Kansas. They will work
one-on-one with SCC technical support staff and software engineers to attain a
database administrator level of mastery. After the visit, SCC staff will continue to
meet with the person or team as needed over Zoom to discuss questions and to research
and resolve issues that arise.

Members are responsible for their staff travel expenses; the SCC will allocate staff
time and project resources at no cost. SCC can facilitate meetings with other large
national organizations that have undergone similar processes of assessing collections’
requirements, deciding on configuration and customization options, preparing data for
migration, and importing data into Specify.

SCC has worked with several large-scale organizations in transitions to Specify,
including the Danish Natural History Museums, the Canadian Laurentian Forestry Centre,
and the Australian federal government’s CSIRO. Each member has taken a custom
transition to Specify based on technical expertise and desired outcomes.

sphinx/software_desc/workflows.rst

Lines changed: 75 additions & 0 deletions

Common Workflows
##################

Field collection of specimens
******************************************

Specify contains a table for Permits, which can be configured at the Institution
level to be associated with an **Accession**, **Collecting Event**, or
**Collecting Trip** (the "Permit Associated Record"), allowing the user to document
permits acquired for collecting and cataloging specimens. Either the Permit or the
Permit Associated Record can be created first, and either can be linked to an
existing record.

SCC recommends sending researchers into the field with a spreadsheet that has a
"Mapping Template" matching spreadsheet columns to Specify database fields. The
spreadsheet for data entry is referred to as a "Mapped Spreadsheet". Mapped
Spreadsheets can be created for research expeditions, focusing on data collection
specific to that field trip. Researchers can easily use the spreadsheet in the field
to record information about specimens, collecting event(s), locality, and more.

Data entered into a Mapped Spreadsheet can be imported via the Specify Workbench, a
spreadsheet-based application. In this workflow, the user chooses the correct Mapping
Template, then uploads the Mapped Spreadsheet to the Workbench. At this stage, the
Workbench is completely external to the data catalog. The Workbench contains
extensive matching and editing features that can be configured to fit user needs.
The Workbench then performs Data Validation on the spreadsheet contents before
submitting the data for upload to the Data Catalog.

The Workbench allows the user to bring in bulk data, match columns to fields in
Specify, and perform basic data integrity checks to ensure the data matches database
requirements (data validity, controlled vocabulary matching, and linked record
matching, such as Agent or Taxon records); a sketch of the column-matching step
follows below.
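
A minimal sketch of that column matching in Python; the template, file name, and
field names are hypothetical stand-ins, not Specify's actual mapping format:

.. code-block:: python

   import csv

   # Hypothetical mapping template: spreadsheet column -> Specify field.
   MAPPING = {
       "Field No.": "fieldNumber",
       "Collector": "collectors",
       "Latitude": "latitude1",
       "Longitude": "longitude1",
   }

   with open("expedition.csv", newline="", encoding="utf-8") as f:
       reader = csv.DictReader(f)
       # Report spreadsheet columns with no database mapping, and mapped
       # columns absent from the spreadsheet, before any upload.
       print("unmapped:", sorted(set(reader.fieldnames) - set(MAPPING)))
       print("missing:", sorted(set(MAPPING) - set(reader.fieldnames)))
       # Rename the mapped columns to their database field names.
       rows = [{MAPPING[c]: v for c, v in row.items() if c in MAPPING}
               for row in reader]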

Alternatively, a researcher could create a custom spreadsheet and simply create a
mapping template before importing data to Specify.

Users first validate the spreadsheet data within the Workbench, then upload the
verified data to the database. Users may be assigned different levels of access to
Workbench functionality, such as permission to validate a dataset or to perform the
upload, so that different people may verify that the data is sound.

Because Specify 7 is online software, where there is internet access a dataset can
be created directly in the Specify Workbench from the field. This workflow allows
researchers to verify data against database requirements on entry.

More information on our Workbench application is available here:
https://discourse.specifysoftware.org/t/the-specify-7-workbench/540

Object entry
******************************************

The user has three ways to document information pre-cataloging: 1) record the data
in the Specify Workbench, 2) enter it into the database and mark it as distinct from
cataloged data, or 3) put the data into a separate collection and bring it back over
once it is ready to be cataloged.

If the data is in the Workbench, it is segregated from queries and exports; search
boxes that match against existing records (such as Agent or Taxon records) do not
find this information. However, that also means it cannot be linked to Loans or any
other Interactions. A user can have multiple active, not-yet-uploaded datasets, but
a dataset is uploaded as a single file, meaning all items in a dataset are uploaded
at the same time. If only a portion of the items in a dataset has been cataloged,
you must either wait for the rest to be cataloged before uploading or move the
uncataloged records to another dataset.

If the data is added to the database but marked as separate, it would be included in
any search box searches and could be included in Loans or other Interactions. The
data could be excluded from queries or data exports by filtering on the field used
to mark it as separate. The user could use a checkbox field to indicate whether an
item is cataloged, or use the lack of a catalog number as an indicator.

If the data is added to a separate collection in the same database, it would be
completely separate from the cataloged data but could still be treated as a
collection object, meaning all database and curatorial treatments could be
documented. A script, ideally using the Specify API, could be set to run
automatically each night to bring records over from the separate collection to the
cataloged collection once they meet a requirement, such as having a catalog number
added, indicating they have been cataloged. A sketch of such a job appears below.
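
A minimal sketch of that nightly transfer, assuming a REST-style interface; the base
URL, endpoint paths, authentication, and field names are illustrative assumptions,
not the documented Specify API:

.. code-block:: python

   import requests

   BASE = "https://specify.example.org/api"  # hypothetical server
   session = requests.Session()
   session.headers["Authorization"] = "Bearer <token>"  # hypothetical auth

   # 1. Find staging records that now have a catalog number assigned.
   resp = session.get(f"{BASE}/staging/collectionobject/",
                      params={"has_catalog_number": "true"})
   resp.raise_for_status()

   for record in resp.json()["records"]:
       # 2. Create the record in the cataloged collection...
       session.post(f"{BASE}/main/collectionobject/",
                    json=record).raise_for_status()
       # 3. ...then remove it from the staging collection.
       session.delete(
           f"{BASE}/staging/collectionobject/{record['id']}/").raise_for_status()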

Each of these approaches to pre-cataloged data has been used successfully by
existing SCC members.

Acquisition and accessioning
******************************************

Location and movement control
******************************************

Cataloguing
******************************************

Loans in (borrowing objects)
******************************************

Loans out (lending objects)
******************************************

Use of collections
******************************************

Condition checking and improvement
******************************************

Deaccessioning and disposal
******************************************
