This repo holds the schema and associated scripts used by the Dataset Citation Metadata project.
Dataset Citation Metadata is a project that aims to ensure that appropriate citation information exists for data entering or produced by biological and environmental research platforms, so that credit can be attributed to those who produced the data.
The Dataset Citation Metadata schema is maintained in LinkML format; other formats (including Python classes) can be generated from the LinkML schema file.
See the LinkML documentation for full details on using the LinkML format and the related tools.
Full schema documentation can be found at https://kbase.github.io/credit_engine/.
See also some additional information on the schema design.
The ER diagram of the schema is generated from the Pydantic version of the Dataset Credit Metadata Schema using erdantic.
See below for how to regenerate the ER diagram after making changes to the schema.
This repo uses uv to manage the Python environment and dependencies.
See the uv docs for uv installation instructions.
Install the project dependencies and create a virtual environment:
uv sync

Run tests or other scripts:
uv run <command>
uv run pytest tests/

The commands below assume that you have already run uv sync to install the credit engine virtual environment and dependencies.
Generate derived files in all formats and save them to the project directory:
uv run gen-project -d project/ schema/dcm/linkml/credit_metadata.yaml

Lint the LinkML schema file:
uv run linkml-lint -f terminal schema/dcm/linkml/credit_metadata.yaml

Validate data (in file data.yaml) against the schema:
uv run linkml-validate -s schema/dcm/linkml/credit_metadata.yaml data.yaml

Generate JSON Schema version:
uv run gen-json-schema schema/dcm/linkml/credit_metadata.yaml > schema/dcm/jsonschema/credit_metadata.schema.json

Generate Python classes:
uv run gen-python schema/dcm/linkml/credit_metadata.yaml > schema/dcm/python/credit_metadata.py

Generate Pydantic classes:
uv run gen-pydantic schema/dcm/linkml/credit_metadata.yaml > schema/dcm/python/credit_metadata_pydantic.py

Generate an ER diagram from the Pydantic classes using erdantic (assumes that erdantic has been installed already):
uv run erdantic schema.dcm.python.credit_metadata_pydantic.CreditMetadata -o schema/dcm/dcm-schema.png

Install the JSON Schema check script:
# install with Homebrew
brew install check-jsonschema
or
# install with pip
pip install check-jsonschema

To test a file or files against the schema, use the command:
check-jsonschema --schemafile schema/dcm/jsonschema/credit_metadata.schema.json data_file_1.json data_file_2.json
or
check-jsonschema --schemafile schema/dcm/jsonschema/credit_metadata.schema.json sample_data/**/*_dcm.json

The following pygraphviz installation command assumes that Graphviz has been installed using Homebrew:
uv add --config-settings="--global-option=build_ext" --config-settings="--global-option=-I$(brew --prefix graphviz)/include/" --config-settings="--global-option=-L$(brew --prefix graphviz)/lib/" pygraphviz --dev

See the pygraphviz installation instructions for more information.
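
A note on the check-jsonschema command that uses sample_data/**/*_dcm.json: the `**` pattern is expanded by the shell, and bash only treats it as recursive when the `globstar` option is enabled. The sketch below is one shell-independent alternative that collects the files with `find` instead. The function name `check_dcm_files` is hypothetical and not part of this repo.

```shell
# Hypothetical helper (not part of this repo).
# check_dcm_files DIR COMMAND [ARGS...]
# Find every *_dcm.json file under DIR, at any depth, and append the
# matching paths to COMMAND. If nothing matches, COMMAND is not run.
check_dcm_files() {
    dir="$1"
    shift
    find "$dir" -type f -name '*_dcm.json' -exec "$@" {} +
}
```

For example, `check_dcm_files sample_data check-jsonschema --schemafile schema/dcm/jsonschema/credit_metadata.schema.json` would pass all matching files to check-jsonschema in the same argument order as the commands shown earlier.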

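
After a schema change, the JSON Schema, Python, and Pydantic generation commands listed earlier are usually run one after another. A minimal sketch of a wrapper for this, assuming a POSIX shell: the names `gen` and `regen_dcm` and the `DRY_RUN` variable are hypothetical and not part of this repo. With `DRY_RUN=1` the commands are printed rather than executed.

```shell
# Hypothetical helper functions (not part of this repo).
SCHEMA="schema/dcm/linkml/credit_metadata.yaml"

# gen TOOL OUTPUT: run a LinkML generator on $SCHEMA and write its
# output to OUTPUT. With DRY_RUN=1, print the command instead.
gen() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "uv run $1 $SCHEMA > $2"
    else
        uv run "$1" "$SCHEMA" > "$2"
    fi
}

# Regenerate the JSON Schema, Python, and Pydantic files in turn,
# stopping at the first generator that fails.
regen_dcm() {
    gen gen-json-schema schema/dcm/jsonschema/credit_metadata.schema.json &&
    gen gen-python schema/dcm/python/credit_metadata.py &&
    gen gen-pydantic schema/dcm/python/credit_metadata_pydantic.py
}
```

Running `DRY_RUN=1 regen_dcm` prints the three commands without executing them. The ER diagram step is left out of this sketch because it additionally requires erdantic.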