Datastore API Specification

Version 1.0.0

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 RFC2119 RFC8174 when, and only when, they appear in all capitals, as shown here.

This document is licensed under The Apache License, Version 2.0.

Disclaimer

Part of this content has been taken from the great work done by the folks at the OpenAPI Initiative and AsyncAPI Initiative. We decided not to reinvent the wheel and to base our work on these two specifications, mainly for the following reasons:

  • We think that the work made by the OpenAPI Initiative and AsyncAPI Initiative is great :)
  • We want to make the learning curve for the Datastore API Specification as smooth as possible by aligning its definitions with those of two other specifications that are popular in the software and data engineering community

Introduction

The Datastore API Specification (DSAS) defines a standard, language-agnostic interface to a Data API which allows both humans and computers to understand how to establish a connection and query a database service managing tabular data without access to source code, documentation, or through network traffic inspection. When properly defined, a consumer can understand and interact with the remote database service with a minimal amount of implementation logic.

A Datastore API definition can then be used by documentation generation tools to display the API, code generation tools to generate servers and clients in various programming languages, testing tools, and many other use cases.

Definitions

Standard

The set of shared rules used by different agents to describe an entity or process of common interest. The agents that follow the standard limit their autonomy by conforming to the set of shared rules in order to facilitate cooperation between them through interoperability.

Standard Specification

The formal description of the rules that form a standard. A standard can have multiple specification versions associated with it. Sometimes the words standard and specification are used as synonyms.

Standard Definition

The description of one specific entity or process created using and conforming to the set of rules formally described in the standard specification.

Database Management System

An application that is capable of storing and providing access to data organized in tabular format (ex. MySQL, SQL Server, Snowflake, etc...).

Database Service

An addressable running instance of a Database Management System. Consumers can connect to it through one or many connection protocols (ex. JDBC, ODBC, etc...) and perform queries over tabular data managed by the service. The supported query language depends on the specific Database Management System (ex. SQL).

Database

A named collection of tables physically stored and exposed to consumers by a Database Service. In some Database Management Systems tables in a database are further grouped in schemas.

Datastore API

The description of the structure of a collection of tables (i.e. data store schema) together with the Database Services (i.e. data store services) that store them in the different environments that compose the application landscape (ex. dev, qa, prod, etc...). Consumers can connect to data store services through one of the supported protocols and use the data store schema to compose valid queries. The structure of the tables that compose a Datastore API is the same in all environments, so the same queries can be executed against all the services regardless of the specific environment. The stored data usually differs between environments (i.e., dev data is generally different from prod data), and so do the query results.

Datastore API Document

The document (or set of documents) that contains the standard definition of a Datastore API created using and conforming to the Datastore API Specification.

Datastore API Specification

The formal description of the rules to create a standard-compliant Datastore API Document.

Specification

Versions

The Datastore API Specification is versioned using Semantic Versioning 2.0.0 (semver) and follows the semver specification.

The major.minor portion of the semver (for example 1.0) SHALL designate the DSAS feature set. Typically, .patch versions address errors in this document, not the feature set. Tooling which supports DSAS 1.0 SHOULD be compatible with all DSAS 1.0.* versions. The patch version SHOULD NOT be considered by tooling, making no distinction between 1.0.0 and 1.0.1 for example.

Each new minor version of the Datastore API Specification SHALL allow any Datastore API document that is valid against any previous minor version of the Specification, within the same major version, to be updated to the new Specification version with equivalent semantics. Such an update MUST only require changing the datastoreapi property to the new minor version.

For example, a valid DSAS 1.0.2 document, upon changing its datastoreapi property to 1.1.0, SHALL be a valid Datastore API 1.1.0 document, semantically equivalent to the original Datastore API 1.0.2 document. New minor versions of the Datastore API Specification MUST be written to ensure this form of backward compatibility.
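The versioning rules above can be sketched in a few lines of Python. This is only an illustration: the helper names `feature_set` and `tool_supports` are hypothetical, the version pattern is simplified to plain `MAJOR.MINOR.PATCH`, and the rule that a tool accepts documents of an equal or older minor version is an assumption derived from the backward-compatibility requirement, not something the specification mandates for tooling.

```python
import re

# Simplified pattern: numeric MAJOR.MINOR.PATCH only.
# Pre-release and build metadata are deliberately not handled in this sketch.
_SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def feature_set(version: str) -> tuple[int, int]:
    """Return the (major, minor) pair that designates the DSAS feature set."""
    match = _SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, _patch = (int(part) for part in match.groups())
    return major, minor


def tool_supports(tool_version: str, document_version: str) -> bool:
    """Assumed rule: same major version, document minor not newer than the tool's."""
    tool_major, tool_minor = feature_set(tool_version)
    doc_major, doc_minor = feature_set(document_version)
    return tool_major == doc_major and doc_minor <= tool_minor
```

With these helpers the patch number never influences the outcome, so `tool_supports("1.0.0", "1.0.1")` holds, while a 1.0 tool would decline a 1.1.0 document.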

Format

A Datastore API Document that conforms to the Datastore API Specification is itself a JSON object, which may be represented either in JSON or YAML format.

For example, if a field has an array value, the JSON array representation will be used:

{
   "field": [ 1, 2, 3 ]
}

All field names in the specification are case-sensitive. This includes all fields that are used as keys in a map, except where explicitly noted that keys are case insensitive.

The schema exposes two types of fields: Fixed fields, which have a declared name, and Patterned fields, which declare a regex pattern for the field name.

Patterned fields MUST have unique names within the containing object.

In order to preserve the ability to round-trip between YAML and JSON formats, YAML version 1.2 is RECOMMENDED along with some additional constraints:

Document Structure

A Datastore API Document MAY be made up of a single document or be divided into multiple, connected parts at the discretion of the user. In the latter case, $ref fields MUST be used in the specification to reference those parts as follows from the JSON Schema definitions.

It is RECOMMENDED that the root Datastore API Document be named: datastoreapi.json or datastoreapi.yaml.

Object Types

A Datastore API Document has one and only one root object. The properties of an object are described by its fields. A field type can be another object or a primitive type. An addressable and versioned object is called an entity. The root object of the Datastore API Document is an entity object. Other entities that exist only in the scope of the root entity are called components.

Data Types

Primitive data types in the DSAS are based on the types supported by the JSON Schema Specification Wright Draft 00.

Primitives have an optional modifier property: format. DSAS uses several known formats to define in fine detail the data type being used. However, to support documentation needs, the format property is an open string-valued property and can have any value. Formats such as "email", "uuid", and so on, MAY be used even though undefined by this specification. Types that are not accompanied by a format property follow the type definition in the JSON Schema. Tools that do not recognize a specific format MAY default back to the type alone as if the format is not specified.

The formats defined by the DSAS are:

| type | format | Comments |
| --- | --- | --- |
| integer | int32 | signed 32 bits |
| integer | int64 | signed 64 bits (a.k.a. long) |
| number | float | |
| number | double | |
| string | | |
| string | alphanumeric | a string that matches the following regex `^[a-zA-Z0-9]+$` |
| string | name | a string that matches the following regex `^[a-zA-Z][a-zA-Z0-9]+$` |
| string | fqn | a string that matches the following regex `^[a-zA-Z][a-zA-Z0-9.:]+$` |
| string | version | a string that matches the following regex `^(0 |
| string | byte | base64 encoded characters |
| string | binary | any sequence of octets |
| string | uuid | a sequence of 16 octets as defined by RFC4122 |
| boolean | | |
| string | date | As defined by full-date - RFC3339 |
| string | date-time | As defined by date-time - RFC3339 |
| string | password | A hint to UIs to obscure input. |
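As an illustration of how a tool might apply the string formats listed above, the following Python sketch checks a value against the `alphanumeric`, `name` and `fqn` patterns. The function name `matches_format` and the dictionary `FORMAT_PATTERNS` are hypothetical; the `version` format is left out because its pattern is not fully reproduced in the table.

```python
import re

# Patterns copied from the format table.
FORMAT_PATTERNS = {
    "alphanumeric": re.compile(r"^[a-zA-Z0-9]+$"),
    "name": re.compile(r"^[a-zA-Z][a-zA-Z0-9]+$"),
    "fqn": re.compile(r"^[a-zA-Z][a-zA-Z0-9.:]+$"),
}


def matches_format(value: str, fmt: str) -> bool:
    """Check a string value against one of the formats defined above."""
    pattern = FORMAT_PATTERNS.get(fmt)
    if pattern is None:
        # Unrecognized formats fall back to the bare type, as the text allows.
        return isinstance(value, str)
    # fullmatch avoids accepting a trailing newline, which "$" alone would allow.
    return pattern.fullmatch(value) is not None
```

Under these patterns a `name` needs at least two characters, and an `fqn` cannot contain hyphens or underscores, so a value such as `mysqld-prod.dwh.sales` would not pass the `fqn` check as written.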

Rich Text Formatting

Throughout the specification, description fields are noted as supporting CommonMark markdown formatting. Where Datastore API tooling renders rich text it MUST support, at a minimum, markdown syntax as described by CommonMark 0.27. Tooling MAY choose to ignore some CommonMark features to address security concerns.

Relative References in URLs

Unless specified otherwise, all properties that are URLs SHOULD be absolute references. If a property explicitly specifies in its description that it allows a relative reference, its value MUST be compliant with RFC3986. Relative references MUST be resolved using the URLs defined in the property description as a Base URI.

Relative references used in $ref are processed as per JSON Reference, using the URL of the current document as the base URI. See also the Reference Object.

Schema

In the following description, if a field is not explicitly REQUIRED or described with a MUST or SHALL, it can be considered OPTIONAL.

Datastore API Entity

This is the root object of the Datastore API Document.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| datastoreapi | string:version | (REQUIRED) The semantic version number of the Datastore API Specification Version that the Datastore API Document uses. The datastoreapi field SHOULD be used by tools and clients to interpret the Datastore API Document. This is not related to the version field that specifies the actual version of the specific Datastore API Document. |
| info | Info Object | (REQUIRED) Provides metadata about the API. The metadata MAY be used by tooling as required. |
| services | Database Services Object | (REQUIRED) Provides connection details of services that expose the data of this data store in all the supported environments. |
| schema | Schema Object | (REQUIRED) Describes the structure of all the tables that compose this data store. |

This object MAY be extended with Specification Extensions.
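A minimal sketch of how tooling could verify that the four REQUIRED root fields are present. The helper name `missing_root_fields` and the sample document are illustrative assumptions; the check is deliberately shallow and does not validate the nested objects.

```python
# Field names are case-sensitive, so "Info" does not satisfy "info".
REQUIRED_ROOT_FIELDS = ("datastoreapi", "info", "services", "schema")


def missing_root_fields(document: dict) -> list[str]:
    """Return the REQUIRED root fields that the document does not declare."""
    return [field for field in REQUIRED_ROOT_FIELDS if field not in document]


# Illustrative skeleton of a root object; the nested values are placeholders.
document = {
    "datastoreapi": "1.0.0",
    "info": {"title": "Foodmart Sales Data Store", "version": "1.1.1"},
    "services": {},
    "schema": {"databaseName": "foodmartdb"},
}
```

Calling `missing_root_fields(document)` on the skeleton returns an empty list, while a document that spells a key with different casing is reported as missing that field.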

Info Object

The Info Object provides metadata about the API. The metadata can be used by the platform or by consumers if needed.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| title | string | (REQUIRED) The title of the API. |
| summary | string | The short summary of the API. |
| description | string | The description of the API. CommonMark syntax MAY be used for rich text representation. |
| termsOfService | string | The URL to the terms of service for the API. This MUST be in the form of a URL. |
| version | string:version | (REQUIRED) The version of the Datastore API Document (which is distinct from the Datastore API Specification version or the API implementation version). |
| datastoreName | string:name | The name of the datastore exposed by this API. |
| contact | Contact Object | The contact information for this API. |
| license | License Object | The license information for this API. |

This object MAY be extended with Specification Extensions.

Info Object Example
{
  "title": "Foodmart Sales Data Store",
  "summary": "The sales datamart",
  "description": "This fact table stores all the sales of the last five years together with key analysis dimensions (ex. customer, products, etc...)",
  "termsOfService": "https://foodmart.com/terms/",
  "contact": {
    "name": "API Support",
    "url": "https://www.foodmart.com/support",
    "email": "[email protected]"
  },
  "license": {
    "name": "Apache 2.0",
    "url": "https://www.apache.org/licenses/LICENSE-2.0.html"
  },
  "version": "1.1.1"
}
title: "Foodmart Sales Data Store"
summary: "The sales datamart"
description: >
  This fact table stores all the sales of the last five years together with key analysis dimensions (ex. customer, products, etc...)
termsOfService: "https://foodmart.com/terms/"
contact:
  name: "API Support"
  url: "https://www.foodmart.com/support"
  email: "[email protected]"
license:
  name: "Apache 2.0"
  url: "https://www.apache.org/licenses/LICENSE-2.0.html"
version: "1.1.1"

Contact Object

Contact information for the exposed API.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| name | string | The identifying name of the contact person/organization. |
| url | string | The URL pointing to the contact information. MUST be in the format of a URL. |
| email | string | The email address of the contact person/organization. MUST be in the format of an email address. |

This object MAY be extended with Specification Extensions.

Contact Object Example:
{
  "name": "API Support",
  "url": "https://www.example.com/support",
  "email": "[email protected]"
}
name: "API Support"
url: "https://www.example.com/support"
email: "[email protected]"

License Object

License information for the exposed API.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| name | string | (REQUIRED) The license name used for the API. |
| url | string | A URL to the license used for the API. MUST be in the format of a URL. |

This object MAY be extended with Specification Extensions.

License Object Example:
{
  "name": "Apache 2.0",
  "url": "https://www.apache.org/licenses/LICENSE-2.0.html"
}
name: "Apache 2.0"
url: "https://www.apache.org/licenses/LICENSE-2.0.html"

Database Services Object

The Database Services Object maps database services to supported environments (ex. dev, test, prod, etc.).

Patterned Fields
| Field Name | Type | Description |
| --- | --- | --- |
| - | Map[string, Database Service Object \| Reference Object] | The definition of a database service that exposes the API in a specific environment. |

Database Services Object Example
{
	"development": {
	  "$ref": "#components.services.foodmartDevelopmentService"
	},
	"production": {
	  "$ref": "#components.services.foodmartProductionService"
	}
}
development:
  $ref: "#components.services.foodmartDevelopmentService"
production:
  $ref: "#components.services.foodmartProductionService"

Database Service Object

The Database Service Object describes a database service and provides all the information required to establish a connection to it.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| name | string:name | (REQUIRED) The name of the service. It MUST be unique within the services available for the API. It is RECOMMENDED to use a unique name for all services running in the application landscape. |
| description | string | An optional string describing the service. CommonMark syntax MAY be used for rich text representation. |
| serverInfo | Server Info Object \| Reference Object | Contains basic information about the Database Server. |
| variables | Map[string, Variable Object] | The map between a variable name and its value. The value is used for substitution in the protocols' connectionString template. |

This object MAY be extended with Specification Extensions.

Database Service Object Example

The following shows an example of a Database Service Object, including how variables can be used for a server configuration:

{
  "name": "SALES Data Store Service",
  "description": "The service that hosts the `SALES` data store in the given environment",
  "serverInfo": {
      "host": "{host}",
      "port": "5432",
      "dbmsType": "Postgres",
      "dbmsVersion": "15 RC 2",
      "connectionProtocols": {
        "jdbc": {
          "version": "1.0",
          "connectionString": "jdbc:postgresql://{host}:5432/foodmart",
          "driverName": "PostgreSQL JDBC Driver",
          "driverClass": "org.postgresql.Driver",
          "driverVersion": "42.2.20"
        }
      }
  },
  "variables": {
    "host": "ip-10-24-32-0.ec2.internal"
  }
}	 
name: "SALES Data Store Service"
description: "The service that hosts the `SALES` data store in the given environment"
serverInfo:
  host: "{host}"
  port: "5432"
  dbmsType: "Postgres"
  dbmsVersion: "15 RC 2"
  connectionProtocols:
    jdbc:
      version: "1.0"
      connectionString: "jdbc:postgresql://{host}:5432/foodmart"
      driverName: "PostgreSQL JDBC Driver"
      driverClass: "org.postgresql.Driver"
      driverVersion: "42.2.20"
variables:
  host: "ip-10-24-32-0.ec2.internal"
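The substitution of variables into a connection string template can be sketched as follows. This is a minimal illustration under two assumptions: variable values are plain strings keyed by name, and a `{placeholder}` whose name has no matching variable is left untouched. The function name `render_connection_string` is hypothetical.

```python
import re

# A placeholder is a simple identifier wrapped in curly brackets, e.g. {host}.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def render_connection_string(template: str, variables: dict) -> str:
    """Replace every {name} that has a matching variable; leave the rest as-is."""

    def replace(match):
        name = match.group(1)
        # Unknown names keep their original "{name}" text.
        return variables.get(name, match.group(0))

    return _PLACEHOLDER.sub(replace, template)
```

Leaving unknown or non-identifier brace groups alone matters for ODBC-style strings, where a segment such as `Driver={ODBC Driver 13 for SQL Server}` uses braces that are not variable references.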

Server Info Object

The Server Info Object contains basic information about the Database Server.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| host | string | (REQUIRED) The hostname of the server running the service. It SHOULD follow the guidelines described in RFC1178. |
| port | string | (REQUIRED) The port on which the service is listening for incoming requests. |
| dbmsType | string | The type of database management system run by the service (ex. MySQL, Postgres, Oracle, etc...). |
| dbmsVersion | string | The version of the database management system run by the service (ex. 8.0.31, 15 RC 2, 19c, etc...). |
| connectionProtocols | Connection Protocols Object | (REQUIRED) The available protocols to connect to the service. |

This object MAY be extended with Specification Extensions.

Server Info Object Example

The following shows an example of Server Info Object, including an example of how variables can be used for a server configuration:

{
    "host": "{host}",
    "port": "5432",
    "dbmsType": "Postgres",
    "dbmsVersion": "15 RC 2",
    "connectionProtocols": {
      "jdbc": {
        "version": "1.0",
        "connectionString": "jdbc:postgresql://{host}:5432/foodmart",
        "driverName": "PostgreSQL JDBC Driver",
        "driverClass": "org.postgresql.Driver",
        "driverVersion": "42.2.20"
      }
    }
}
host: "{host}"
port: "5432"
dbmsType: "Postgres"
dbmsVersion: "15 RC 2"
connectionProtocols:
  jdbc:
    version: "1.0"
    connectionString: "jdbc:postgresql://{host}:5432/foodmart"
    driverName: "PostgreSQL JDBC Driver"
    driverClass: "org.postgresql.Driver"
    driverVersion: "42.2.20"

Connection Protocols Object

Describes protocol-specific configurations for all connection protocols supported by a service.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| jdbc | JDBC Connection Object | Protocol-specific information for a JDBC connection to the service. |
| odbc | ODBC Connection Object | Protocol-specific information for an ODBC connection to the service. |

This object MAY be extended with Specification Extensions.

JDBC Connection Protocol Object

The JDBC Connection Object contains the required information to create a JDBC connection to the service.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| version | string | The version of the protocol used for connection (ex. JDBC 4.3). |
| connectionString | string | (REQUIRED) The string that contains all the required information to connect to the service (ex. jdbc:postgresql://192.168.1.170:5432/sample?ssl=true). This string supports variables (see Variable Object). Variable substitutions will be made when a variable is named in {brackets}. |
| driverName | string | The name of the JDBC driver to use for establishing a connection with the service (ex. PostgreSQL JDBC Driver). |
| driverClass | string | The Java class of the JDBC driver to use for establishing a connection with the service (ex. org.postgresql.Driver). |
| driverVersion | string | The version of the JDBC driver to use for establishing a connection with the service (ex. 42.2.20). |
| driverLibrary | External Resource Object | The JDBC driver library. |
| driverDocs | External Resource Object | The JDBC driver documentation. |

This object MAY be extended with Specification Extensions.

JDBC Connection Protocol Object Example

The following shows an example of JDBC connection information to a PostgreSQL database service:

{
  "version": "1.0",
  "connectionString": "jdbc:postgresql://{host}:5432/foodmart",
  "driverName": "PostgreSQL JDBC Driver",
  "driverClass": "org.postgresql.Driver",
  "driverVersion": "42.2.20",
  "driverLibrary": {
    "description": "PostgreSQL JDBC Driver Library",
    "mediaType": "application/java-archive",
    "$href": "https://jdbc.postgresql.org/postgresql-15RC2.jdbc3.jar"
  },
  "driverDocs": {
    "description": "PostgreSQL JDBC Driver HomePage",
    "mediaType": "text/html",
    "$href": "https://jdbc.postgresql.org/"
  }
}
version: "1.0"
connectionString: "jdbc:postgresql://{host}:5432/foodmart"
driverName: "PostgreSQL JDBC Driver"
driverClass: "org.postgresql.Driver"
driverVersion: "42.2.20"
driverLibrary:
  description: "PostgreSQL JDBC Driver Library"
  mediaType: "application/java-archive"
  $href: "https://jdbc.postgresql.org/postgresql-15RC2.jdbc3.jar"
driverDocs:
  description: "PostgreSQL JDBC Driver HomePage"
  mediaType: "text/html"
  $href: "https://jdbc.postgresql.org/"

ODBC Connection Protocol Object

The ODBC Connection Object contains the required information to create an ODBC connection to the service.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| version | string | The version of the protocol used for connection (ex. ODBC 4.0). |
| connectionString | string | (REQUIRED) The string that contains all the required information to connect to the service (ex. Driver={ODBC Driver 13 for SQL Server};server=localhost;database=WideWorldImporters;trusted_connection=Yes;). This string supports variables (see Variable Object). Variable substitutions will be made when a variable is named in {brackets}. |
| driverName | string | The name of the ODBC driver to use for establishing a connection with the service (ex. psqlODBC). |
| driverVersion | string | The version of the ODBC driver to use for establishing a connection with the service (ex. 13.02). |
| driverLibrary | External Resource Object | The ODBC driver library. |
| driverDocs | External Resource Object | The ODBC driver documentation. |

This object MAY be extended with Specification Extensions.

Variable Object

The Variable Object represents a variable used for substitution in a connectionString template.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| description | string | The optional description for the server variable. CommonMark syntax MAY be used for rich text representation. |
| enum | [string] | The enumeration of string values to be used if the substitution options are from a limited set. |
| default | string | The default value to use for substitution, and to send if an alternate value is not supplied. |
| examples | [string] | The array of examples of the server variable. |

This object MAY be extended with Specification Extensions.
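The interplay of `default` and `enum` can be illustrated with a short Python sketch. The helper `resolve_variable` is hypothetical, and so is the choice to raise an error when a value falls outside `enum` or when neither a supplied value nor a default exists; the specification does not prescribe this behavior.

```python
def resolve_variable(name: str, variable: dict, supplied: dict) -> str:
    """Pick the value to substitute for one variable (illustrative helper)."""
    # A caller-supplied value wins; otherwise fall back to the declared default.
    value = supplied.get(name, variable.get("default"))
    if value is None:
        raise ValueError(f"variable {name!r} has no supplied value and no default")
    # When enum is declared, the chosen value must be one of its entries.
    allowed = variable.get("enum")
    if allowed is not None and value not in allowed:
        raise ValueError(f"{value!r} is not an admissible value for variable {name!r}")
    return value
```

For a variable declared as `{"default": "localhost", "enum": ["localhost", "ip-10-24-32-0.ec2.internal"]}`, calling the helper without a supplied value yields the default, and supplying a value outside the enumeration raises `ValueError`.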

Schema Object

The Schema Object describes the structure of the tables exposed by this API.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| databaseName | string | (REQUIRED) The name of the Database that collects the tables exposed by this Datastore API. |
| databaseSchemaName | string | The name of the schema that collects the tables exposed by this Datastore API. This field is used only for Database Management Systems that group the tables of a Database into schemas. |
| tables | [Table Entity \| Standard Definition Object \| Reference Object] | The tables exposed by this Datastore API. |

This object MAY be extended with Specification Extensions.

Schema Object Example
{
  "databaseName": "foodmartdb",
  "databaseSchemaName": "dwh",
  "tables": [
    {
      "$ref": "#components.tables.sales"
    },
    {
      "$ref": "#components.tables.customers"
    },
    {
      "$ref": "#components.tables.products"
    }
  ]
}
databaseName: "foodmartdb"
databaseSchemaName: "dwh"
tables:
  - $ref: "#components.tables.sales"
  - $ref: "#components.tables.customers"
  - $ref: "#components.tables.products"

Table Entity

The Table Entity describes the structure of a table. This entity's fields are a superset of the ones defined by the Table Object of Open Metadata v0.12.1 (https://github.com/open-metadata/OpenMetadata/tree/0.12.1-release/). Consequently, Open Metadata v0.12.1 tables are also valid entities usable for describing the schema of a Datastore API.

Fixed Fields
| Field Name | Type | Description |
| --- | --- | --- |
| id | string:uuid | (READONLY) The UUID is generated server-side by applications that work with the table entity. A specific application MUST always return the same UUID for a given table identified by its fullyQualifiedName. However, different applications MAY use different UUIDs for the same table. The UUID generated by an application MAY be used to identify the table in subsequent calls to the same application API in place of the more verbose fullyQualifiedName. The UUID generated by one application SHALL NOT be used to identify the table when calling the API exposed by another application. It is RECOMMENDED that applications use a name-based UUID (RFC 4122 version 3, which hashes with MD5, or version 5, which hashes with SHA-1) generated from the table's fullyQualifiedName. |
| fullyQualifiedName | string:fqn | (REQUIRED) The fully qualified name of the table, built by concatenating datastoreName, databaseName and tableName. It is RECOMMENDED to use a universally unique identifier of the form urn:dsas:{org-namespace}:tables:{datastoreName}:{databaseName}:{tableName}:{table-major-version}. It is RECOMMENDED to use your company's domain name in reverse dot notation (ex. it.quantyca) as org-namespace, in order to ensure that the fullyQualifiedName is universally unique. Example: "fullyQualifiedName": "urn:dsas:it.quantyca:tables:mysqld-prod:dwh:sales:1". For compatibility with Open Metadata v0.12.1 the fullyQualifiedName MAY also be in the simpler form datastoreName.databaseName.tableName. Example: "fullyQualifiedName": "mysqld-prod.dwh.sales". |
| entityType | string:alphanumeric | (READONLY) The name of the entity used by applications that work with the table entity. Different applications MAY use different entity names to refer to the table entity. It is RECOMMENDED to use as entityType the name of the resource exposed by the application's RESTful API to execute CRUD operations over the entity itself. |
| name | string | The local name (i.e. not fully qualified name) of the table. It MUST be unique within the tables of the same database or schema. |
| version | string:version | (REQUIRED) The semantic version number of the table. |
| displayName | string | The human readable name of the table. It SHOULD be used by the frontend tool to visualize the table's name in place of the name property. It is RECOMMENDED not to use the same displayName for different tables belonging to the same database. |
| description | string | The table description. CommonMark syntax MAY be used for rich text representation. |
| tableType | string | The table type. Admissible values are: EXTERNAL, VIEW, SECUREVIEW, MATERIALIZEDVIEW, ICEBERG, LOCAL, PARTITIONED. |
| columns | [Column Object] | The list of columns associated with the table. |
| constraints | [Table Constraint Object] | The table constraints (ex. referential integrity constraints, uniqueness constraints, etc...). |
| partitions | [Table Partition Object] | The information related to the table's partitions if the table is partitioned. |
| tags | [string] | The list of tags associated with the table. |
| externalDocs | External Resource Object | Additional external documentation. |

This object MAY be extended with Specification Extensions.
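A minimal sketch of how an application might build the recommended URN form of fullyQualifiedName and derive a stable id from it. The helper names `table_urn` and `table_uuid` are hypothetical, and the use of `uuid.NAMESPACE_URL` as the namespace is an arbitrary assumption, since the specification does not name one. Python's `uuid.uuid3` produces a version 3 (MD5-based) name UUID; `uuid.uuid5` is the SHA-1 based counterpart.

```python
import uuid


def table_urn(org_namespace, datastore_name, database_name, table_name, major_version):
    """Assemble the recommended URN form of a table's fullyQualifiedName."""
    return (
        f"urn:dsas:{org_namespace}:tables:"
        f"{datastore_name}:{database_name}:{table_name}:{major_version}"
    )


def table_uuid(fully_qualified_name):
    """Derive a deterministic, name-based UUID from the fullyQualifiedName."""
    # NAMESPACE_URL is an assumption of this sketch, not part of the specification.
    return uuid.uuid3(uuid.NAMESPACE_URL, fully_qualified_name)
```

Because the UUID depends only on the namespace and the name, the same application always obtains the same id for a given fullyQualifiedName, which is the property the id field requires.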

Table Entity Example
{
  "name": "sales_fact_dec_1998",
  "version": "1.0.0",
  "fullyQualifiedName": "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.1",
  "displayName": "Foodmart Sales Fact Table",
  "description": "The fact table that stores all sales of 1998",
  "tableType": "LOCAL",
  "constraints": [
    {
      "constraintType": "PRIMARY_KEY",
      "columns": [
        "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.customer_id",
        "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.product_id"
      ]
    }, {
      "constraintType": "FOREIGN_KEY",
      "columns": [
        "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.customer_id",
        "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.customer.customer_id"
      ]
    }, {
      "constraintType": "FOREIGN_KEY",
      "columns": [
        "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.product_id",
        "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.product.product_id"
      ]
    }
  ],
  "columns": [
    {
      "name": "customer_id",
      "fullyQualifiedName": "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.customer_id",
      "displayName": "Customer ID",
      "dataType": "INTEGER",
      "columnConstraint": "PRIMARY_KEY",
      "ordinalPosition": 1
    }, {
      "name": "product_id",
      "fullyQualifiedName": "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.product_id",
      "displayName": "Product ID",
      "dataType": "INTEGER",
      "columnConstraint": "PRIMARY_KEY",
      "ordinalPosition": 2
    }, {
      "name": "store_sales",
      "fullyQualifiedName": "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.store_sales",
      "displayName": "Store Sales",
      "dataType": "DECIMAL",
      "precision": "10",
      "scale": "4",
      "columnConstraint": "NOT NULL",
      "ordinalPosition": 3
    }, {
      "name": "store_cost",
      "fullyQualifiedName": "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.store_cost",
      "displayName": "Store Cost",
      "dataType": "DECIMAL",
      "precision": "10",
      "scale": "4",
      "columnConstraint": "NOT NULL",
      "ordinalPosition": 4
    }, {
      "name": "unit_sales",
      "fullyQualifiedName": "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.unit_sales",
      "displayName": "Unit Sales",
      "dataType": "DECIMAL",
      "precision": "10",
      "scale": "4",
      "columnConstraint": "NOT NULL",
      "ordinalPosition": 5
    }
  ]
}
name: "sales_fact_dec_1998"
version: "1.0.0"
fullyQualifiedName: "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.1"
displayName: "Foodmart Sales Fact Table"
description: "The fact table that stores all sales of 1998"
tableType: "LOCAL"
constraints:
  - constraintType: "PRIMARY_KEY"
    columns:
      - "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.customer_id"
      - "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.product_id"
  - constraintType: "FOREIGN_KEY"
    columns:
      - "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.customer_id"
      - "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.customer.customer_id"
  - constraintType: "FOREIGN_KEY"
    columns:
      - "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.product_id"
      - "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.product.product_id"
columns:
  - name: "customer_id"
    fullyQualifiedName: "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.customer_id"
    displayName: "Customer ID"
    dataType: "INTEGER"
    columnConstraint: "PRIMARY_KEY"
    ordinalPosition: 1
  - name: "product_id"
    fullyQualifiedName: "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.product_id"
    displayName: "Product ID"
    dataType: "INTEGER"
    columnConstraint: "PRIMARY_KEY"
    ordinalPosition: 2
  - name: "store_sales"
    fullyQualifiedName: "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.store_sales"
    displayName: "Store Sales"
    dataType: "DECIMAL"
    precision: "10"
    scale: "4"
    columnConstraint: "NOT NULL"
    ordinalPosition: 3
  - name: "store_cost"
    fullyQualifiedName: "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.store_cost"
    displayName: "Store Cost"
    dataType: "DECIMAL"
    precision: "10"
    scale: "4"
    columnConstraint: "NOT NULL"
    ordinalPosition: 4
  - name: "unit_sales"
    fullyQualifiedName: "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.unit_sales"
    displayName: "Unit Sales"
    dataType: "DECIMAL"
    precision: "10"
    scale: "4"
    columnConstraint: "NOT NULL"
    ordinalPosition: 5

Table Constraint Object

The Table Constraint Object describes a constraint defined at table level.

Fixed Fields
Field Name Type Description
constraintType string Type of constraint. Admissible values are: UNIQUE, PRIMARY_KEY, FOREIGN_KEY.
columns [string:fqn] List of column fullyQualifiedNames corresponding to the constraint.
Table Constraint Object Example

The following shows an example of a composite primary key defined on columns customer_id and product_id.

{
  "constraintType": "PRIMARY_KEY",
  "columns": [
    "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.customer_id",
    "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.product_id"
  ]
}
constraintType: "PRIMARY_KEY"
columns:
  - "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.customer_id"
  - "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.product_id"

The following shows an example of a foreign key defined on column customer_id that references column customer_id of table customer.

{
  "constraintType": "FOREIGN_KEY",
  "columns": [
    "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.customer_id",
    "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.customer.customer_id"
  ]
}
constraintType: "FOREIGN_KEY"
columns:
  - "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.customer_id"
  - "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.customer.customer_id"
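
The field rules above can be checked mechanically. The following is a minimal sketch, not part of the specification: the function name `validate_table_constraint` is hypothetical, and the requirement that `columns` be non-empty is an assumption, since the specification only lists the admissible `constraintType` values.

```python
# Admissible values taken from the Fixed Fields table above.
ADMISSIBLE_CONSTRAINT_TYPES = {"UNIQUE", "PRIMARY_KEY", "FOREIGN_KEY"}


def validate_table_constraint(constraint: dict) -> list[str]:
    """Return the problems found in a Table Constraint Object (empty if none)."""
    problems = []
    constraint_type = constraint.get("constraintType")
    if constraint_type not in ADMISSIBLE_CONSTRAINT_TYPES:
        problems.append(f"unknown constraintType: {constraint_type!r}")
    columns = constraint.get("columns")
    # Assumption: a constraint without columns is treated as invalid.
    if not isinstance(columns, list) or not columns:
        problems.append("columns must be a non-empty list")
    elif not all(isinstance(c, str) and c for c in columns):
        problems.append("every entry in columns must be a non-empty string")
    return problems
```

An empty list means the object passed these basic checks; it does not verify that the listed fullyQualifiedNames point to existing columns.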

Table Partition Object

The Table Partition Object describes a partition defined at table level.

Fixed Fields

| Field Name | Type | Description |
|---|---|---|
| columns | [string:fqn] | List of column fullyQualifiedNames corresponding to the partition. |
| intervalType | string | Type of partition interval. Admissible values are: TIME-UNIT, INTEGER-RANGE, INGESTION-TIME, COLUMN-VALUE. |
| interval | string | Partition interval, for example hourly, daily, or monthly. |
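
As an illustration of the fields above, the following sketch builds a Table Partition Object and prints it as JSON. The helper `make_table_partition` and the `time_id` column are hypothetical; only the field names and the admissible `intervalType` values come from the specification.

```python
import json

# Admissible values taken from the Fixed Fields table above.
ADMISSIBLE_INTERVAL_TYPES = {"TIME-UNIT", "INTEGER-RANGE", "INGESTION-TIME", "COLUMN-VALUE"}


def make_table_partition(columns, interval_type, interval):
    """Build a Table Partition Object, rejecting unknown interval types."""
    if interval_type not in ADMISSIBLE_INTERVAL_TYPES:
        raise ValueError(f"unknown intervalType: {interval_type!r}")
    return {"columns": list(columns), "intervalType": interval_type, "interval": interval}


partition = make_table_partition(
    # Hypothetical column, used here only for illustration.
    ["urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.sales_fact_dec_1998.time_id"],
    "TIME-UNIT",
    "daily",
)
print(json.dumps(partition, indent=2))
```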

Column Object

The Column Object describes a column of a database's table.

Fixed Fields

| Field Name | Type | Description |
|---|---|---|
| name | string | The local name (i.e. not the fully qualified name) of the column. It is equal to - when the column is an unnamed field of a STRUCT dataType (for example, BigQuery supports structs with unnamed fields). In all other cases, it MUST be unique within a table. |
| displayName | string | The human-readable name of the column. It MAY be used by frontend tools to display the column's name in place of the name property. It is RECOMMENDED not to use the same displayName for different columns belonging to the same table. |
| fullyQualifiedName | string:fqn | (REQUIRED) The fully qualified name of the column, built by concatenating table.fullyQualifiedName and column.name. It is RECOMMENDED to use a universally unique identifier of the form urn:dsas:{org-namespace}:tables:{table.fullyQualifiedName}:{column.name}. It is RECOMMENDED to use your company's domain name in reverse dot notation (e.g. it.quantyca) as org-namespace, so that the fullyQualifiedName is universally unique. Example: "fullyQualifiedName": "urn:dsas:it.quantyca:tables:mysqld-prod:dwh:sales:1:productId". For inbound compatibility with Open Metadata v0.12.1, the fullyQualifiedName MAY also take the simpler form datastoreName.databaseName.tableName.columnName. Example: "fullyQualifiedName": "mysqld-prod.dwh.sales.productId". |
| description | string | Description of the column. |
| dataType | string | Data type of the column. Admissible values are: NUMBER, TINYINT, SMALLINT, INT, BIGINT, BYTEINT, BYTES, FLOAT, DOUBLE, DECIMAL, NUMERIC, TIMESTAMP, TIME, DATE, DATETIME, INTERVAL, STRING, MEDIUMTEXT, TEXT, CHAR, VARCHAR, BOOLEAN, BINARY, VARBINARY, ARRAY, BLOB, LONGBLOB, MEDIUMBLOB, MAP, STRUCT, UNION, SET, GEOGRAPHY, ENUM, JSON. |
| dataLength | integer | Length of CHAR, VARCHAR, BINARY, and VARBINARY dataTypes; null otherwise. For example, VARCHAR(20) has dataType VARCHAR and dataLength 20. |
| precision | integer | The precision of a numeric value is the total count of significant digits, that is, the number of digits on both sides of the decimal point. Precision applies to integer types, such as INT, SMALLINT, and BIGINT, as well as to other numeric types, such as NUMBER, DECIMAL, DOUBLE, and FLOAT. |
| scale | integer | The scale of a numeric value is the count of decimal digits in the fractional part, to the right of the decimal point. For integer types, the scale is 0. It mainly applies to non-integer numeric types, such as NUMBER, DECIMAL, DOUBLE, and FLOAT. |
| jsonSchema | string | The JSON Schema of the column when dataType is JSON; null otherwise. |
| columnConstraint | string | The column-level constraint. Admissible values are: NULL, NOT_NULL, UNIQUE, PRIMARY_KEY. |
| ordinalPosition | integer | The ordinal position of the column in the table. |

Column Object Example
{
  "name": "customer_id",
  "fullyQualifiedName": "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.customer.customer_id",
  "displayName": "Customer ID",
  "dataType": "INT",
  "columnConstraint": "PRIMARY_KEY",
  "ordinalPosition": 1
}
name: "customer_id"
fullyQualifiedName: "urn:dsas:it.quantyca:tables:foodmart.foodmartdb.dwh.customer.customer_id"
displayName: "Customer ID"
dataType: "INT"
columnConstraint: "PRIMARY_KEY"
ordinalPosition: 1
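
The precision and scale definitions given in the Fixed Fields can be made concrete with a small worked example. The sketch below uses Python's standard `decimal` module to count the digits of a literal and to check whether it fits a column declared with a given precision and scale. The helper names `precision_and_scale` and `fits` are hypothetical, and real database engines may apply additional rounding or storage rules.

```python
from decimal import Decimal


def precision_and_scale(literal: str) -> tuple[int, int]:
    """Return (precision, scale) of a finite decimal literal such as "123456.7891"."""
    number = Decimal(literal)
    if not number.is_finite():
        raise ValueError(f"not a finite number: {literal!r}")
    _sign, digits, exponent = number.as_tuple()
    if exponent >= 0:
        # Whole number: every digit, plus any implied trailing zeros, is integral.
        return len(digits) + exponent, 0
    scale = -exponent
    # Values below 1 (e.g. 0.05) still need `scale` digits in total.
    return max(len(digits), scale), scale


def fits(literal: str, precision: int, scale: int) -> bool:
    """Check whether a literal fits a column declared with this precision and scale."""
    value_precision, value_scale = precision_and_scale(literal)
    integer_digits = value_precision - value_scale
    return value_scale <= scale and integer_digits <= precision - scale


print(precision_and_scale("123456.7891"))  # (10, 4)
print(fits("123456.7891", 10, 4))          # True
print(fits("1234567.1", 10, 4))            # False: 7 integer digits, only 6 allowed
```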

Components Object

The Components Object holds a set of reusable objects for different aspects of the API. All objects defined within the components object will not affect the Datastore API unless they are explicitly referenced from properties outside the components object.

Fixed Fields

| Field Name | Type | Description |
|---|---|---|
| serverInfo | Map[string, Server Info Object \| Reference Object] | An object to hold reusable Server Info Objects. |
| tables | Map[string, Table Object \| Reference Object] | An object to hold reusable Table Objects. |

This object MAY be extended with Specification Extensions.

All the fixed fields declared above are objects that MUST use keys that match the regular expression: ^[a-zA-Z0-9\.\-_]+$.
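
The key rule above can be checked with a few lines of code. This is a minimal sketch: the function name `invalid_component_keys` is hypothetical, while the regular expression and the two section names are taken from this section.

```python
import re

# Regular expression taken verbatim from the rule above.
COMPONENT_KEY_PATTERN = re.compile(r"^[a-zA-Z0-9\.\-_]+$")


def invalid_component_keys(components: dict) -> list[str]:
    """Return "section.key" entries whose key violates the naming rule."""
    bad = []
    for section in ("serverInfo", "tables"):
        for key in components.get(section, {}):
            # fullmatch also rejects keys that end with a newline.
            if not COMPONENT_KEY_PATTERN.fullmatch(key):
                bad.append(f"{section}.{key}")
    return bad


components = {
    "serverInfo": {"prod.db-1": {}},
    "tables": {"sales_fact": {}, "bad key": {}},
}
print(invalid_component_keys(components))  # ['tables.bad key']
```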

Reference Object

The Reference Object allows referencing other components in the Datastore API Document, internally and externally.

The $ref string value contains a URI (RFC3986), which identifies the location of the value being referenced.

See the rules for resolving Relative References.

Fixed Fields

| Field Name | Type | Description |
|---|---|---|
| description | string | A description which by default SHOULD override that of the referenced component. CommonMark syntax MAY be used for rich text representation. If the referenced object type does not allow a description field, then this field has no effect. |
| mediaType | string | The media type of the referenced object. It must conform to the media type format defined in RFC6838. |
| $ref | string | (REQUIRED) The reference identifier. This MUST be in the form of a URI. |

This object cannot be extended with additional properties and any properties added SHALL be ignored.

Reference Object Example
{
  "$ref": "#/components/schemas/Pet"
}
$ref: "#/components/schemas/Pet"
Relative Schema Document Example
{
  "$ref": "Pet.json"
}
$ref: "Pet.json"
Relative Documents With Embedded Schema Example
{
  "$ref": "definitions.json#/Pet"
}
$ref: "definitions.json#/Pet"

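A same-document reference such as the first Reference Object example can be resolved by walking the document along the URI fragment. The sketch below assumes that the fragment is a JSON Pointer (RFC 6901), as in OpenAPI; this section does not state that explicitly. The function name `resolve_local_ref` and the `customer` table key are hypothetical, and references to external documents are deliberately not handled.

```python
from urllib.parse import unquote


def resolve_local_ref(document, ref: str):
    """Resolve a "#/..." reference against an already parsed document."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled in this sketch")
    pointer = unquote(ref[1:])  # URI fragments may be percent-encoded
    if pointer == "":
        return document  # "#" refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer fragment: {ref!r}")
    target = document
    for raw_token in pointer[1:].split("/"):
        # RFC 6901 escapes: "~1" stands for "/", "~0" stands for "~".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        target = target[int(token)] if isinstance(target, list) else target[token]
    return target


document = {"components": {"tables": {"customer": {"name": "customer"}}}}
print(resolve_local_ref(document, "#/components/tables/customer"))  # {'name': 'customer'}
```

A missing key raises `KeyError`, which a real tool would likely turn into a clearer validation error.
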
External Resource Object

The External Resource Object allows referencing an external resource like a documentation page or a standard definition.

Fixed Fields

| Field Name | Type | Description |
|---|---|---|
| description | string | A description of the target resource. CommonMark syntax MAY be used for rich text representation. |
| mediaType | string | The media type of the target resource. It must conform to the media type format defined in RFC6838. |
| $href | string:uri | (REQUIRED) The URI of the target resource. It must conform to the URI format defined in RFC3986. |

This object cannot be extended with additional properties and any properties added SHALL be ignored.

External Resource Object Example
{
  "description": "Find more info here",
  "mediaType": "text/html",
  "$href": "https://example.com"
}
description: "Find more info here"
mediaType: "text/html"
$href: "https://example.com"

Standard Definition Object

The Standard Definition Object formally describes an object of interest (e.g. a table schema) according to a given standard specification.

Fixed Fields

| Field Name | Type | Description |
|---|---|---|
| id | string:uuid | (READONLY) The UUID of the definition. It is set on the server side when the object can be reused in another context (e.g. a table schema definition used in multiple APIs). It is RECOMMENDED to use a name-based UUID version 3 (RFC 4122) generated from the concatenation of name and version separated by :. Per RFC 4122, version 3 hashes the name with MD5; the SHA-1 variant is version 5. |
| name | string:name | The name of the defined object. It is set when the object can be reused in other contexts (e.g. a table schema definition used in multiple APIs). It is RECOMMENDED to use a camel case formatted string. |
| version | string | The version of the defined object. It is set when the object can be reused in another context (e.g. a table schema definition used in multiple APIs). |
| description | string | The standard definition description. CommonMark syntax MAY be used for rich text representation. |
| specification | string | (REQUIRED) The external specification used in the definition. |
| specificationVersion | string | The version of the external specification used in the definition. If not defined, the version MUST be included in the definition itself. |
| definition | object \| string \| Reference Object | (REQUIRED) The formal definition built using the specification declared in the specification field. |
| externalDocs | External Resource Object | Additional external documentation for the standard definition. |

This object MAY be extended with Specification Extensions.

Standard Definition Object Example:
{
  "specification": "schemata",
  "specificationVersion": "1",
  "definition": {
    "mediaType": "application/x-protobuf",
    "$ref": "trip-status.proto"
  }   
} 
specification: "schemata"
specificationVersion: "1"
definition:
  mediaType: "application/x-protobuf"
  $ref: "trip-status.proto"
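
The `id` field is described as a name-based UUID derived from `name` and `version` joined by `:`. The sketch below shows one way to derive such an identifier with Python's standard `uuid` module. RFC 4122 defines two name-based variants, version 3 (MD5) and version 5 (SHA-1), and the sketch exposes both. The function name `definition_id`, the example name `tripStatus`, and the choice of `uuid.NAMESPACE_URL` as namespace are assumptions; the specification does not prescribe a namespace, so other implementations may produce different values for the same input.

```python
import uuid


def definition_id(name: str, version: str, sha1: bool = False,
                  namespace: uuid.UUID = uuid.NAMESPACE_URL) -> uuid.UUID:
    """Derive a deterministic UUID from "name:version".

    Assumption: NAMESPACE_URL is an arbitrary choice for this sketch.
    """
    key = f"{name}:{version}"
    # uuid3 hashes with MD5 (version 3); uuid5 hashes with SHA-1 (version 5).
    return uuid.uuid5(namespace, key) if sha1 else uuid.uuid3(namespace, key)


first = definition_id("tripStatus", "1.0.0")
second = definition_id("tripStatus", "1.0.0")
print(first == second, first.version)  # True 3
```

Because the value depends only on the inputs and the namespace, a server can recompute the same `id` whenever the same definition is registered again.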

Specification Extensions

While the Datastore API Specification tries to accommodate most use cases, additional data can be added to extend the specification at certain points. The extension properties are implemented as patterned fields that are always prefixed by x-.


| Field Pattern | Type | Description |
|---|---|---|
| ^x- | Any | Allows extensions to the Datastore API Schema. The field name MUST begin with x-, for example, x-internal-id. The value can be any valid JSON value: null, a primitive, an array, or an object. |

Available tooling may or may not support a given extension, but internal or open-source tools can be extended to add the requested support.
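
A tool that does not understand a given extension can still separate extension fields from fixed fields and pass them through untouched. The following minimal sketch relies only on the `x-` prefix rule stated above; the function name `split_extensions` and the sample field values are hypothetical.

```python
def split_extensions(obj: dict) -> tuple[dict, dict]:
    """Split an object into (fixed fields, extension fields) using the x- prefix."""
    fixed = {key: value for key, value in obj.items() if not key.startswith("x-")}
    extensions = {key: value for key, value in obj.items() if key.startswith("x-")}
    return fixed, extensions


column = {"name": "customer_id", "dataType": "INT", "x-internal-id": 42}
fixed, extensions = split_extensions(column)
print(fixed)       # {'name': 'customer_id', 'dataType': 'INT'}
print(extensions)  # {'x-internal-id': 42}
```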

Appendix A: Revision History


| Version | Date | Notes |
|---|---|---|
| 1.0.0 | 2024-Q4 | Release of the Datastore API Specification 1.0.0 |
| 1.0.0-DRAFT | 2022-Q1 | Release of the Datastore API Specification 1.0.0-DRAFT |