4 changes: 2 additions & 2 deletions .github/workflows/build-test.yml
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@ env:
# modules commonly excluded from builds as they have their own independent non-JVM setups and can be run in parallel.
# take care when modifying this list because GLVs use shell commands to remove themselves from this list and
# modifications could break patterns of replacement they are searching for.
EXCLUDE_MODULES: '-:gremlin-dotnet-source,-:gremlin-dotnet-tests,-:gremlin-go,-:gremlin-javascript,-:gremlint,-:gremlin-python'
EXCLUDE_MODULES: '-:gremlin-dotnet-source,-:gremlin-dotnet-tests,-:gremlin-go,-:gremlin-javascript,-:gremlint,-:gremlin-mcp,-:gremlin-python'
EXCLUDE_FOR_GLV: '-:gremlin-annotations,-:gremlin-archetype,-:gremlin-console,-:hadoop-gremlin,-:neo4j-gremlin,-:spark-gremlin,-:sparql-gremlin'
jobs:
smoke:
@@ -237,7 +237,7 @@ jobs:
run: |
EXCLUDE="-:gremlin-dotnet-source,-:gremlin-dotnet-tests,-:gremlin-go,-:gremlin-python,$EXCLUDE_FOR_GLV"
mvn clean install -pl $EXCLUDE -q -DskipTests -Dci
mvn verify -pl :gremlin-javascript,:gremlint
mvn verify -pl :gremlin-javascript,:gremlint,:gremlin-mcp
python:
name: python
timeout-minutes: 20
1 change: 1 addition & 0 deletions CHANGELOG.asciidoc
@@ -25,6 +25,7 @@ image::https://raw.githubusercontent.com/apache/tinkerpop/master/docs/static/ima

This release also includes changes from <<release-3-7-XXX, 3.7.XXX>>.

* Added a Gremlin MCP server.
* Added the Air Routes dataset to the set of available samples packaged with distributions.
* Added a minimal distribution for `tinkergraph-gremlin` using the `min` classifier that doesn't include the sample datasets.
* Removed Vertex/ReferenceVertex from grammar. Use vertex id in traversals now instead.
24 changes: 16 additions & 8 deletions docs/src/dev/developer/development-environment.asciidoc
@@ -309,17 +309,25 @@ See the <<release-environment,Release Environment>> section for more information
[[nodejs-environment]]
=== JavaScript Environment

When building `gremlin-javascript`, mvn command will include a local copy of Node.js runtime and npm inside your project
using `com.github.eirslett:frontend-maven-plugin` plugin. This copy of the Node.js runtime will not affect any
other existing Node.js runtime instances in your machine.
When building `gremlin-javascript`, `gremlint` and `gremlin-mcp`, the `mvn` command will install a local copy of the Node.js
runtime and npm inside your project using the `com.github.eirslett:frontend-maven-plugin` plugin. This copy of the Node.js
runtime will not affect any other existing Node.js runtime instances on your machine.

To run the development and build scripts of `gremlint` and its corresponding web page `docs/gremlint`, Node.js and npm
have to be installed. When generating or publishing the TinkerPop website, the `docs/gremlint` web page has to be
To run the development and build scripts of the web app in `docs/gremlint`, Node.js and npm have to be installed on the
local system at this time. When generating or publishing the TinkerPop website, the `docs/gremlint` web page has to be
built. Consequently, the scripts `bin/generate-home.sh` and `bin/publish-home.sh` require that Node.js and npm are
installed. Version 8.x or newer of npm is required. This is covered in more detail in the <<site,Site>> section.
installed. Check the root `pom.xml` for the `runtime.node.version` property for the minimum version required. This is
covered in more detail in the <<site,Site>> section.

As of TinkerPop 3.5.5, `gremlin-javascript` uses Docker for all tests inside of Maven. Please make sure Docker is
installed and running on your system.
A fast way to test `gremlin-mcp` after doing a build is to use link:https://modelcontextprotocol.io/docs/tools/inspector[@modelcontextprotocol/inspector],
which starts the Gremlin MCP server and presents a browser-based tool for invoking its commands. It is most easily
launched with `npx` as follows:

[source,text]
----
# from the root of the repository
$ npx @modelcontextprotocol/inspector node gremlin-mcp/src/main/javascript/dist/server.js -e GREMLIN_MCP_ENDPOINT=localhost:8182/g -e GREMLIN_MCP_LOG_LEVEL=info
----

IMPORTANT: Beware of unexpected or unwanted changes on `package-lock.json` files when committing and merging.

123 changes: 119 additions & 4 deletions docs/src/reference/gremlin-applications.asciidoc
@@ -2250,7 +2250,7 @@ scrubbing, it would be quite simple to do:

[source,java]
----
String lbl = "person"
String lbl = "person";
String nodeId = "mary').next();g.V().drop().iterate();g.V().has('id', 'thomas";
String query = "g.addV('" + lbl + "').property('identifier','" + nodeId + "')";
client.submit(query);
@@ -2262,13 +2262,13 @@ part of the "identifier" for the vertex on insertion:

[source,java]
----
String lbl = "person"
String lbl = "person";
String nodeId = "mary').next();g.V().drop().iterate();g.V().has('id', 'thomas";
String query = "g.addV(lbl).property('identifier',nodeId)";

Map<String,Object> params = new HashMap<>();
params.put("lbl",lbl);
params.put("nodeId",nodeId);
params.put("lbl", lbl);
params.put("nodeId", nodeId);
client.submit(query, params);
----

@@ -3009,3 +3009,118 @@ for `HadoopGraph`:
----
describeGraph(HadoopGraph)
----

[[gremlin-mcp]]
=== Gremlin MCP

Gremlin MCP integrates Apache TinkerPop with the Model Context Protocol (MCP) so that MCP‑capable assistants (for
example, desktop chat clients that support MCP) can discover your graph, run Gremlin traversals and exchange graph data
through a small set of well‑defined tools. It lets users “talk to their graph” while keeping the full power of Gremlin
available when they or the assistant need it.

MCP is an open protocol that lets assistants call server‑hosted tools in a structured way. Each tool has a name, an
input schema, and a result schema. When connected to a Gremlin MCP server, the assistant can:

* Inspect the server’s health and connection to a Gremlin data source
* Discover the graph’s schema (labels, properties, relationships, counts)
* Execute Gremlin traversals

The Gremlin MCP server sits alongside Gremlin Server (or any TinkerPop‑compatible endpoint) and forwards tool calls to
the graph via standard Gremlin traversals.

IMPORTANT: This MCP server is designed for development and trusted environments.

WARNING: Gremlin MCP can modify the graph to which it is connected. To prevent such changes, ensure that Gremlin MCP is
configured to work against a read-only instance of the graph. Graphs hosted in Gremlin Server can get that protection by
configuring their traversal source with `withStrategies(ReadOnlyStrategy)`.

WARNING: Gremlin MCP executes global graph traversals to help it understand the schema and gather statistics. On a large
graph these queries will be costly. If you are trying Gremlin MCP, please try it with a smaller subset of your graph for
experimentation purposes.

MCP defines a simple request/response model for invoking named tools. A tool declares its input and output schema so an
assistant can construct valid calls and reason about results. The Gremlin MCP server implements several tools and, when
invoked by an MCP client, translates those calls to Gremlin traversals against a configured Gremlin endpoint. The
endpoint is typically Gremlin Server, but it can be any graph system that implements the same protocols.

TIP: Gremlin MCP does not replace Gremlin itself. It complements it by helping assistants discover data and propose
traversals. You can always provide an explicit traversal when you know what you want.

The Gremlin MCP server exposes these tools:

* `get_graph_status` — Returns basic health and connectivity information for the backing Gremlin data source.
* `get_graph_schema` — Discovers vertex labels, edge labels, property keys, and relationship patterns. Low‑cardinality
properties may be surfaced as enums to encourage valid values in queries.
* `run_gremlin_query` — Executes an arbitrary Gremlin traversal and returns JSON results.
* `refresh_schema_cache` — Forces schema discovery to run again when the graph has changed.
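
As a rough illustration of what an MCP client sends when it invokes one of these tools, the sketch below builds a JSON-RPC 2.0 `tools/call` request. The `tools/call` method and its `name`/`arguments` parameters follow the MCP specification. The `query` argument key and the `buildToolCall` helper are assumptions made for this example and are not taken from the server's declared input schema.

```javascript
// Build a JSON-RPC 2.0 request for MCP's `tools/call` method.
// NOTE: `buildToolCall` is a hypothetical helper, and the `query` argument key
// is an assumption; consult the tool's declared input schema for the real name.
function buildToolCall(id, toolName, args = {}) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, "run_gremlin_query", {
  query: "g.V().hasLabel('person').count()",
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library constructs and sends this message, so it is shown here only to make the request/response model concrete.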

==== Schema discovery

Schema discovery is the foundation that lets humans and AI assistants reason about a graph without prior tribal
knowledge. By automatically mapping the graph’s structure and commonly observed patterns, it produces a concise,
trustworthy description that accelerates onboarding, improves the quality of suggested traversals, and reduces
trial‑and‑error against production data. For assistants, a discovered schema becomes the guidance layer for planning
valid queries, generating meaningful filters, and explaining results in natural language. For operators, it offers safer
and more efficient interactions by avoiding blind exploratory scans, enabling caching and change detection, and
providing hooks to steer what should or shouldn’t be surfaced (for example, excluding sensitive or non‑categorical
fields). In short, schema discovery turns an opaque dataset into an actionable contract between your graph and the tools

excluding sensitive or non‑categorical fields

How does it make that determination? It might be worth adding examples, such as id.

that use it.

Schema discovery uses Gremlin traversals and sampling to uncover the following information about the graph:

* Labels - Vertex and edge labels are collected and de‑duplicated.
* Properties - For each label, a sample of elements is inspected to list observed property keys.
* Counts (optional) - Approximate counts can be included per label.
* Relationship patterns - Connectivity is derived from the labels of edges and their incident vertices.
* Enums - Properties with a small set of distinct values may be surfaced as enumerations to promote precise filters.
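
To make the idea of relationship patterns concrete, here is a minimal sketch of collapsing a sample of edges into unique patterns. It is not the server's actual implementation: the `sampledEdges` data, the `outV`/`inV` field names, and the `relationshipPatterns` helper are all invented for this example.

```javascript
// Hypothetical sample: each edge reduced to its own label plus the labels of
// its outgoing and incoming vertices.
const sampledEdges = [
  { outV: "person", label: "knows", inV: "person" },
  { outV: "person", label: "created", inV: "software" },
  { outV: "person", label: "knows", inV: "person" },
];

// Collapse the sample into unique (outV label)-[edge label]->(inV label) patterns.
function relationshipPatterns(edges) {
  const seen = new Set();
  for (const e of edges) {
    seen.add(`${e.outV}-[${e.label}]->${e.inV}`);
  }
  return [...seen].sort();
}

const patterns = relationshipPatterns(sampledEdges);
console.log(patterns);
```

Because the patterns come from a sample, a rarely used edge label can be missed, which is one reason a schema refresh may be needed after the graph changes.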

==== Executing traversals

When the assistant needs to answer a question, a common sequence is:

. Optionally, call `get_graph_status`.
. Retrieve (or reuse) schema via `get_graph_schema`.
. Formulate a traversal and call `run_gremlin_query`.
. Present results and, if required, refine the traversal.

For example, the assistant may execute a traversal like the following:

[source,groovy]
----
// list the names of the people known by anyone over 30
g.V().hasLabel('person').has('age', gt(30)).out('knows').values('name')
----
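
The sequence above can be sketched from the client's point of view as follows. This is only an illustration under stated assumptions: `callTool` is a stub standing in for a real MCP client, and every result shape it returns (`connected`, `vertexLabels`, `edgeLabels`, `results`) is invented here rather than taken from the server's result schemas.

```javascript
// Stub standing in for a real MCP client; all result shapes are hypothetical.
async function callTool(name, args = {}) {
  const canned = {
    get_graph_status: { connected: true },
    get_graph_schema: { vertexLabels: ["person"], edgeLabels: ["knows"] },
    run_gremlin_query: { results: ["name-1", "name-2"], query: args.query },
  };
  return canned[name];
}

async function answerQuestion() {
  // 1. Optional health check.
  const status = await callTool("get_graph_status");
  if (!status.connected) throw new Error("graph is not reachable");

  // 2. Fetch the schema and confirm the labels the traversal relies on exist.
  const schema = await callTool("get_graph_schema");
  if (!schema.vertexLabels.includes("person") || !schema.edgeLabels.includes("knows")) {
    return [];
  }

  // 3. Formulate and run the traversal.
  const query = "g.V().hasLabel('person').has('age', gt(30)).out('knows').values('name')";
  const response = await callTool("run_gremlin_query", { query });

  // 4. Hand the results back for presentation; refinement would repeat step 3.
  return response.results;
}

answerQuestion().then((names) => console.log(names));
```

Checking the schema before running the traversal mirrors the purpose of schema discovery: the assistant avoids issuing queries against labels the graph does not contain.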

==== Configuring an MCP Client

The MCP client is responsible for launching the Gremlin MCP server and providing connection details for the Gremlin
endpoint the server should use.

Basic connection settings:

* `GREMLIN_MCP_ENDPOINT` — `host:port` or `host:port/traversal_source` for the target Gremlin Server or compatible endpoint (default traversal source: `g`)
* `GREMLIN_MCP_USE_SSL` — set to `true` when TLS is required by the endpoint (default: `false`)
* `GREMLIN_MCP_USERNAME` / `GREMLIN_MCP_PASSWORD` — credentials when authentication is enabled (optional)
* `GREMLIN_MCP_IDLE_TIMEOUT` — idle connection timeout in seconds (default: `300`)
* `GREMLIN_MCP_LOG_LEVEL` — logging verbosity for troubleshooting: `error`, `warn`, `info`, or `debug` (default: `info`)

Advanced schema discovery and performance tuning:

* `GREMLIN_MCP_ENUM_DISCOVERY_ENABLED` — enable enum property discovery (default: `true`)
* `GREMLIN_MCP_ENUM_CARDINALITY_THRESHOLD` — max distinct values for a property to be considered an enum (default: `10`)
* `GREMLIN_MCP_ENUM_PROPERTY_DENYLIST` — comma-separated property names to exclude from enum detection (default: `id,pk,name,description,startDate,endDate,timestamp,createdAt,updatedAt`)
* `GREMLIN_MCP_SCHEMA_MAX_ENUM_VALUES` — limit the number of enum values returned per property in the schema (default: `10`)
* `GREMLIN_MCP_SCHEMA_INCLUDE_SAMPLE_VALUES` — include small example values for properties in the schema (default: `false`)
* `GREMLIN_MCP_SCHEMA_INCLUDE_COUNTS` — include approximate vertex/edge label counts in the schema (default: `false`)
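
A minimal sketch of how a server might resolve these settings, using the defaults listed above, is shown here. The `loadConfig` and `parseEndpoint` helpers and the shape of the returned object are assumptions for illustration and are not the actual `gremlin-mcp` code. The endpoint parsing is deliberately naive and does not handle IPv6 hosts.

```javascript
// Split `host:port` or `host:port/traversal_source`, defaulting the source to `g`.
// Hypothetical helper; naive on purpose (no IPv6 support).
function parseEndpoint(endpoint) {
  const [hostPort, traversalSource = "g"] = endpoint.split("/");
  const [host, port] = hostPort.split(":");
  return { host, port: Number.parseInt(port, 10), traversalSource };
}

// Resolve settings from an environment-like object, falling back to the documented defaults.
function loadConfig(env) {
  const bool = (v, d) => (v === undefined ? d : v.toLowerCase() === "true");
  const int = (v, d) => (v === undefined ? d : Number.parseInt(v, 10));
  return {
    endpoint: parseEndpoint(env.GREMLIN_MCP_ENDPOINT),
    useSsl: bool(env.GREMLIN_MCP_USE_SSL, false),
    idleTimeoutSeconds: int(env.GREMLIN_MCP_IDLE_TIMEOUT, 300),
    logLevel: env.GREMLIN_MCP_LOG_LEVEL ?? "info",
    enumDiscoveryEnabled: bool(env.GREMLIN_MCP_ENUM_DISCOVERY_ENABLED, true),
    enumCardinalityThreshold: int(env.GREMLIN_MCP_ENUM_CARDINALITY_THRESHOLD, 10),
    includeCounts: bool(env.GREMLIN_MCP_SCHEMA_INCLUDE_COUNTS, false),
  };
}

// In a real process, `process.env` would be passed instead of a literal.
const config = loadConfig({
  GREMLIN_MCP_ENDPOINT: "localhost:8182",
  GREMLIN_MCP_USE_SSL: "true",
});
console.log(config);
```

Passing the environment in as an argument, rather than reading `process.env` directly, keeps the function easy to exercise with different settings.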

The enum-related configurations warrant additional explanation as to their importance. Treating only truly categorical
properties as enums prevents misleading suggestions and sensitive data exposure in assistant‑facing schemas. Without a
denylist and related controls, low‑sample snapshots can make non‑categorical fields like IDs, timestamps, or free text
appear “enum‑like,” degrading query guidance and result explanations. By explicitly excluding such keys, the schema
remains focused on meaningful categories (e.g., status or type), which improves AI query formulation, reduces noise, and
avoids surfacing unstable or private values. It also streamlines schema discovery by skipping properties that would
create large or frequently changing value sets, improving performance and stability.
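
The interaction between the denylist and the cardinality threshold can be sketched as follows. This is an illustration rather than the server's implementation: the `detectEnums` helper and the sample data are hypothetical, while the default denylist and threshold values are the ones documented above.

```javascript
// Default denylist as documented for GREMLIN_MCP_ENUM_PROPERTY_DENYLIST.
const DEFAULT_DENYLIST = new Set([
  "id", "pk", "name", "description", "startDate",
  "endDate", "timestamp", "createdAt", "updatedAt",
]);

// Hypothetical helper: given sampled values per property key, keep only keys
// that are not denylisted and have at most `threshold` distinct values.
function detectEnums(samples, { threshold = 10, denylist = DEFAULT_DENYLIST } = {}) {
  const enums = {};
  for (const [key, values] of Object.entries(samples)) {
    if (denylist.has(key)) continue; // skip IDs, timestamps, free text, etc.
    const distinct = [...new Set(values)];
    if (distinct.length > 0 && distinct.length <= threshold) {
      enums[key] = distinct;
    }
  }
  return enums;
}

// Invented sample: `id` has few distinct values here only because the sample is small.
const enums = detectEnums({
  status: ["active", "inactive", "active"],
  id: ["a1", "b2"],
  type: ["airport", "country"],
});
console.log(enums);
```

Without the denylist, `id` would pass the threshold check in this small sample and be reported as an enum, which is the misleading outcome the paragraph above describes.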

Consult the MCP client documentation for how environment variables are supplied and how tool calls are approved and
presented to the user.

10 changes: 10 additions & 0 deletions docs/src/upgrade/release-3.8.x.asciidoc
@@ -30,6 +30,16 @@ complete list of all the modifications that are part of this release.

=== Upgrading for Users

==== Gremlin MCP Server

Gremlin MCP Server is an experimental application that implements the link:https://modelcontextprotocol.io/[Model Context Protocol]
(MCP) to expose Gremlin Server-backed graph operations to MCP-capable clients such as Claude Desktop, Cursor, or
Windsurf. Through this integration, graph structure can be discovered, and Gremlin traversals can be executed. Basic
health checks are included to validate connectivity.

A running Gremlin Server that fronts the target TinkerPop graph is required. An MCP client can be configured to connect
to the Gremlin MCP Server endpoint.

==== Air Routes Dataset

The Air Routes sample dataset has long been used to help showcase and teach Gremlin. Popularized by the first edition
10 changes: 4 additions & 6 deletions gremlin-javascript/pom.xml
@@ -28,8 +28,6 @@ limitations under the License.
<properties>
<maven.test.skip>false</maven.test.skip>
<skipTests>${maven.test.skip}</skipTests>
<npm.version>10.8.2</npm.version>
<node.version>v20.19.4</node.version>
</properties>
<build>
<directory>${basedir}/target</directory>
@@ -184,8 +182,8 @@ limitations under the License.
</executions>
<configuration>
<workingDirectory>src/main/javascript/gremlin-javascript</workingDirectory>
<nodeVersion>${node.version}</nodeVersion>
<npmVersion>${npm.version}</npmVersion>
<nodeVersion>${runtime.node.version}</nodeVersion>
<npmVersion>${runtime.npm.version}</npmVersion>
</configuration>
</plugin>
<!--
@@ -353,8 +351,8 @@ limitations under the License.
-->
<skip>false</skip>
<workingDirectory>src/main/javascript/gremlin-javascript</workingDirectory>
<nodeVersion>${node.version}</nodeVersion>
<npmVersion>${npm.version}</npmVersion>
<nodeVersion>${runtime.node.version}</nodeVersion>
<npmVersion>${runtime.npm.version}</npmVersion>
</configuration>
</plugin>
</plugins>
@@ -2,14 +2,18 @@
"name": "gremlin",
"version": "3.8.0-alpha1",
"description": "JavaScript Gremlin Language Variant",
"author": "Apache TinkerPop team",
"author": {
"name": "Apache TinkerPop team"
},

Should we add the url as well?

https://tinkerpop.apache.org/

Contributor Author

homepage already has the main url. i'm not sure what's normal there.

I'm not sure what is typical. I wasn't aware the author field could contain other values in an object, so I looked up the spec. The only other field that looked useful here was url, unless there's an Apache Tinkerpop email address.

Not including the url here is totally fine. I was just checking in case it was overlooked.

"keywords": [
"graph",
"gremlin",
"tinkerpop",
"apache-tinkerpop",
"connection",
"glv",
"driver",
"database",
"graphdb"
],
"license": "Apache-2.0",