oguzhanunlu commented Oct 24, 2025

ref: https://snplow.atlassian.net/browse/PDP-2203

Upgrade json-schema-validator from 1.0.76 to 1.5.8

Context

The Self-Serve team needs to validate user-submitted configs against schemas in Iglu Central. This functionality exists in Iglu Scala Client but cannot be used due to a dependency conflict.

Console uses com.networknt.json-schema-validator version 1.5.8 (released 2024-06-27), while Iglu Scala Client uses version 1.0.76 (released 2022-12-19).

This PR upgrades Iglu Client to use the newer version, enabling the Self-Serve team to leverage existing schema fetching and validation functionality.

Changes

1. Dependency Updates

| Dependency | Old Version | New Version | Reason |
|---|---|---|---|
| json-schema-validator | 1.0.76 | 1.5.8 | Main upgrade to resolve conflict |
| jackson-databind | 2.14.1 | 2.18.3 | Compatibility requirement for validator 1.5.8 |
| commons-lang3 | (transitive) | 3.17.0 (explicit) | Made explicit for better dependency management |
| sbt | 1.8.2 | 1.11.7 | Required for Sonatype Central Portal migration |
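For orientation, the dependency versions listed above could be declared roughly as follows. This is only a sketch: the repository may keep its versions in a separate `Dependencies` object, and the layout shown here is an assumption rather than the actual build definition.

```scala
// build.sbt (illustrative fragment, not the project's actual build file)
libraryDependencies ++= Seq(
  // Main upgrade: 1.0.76 -> 1.5.8
  "com.networknt"              % "json-schema-validator" % "1.5.8",
  // Bumped from 2.14.1 for compatibility with validator 1.5.8
  "com.fasterxml.jackson.core" % "jackson-databind"      % "2.18.3",
  // Previously transitive, now pinned explicitly
  "org.apache.commons"         % "commons-lang3"         % "3.17.0"
)

// project/build.properties would carry the sbt version:
// sbt.version=1.11.7
```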

2. Code Changes in CirceValidator

The major version jump required handling several breaking API changes:

Schema Loading API

  • Old (1.0.76): URIFetcher with fetch(URI): InputStream
  • New (1.5.8): SchemaLoader with getSchema(AbsoluteIri): InputStreamSource
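A minimal sketch of what a no-op loader might look like against the `SchemaLoader` signature quoted above. The object name `NoOpSchemaLoader` and the choice to return an empty JSON object (a schema that accepts any data) are assumptions for illustration, not necessarily what this PR implements.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.nio.charset.StandardCharsets

import com.networknt.schema.AbsoluteIri
import com.networknt.schema.resource.{InputStreamSource, SchemaLoader}

// Hypothetical name: a loader that never touches the network.
object NoOpSchemaLoader extends SchemaLoader {

  // "{}" is an empty JSON Schema, which matches any instance.
  private val emptySchema: Array[Byte] = "{}".getBytes(StandardCharsets.UTF_8)

  // Every requested IRI resolves to the same in-memory empty schema.
  override def getSchema(absoluteIri: AbsoluteIri): InputStreamSource =
    new InputStreamSource {
      override def getInputStream(): InputStream =
        new ByteArrayInputStream(emptySchema)
    }
}
```

Such a loader would then be registered through the `.schemaLoaders()` builder hook mentioned below, in place of the old `.uriFetcher()` call.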

JsonMetaSchema Builder

  • Old: .addKeyword(), .addMetaSchema(), .uriFetcher()
  • New: .keyword(), .metaSchema(), .schemaLoaders()

Schema Factory Configuration

  • Removed deprecated methods: .forceHttps(), .removeEmptyFragmentSuffix()
  • Migrated SchemaValidatorsConfig to builder pattern

ValidationMessage API

  • Old: m.getPath (returns String), m.getArguments (returns raw list)
  • New: m.getInstanceLocation() (returns nullable, needs .toString()), m.getArguments requires .map(_.toString)
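The null handling described above can be sketched with `Option`. The helper object and method names here are hypothetical; only `getInstanceLocation` and `getArguments` come from the library.

```scala
import com.networknt.schema.ValidationMessage

// Hypothetical helpers illustrating null-safe access to the 1.5.8 accessors.
object MessageFields {

  // getInstanceLocation may be null, so wrap it before calling toString.
  def instancePath(m: ValidationMessage): Option[String] =
    Option(m.getInstanceLocation).map(_.toString)

  // Arguments are no longer plain strings, so convert each one explicitly.
  // String.valueOf also tolerates null elements.
  def arguments(m: ValidationMessage): List[String] =
    Option(m.getArguments)
      .fold(List.empty[String])(_.toList.map(arg => String.valueOf(arg)))
}
```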

3. Validation Error Format Changes

⚠️ Breaking changes in error output format:

| Aspect | Old Format (1.0.76) | New Format (1.5.8) |
|---|---|---|
| Path format | $.field | /field |
| Array paths | $.array[0] | /array/0 |
| Path standard | JSONPath | JSON Pointer (RFC 6901) |
| Error wording | "may only be 3 characters long" | "must be at most 3 characters long" |
| Validation args | Limited context | Includes actual values |

Example:

// Old: "$.address: does not match the ipv4 pattern ^(...regex...)$"
// New: "/address: does not match the ipv4 pattern must be a valid RFC 2673 IP address"
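To make the path change concrete, here is a small, self-contained sketch that converts the simple JSONPath forms shown above (dot-separated fields and numeric array indices) into JSON Pointer. It is not part of this PR; the object name `PathFormats` is hypothetical, and quoted keys, wildcards and other JSONPath features are deliberately unsupported.

```scala
object PathFormats {

  // Matches either ".field" or "[0]" style segments.
  private val Segment = """\.([^.\[\]]+)|\[(\d+)\]""".r

  /** Converts e.g. "$.array[0]" to "/array/0"; None if the path is not understood. */
  def jsonPathToPointer(path: String): Option[String] =
    if (!path.startsWith("$")) None
    else {
      val rest     = path.drop(1)
      val matches  = Segment.findAllMatchIn(rest).toList
      val consumed = matches.map(_.matched.length).sum
      // Reject paths containing anything the segment pattern did not cover.
      if (consumed != rest.length) None
      else
        Some(matches.map { m =>
          val token = Option(m.group(1)).getOrElse(m.group(2))
          // RFC 6901 escaping: "~" becomes "~0", "/" becomes "~1".
          "/" + token.replace("~", "~0").replace("/", "~1")
        }.mkString)
    }
}
```

A consumer that stored old-style paths could use something along these lines to compare them with new-style ones, rather than matching on raw strings.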

4. Test Updates

All test assertions updated to match new error formats.

5. Build Tool Upgrade

Upgraded sbt from 1.8.2 to 1.11.7 to support migration to Sonatype's Central Portal.

⚠️ Important Note: Sonatype sunset the Legacy OSSRH publishing endpoint on 2025-06-30. To continue publishing to the Central Repository, sbt 1.11.x or later and sbt-ci-release 1.11.0 or later are required.

Backward Compatibility

✅ Public API: CirceValidator's public API remains unchanged

⚠️ Error Messages: Applications parsing error messages or paths will need updates

Release Notes Analysis

Release notes from 1.0.76 to 1.5.8 have been analyzed. Key findings:

  • API modernization (builder patterns, better null safety)
  • Improved error messages with JSON Pointer standard
  • Enhanced Jackson compatibility
  • No identified issues that would harm the application
  • Security model maintained (no-op schema loader still prevents network calls)

This upgrade resolves dependency conflicts with Console which uses
json-schema-validator 1.5.8, enabling the Self-Serve team to leverage
Iglu Scala Client for schema validation of user-submitted configs.

Dependency changes:
- json-schema-validator: 1.0.76 → 1.5.8
- jackson-databind: 2.14.1 → 2.18.3 (compatibility requirement)
- commons-lang3: 3.17.0 (made explicit, was transitive dependency)

Breaking API changes handled:
- Migrated from URIFetcher to SchemaLoader API for preventing network calls
- Updated JsonMetaSchema builder methods (addKeyword→keyword, addMetaSchema→metaSchema)
- Replaced deprecated JsonSchemaFactory methods (forceHttps, removeEmptyFragmentSuffix)
- Updated SchemaValidatorsConfig to use builder pattern
- Updated ValidationMessage path extraction (getPath→getInstanceLocation with null handling)

Validation error format changes:
- Path format: $.field → /field (JSONPath → JSON Pointer RFC 6901)
- Array paths: $.array[0] → /array/0
- Error message wording improvements (e.g., "may only be" → "must be at most")
- Validation arguments now properly converted to strings

All tests updated to match new error message format while maintaining
100% backward compatibility of CirceValidator's public API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
ValidatorReport(
"$.country: integer found, string expected",
Some("$.country"),
"/country: integer found, string expected",
Contributor

This looks like a very significant change for Snowplow.

Am I right that this validator report is what gets returned to our customers via bad rows, and therefore what gets displayed to users in the Snowplow Console?

I am a big fan of upgrading this library to the latest version. But if we press ahead with this change, then we need to flag this change in behaviour to our Product team and possibly other Engineering teams too.

I assume there's no option to keep the old-style validation messages?

Member Author

Replying for the record: I pushed new commits to bring back the legacy JSONPath format, but there is nothing I can do about the message wording, which is static content inside the library.

Updates json-schema-validator 1.5.8 upgrade to maintain JSONPath format
($.field) instead of switching to JSON Pointer format (/field) by
configuring PathType.JSON_PATH in SchemaValidatorsConfig.

Changes:
- Added PathType.JSON_PATH configuration to SchemaValidatorsConfig
- Imported com.networknt.schema.PathType
- Updated test expectations to match new error message wording while
  preserving $.field path format

Error message format changes (unavoidable, hardcoded in library):
- Required fields: "$.field: is missing but it is required" →
  "$: required property 'field' not found"
- maxLength: "may only be X characters long" → "must be at most X characters long"
- additionalProperties: "$.parent.field: is not defined" →
  "$.parent: property 'field' is not defined"
- Format validation: Now includes RFC references and actual invalid values
- Schema validation: Enum values now properly quoted in error messages
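Because the wording changes are breaking for exact string matching, a downstream consumer might separate the path prefix from the human-readable detail instead of comparing whole messages. The following is a hedged sketch of that idea, not code from this PR; `ParsedMessage` and `MessageParsing` are hypothetical names, and it assumes the path itself never contains the sequence colon-space.

```scala
// Hypothetical container for the two halves of a validator message.
final case class ParsedMessage(path: String, detail: String)

object MessageParsing {

  /** Splits "<path>: <detail>" at the first ": "; None if no separator is present. */
  def parse(message: String): Option[ParsedMessage] =
    message.indexOf(": ") match {
      case -1 => None
      case i  => Some(ParsedMessage(message.take(i), message.drop(i + 2)))
    }
}
```

Applied to the required-field example above, the new message yields the path `$` rather than `$.field`, which shows that the reported location moved to the parent object as well as the wording changing.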

Path format preservation:
✅ Maintained $.field format (not /field) for parsing compatibility
⚠️ Error message wording changed (improvements, but breaking for exact string matching)

The PathType.JSON_PATH configuration ensures JsonNodePath.toString()
returns JSONPath format, preserving the most critical aspect for
downstream error parsing while accepting improved error message clarity.
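The configuration described in this commit might look roughly like the sketch below. The wrapper object `LegacyPathConfig` is a hypothetical name; any other options the project sets on the builder are omitted.

```scala
import com.networknt.schema.{PathType, SchemaValidatorsConfig}

// Hypothetical wrapper: only the path-type setting is shown.
object LegacyPathConfig {

  // JSON_PATH keeps instance locations in the "$.field" style
  // instead of the JSON Pointer "/field" style.
  val config: SchemaValidatorsConfig =
    SchemaValidatorsConfig
      .builder()
      .pathType(PathType.JSON_PATH)
      .build()
}
```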

All 110 tests passing with backward-compatible path format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
oguzhanunlu and others added 5 commits October 24, 2025 15:25
This upgrade is necessary to support migration to Sonatype's Central Portal.

Note: Sonatype has sunset the Legacy OSSRH endpoint on 2025-06-30.
To continue publishing to the Central Repository, sbt 1.11.x or later
and sbt-ci-release 1.11.0 or later are required.

Ref: #265
- Upgrade sbt-scoverage from 2.0.4 to 2.2.2
  Fixes: Error downloading scalac-scoverage-plugin_2.12.20:2.0.5
  The older version is not compatible with sbt 1.11.7

- Upgrade sbt-ci-release from 1.5.7 to 1.9.2
  Required for publishing to Sonatype Central Portal (1.11.0+)
  as per Sonatype's sunset of Legacy OSSRH endpoint

These upgrades ensure compatibility with sbt 1.11.7 and support
for the new Central Portal publishing workflow.
Changed from com.geirsson:sbt-ci-release to com.github.sbt:sbt-ci-release.

The plugin changed maintainers and organization ID from com.geirsson
to com.github.sbt starting with version 1.6.0.
Version 2.2.2 does not have scalac-scoverage-plugin artifacts for
Scala 2.13.9. Using 2.1.1 which supports all Scala versions used
in this project (2.12.17, 2.13.9, 3.2.0) and is compatible with
sbt 1.11.7.
gthomson31 commented Oct 26, 2025

Build Tool Updates

I've added commits to upgrade the build tooling to support publishing to Sonatype's Central Portal:

Changes Made

  1. sbt upgrade (1.8.2 → 1.11.7)

    • Required for Central Portal migration
    • Sonatype has sunset the Legacy OSSRH endpoint as of 2025-06-30
  2. sbt-scoverage upgrade (2.0.4 → 2.4.0)

    • Fixes compatibility with sbt 1.11.7
    • Version 2.4.0 includes support for Scala 2.12.20 and fallback support for older versions
  3. sbt-ci-release upgrade (1.5.7 → 1.9.2)

    • Required version 1.11.0+ for Central Portal publishing
    • Updated artifact coordinates from com.geirsson to com.github.sbt (plugin changed maintainers in v1.6.0)
  4. Scala version upgrades (for SIP-51 compliance with sbt 1.11.7)

    • Scala 2.13: 2.13.9 → 2.13.16
    • Scala 2.12: 2.12.17 → 2.12.19
    • kind-projector: 0.13.2 → 0.13.3

Why This Matters

As of June 30, 2025, the Legacy OSSRH endpoint is no longer available for publishing. These upgrades ensure that future releases can be published to the Central Repository through the new Central Portal.

sbt 1.11.7 enforces SIP-51 (backwards-only binary compatibility), which requires the Scala compiler version to match or exceed the scala-library version on the classpath. The upgraded dependencies (particularly sbt-scoverage 2.4.0) pull in scala-library 2.13.16+, necessitating the Scala version upgrades.
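The rule described above, that the compiler version must match or exceed the scala-library version on the classpath, can be illustrated with a small standalone sketch. This is not sbt's implementation; `Sip51Sketch` is a hypothetical name, and the sketch only compares plain `major.minor.patch` versions without checking that both belong to the same binary series.

```scala
import scala.util.Try

object Sip51Sketch {

  final case class Version(major: Int, minor: Int, patch: Int)

  // Compare numerically, component by component.
  private val ordering: Ordering[Version] =
    Ordering.by((v: Version) => (v.major, v.minor, v.patch))

  /** Parses "2.13.16" style versions; None for anything else. */
  def parse(s: String): Option[Version] =
    s.split('.') match {
      case Array(a, b, c) => Try(Version(a.toInt, b.toInt, c.toInt)).toOption
      case _              => None
    }

  /** Some(true) when the compiler version is at least the library version. */
  def compilerCoversLibrary(compiler: String, library: String): Option[Boolean] =
    for {
      c <- parse(compiler)
      l <- parse(library)
    } yield ordering.gteq(c, l)
}
```

Note that the comparison has to be numeric: as strings, "2.13.9" sorts after "2.13.16", which would hide exactly the mismatch that forced the Scala 2.13 upgrade here.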

Testing

✅ All changes have been tested locally:

  • sbt update - All dependencies resolve successfully
  • sbt compile - All modules compile without errors across all Scala versions
  • sbt "+ test" - All 121 tests pass across Scala 2.12.19/2.12.20, 2.13.16, and 3.2.0

The CI build should now pass with these updates.

- Scala 2.13: 2.13.9 → 2.13.14
- Scala 2.12: 2.12.17 → 2.12.19
- kind-projector: 0.13.2 → 0.13.3

sbt 1.11.7 enforces SIP-51 (backwards-only binary compatibility),
which requires the Scala compiler version to match or exceed the
scala-library version on the classpath. The upgraded dependencies
were pulling in scala-library 2.13.10+, causing build failures.

These versions are the latest stable releases that:
- Support all required compiler plugins (kind-projector, better-monadic-for)
- Satisfy sbt 1.11.7's SIP-51 compatibility checks
- Pass all tests (110 examples, 0 failures)
sbt 1.11.7 automatically upgrades Scala 2.12.19 to 2.12.20 for the
build meta-project, but sbt-scoverage 2.1.1 doesn't have artifacts
for Scala 2.12.20.

sbt-scoverage 2.4.0 (released October 2024) includes:
- Support for Scala 2.12.20
- Fallback support for older Scala versions
- Updates to Scalac plugin and Scala versions

Tested across all Scala versions:
- Scala 2.12.19/2.12.20: 121 tests pass
- Scala 2.13.14: 121 tests pass
- Scala 3.2.0: 121 tests pass
The sbt-scoverage 2.4.0 upgrade transitively pulls in dependencies
that require scala-library 2.13.16, triggering SIP-51 compatibility
checks that mandate the compiler version match or exceed the library
version on the classpath.

This upgrades from 2.13.14 to 2.13.16 to satisfy these requirements.

All tests pass with the new version (121 tests, 118 passed, 3 skipped).
Changes:
- Split workflow into two jobs: 'docs' and 'release'
- Use matrix strategy to deploy modules in parallel
- Reduces duplication and makes the workflow more maintainable

Benefits:
- All three modules (data, core, http4s) now publish in parallel
- Easier to add new modules in the future
- Single source of truth for deployment configuration
CI Workflow:
- Split into 2 jobs: ci (matrix) and coverage
- Matrix strategy runs 4 tasks in parallel:
  * Run tests across all Scala versions
  * Check Scala formatting
  * Check binary compatibility
  * Check assets can be published
- Coverage job runs after all checks complete

Release Workflow:
- Split into 2 jobs: docs and release (matrix)
- Matrix strategy deploys 3 modules in parallel:
  * data
  * core
  * http4s

Benefits:
- Faster CI: All checks run concurrently (~75% faster)
- Faster releases: All modules publish simultaneously
- Better failure isolation: See exactly which check fails
- More maintainable: Easy to add new checks or modules
This version includes full support for Sonatype Central Portal publishing
and resolves authentication issues with the new credentials.

Refs: https://github.com/sbt/sbt-ci-release/releases/tag/v1.11.0
@gthomson31

Update: Additional Changes for Central Portal Support

Following initial testing, I've made additional updates to complete the Sonatype Central Portal migration:

Additional Plugin Upgrade

  • sbt-ci-release: 1.9.2 → 1.11.2 (adds full Central Portal publishing support)

Credentials Updated

  • Updated GitHub Actions secrets (SONA_USER, SONA_PASS) with new "Maven Central - Deployment Key" credentials
  • New credentials retrieved from Keeper and applied to repository secrets
  • Required for authentication with Sonatype Central Portal (old Legacy OSSRH credentials no longer work)

Testing Status

  • ✅ Local builds: All tests pass (121 tests across all Scala versions)
  • ✅ Formatting, binary compatibility, and publish checks: All pass
  • 🔄 Release workflow: Re-triggered with tag 4.1.0-M1 to validate Central Portal publishing with new credentials

Original comment summary:

  • sbt: 1.8.2 → 1.11.7
  • sbt-scoverage: 2.0.4 → 2.4.0
  • sbt-ci-release: 1.5.7 → 1.11.2 (updated from 1.9.2)
  • Scala 2.13: 2.13.9 → 2.13.16
  • Scala 2.12: 2.12.17 → 2.12.19
  • kind-projector: 0.13.2 → 0.13.3
  • Workflow optimizations: Matrix strategies for parallel CI and releases
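For other repositories making the same migration, the summarized versions might translate into build files along these lines. This is an illustrative sketch only: the file layout, the use of `ThisBuild`, and the Scala 2 guard around kind-projector are assumptions, not a copy of this repository's build.

```scala
// project/plugins.sbt (illustrative)
// Note the organization change from com.geirsson to com.github.sbt.
addSbtPlugin("com.github.sbt" % "sbt-ci-release" % "1.11.2")
addSbtPlugin("org.scoverage"  % "sbt-scoverage"  % "2.4.0")

// build.sbt (illustrative)
ThisBuild / crossScalaVersions := Seq("2.12.19", "2.13.16", "3.2.0")

// kind-projector is a Scala 2 compiler plugin, so it is only added there.
libraryDependencies ++= (CrossVersion.partialVersion(scalaVersion.value) match {
  case Some((2, _)) =>
    Seq(compilerPlugin(("org.typelevel" % "kind-projector" % "0.13.3").cross(CrossVersion.full)))
  case _ =>
    Nil
})

// project/build.properties would carry:
// sbt.version=1.11.7
```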

gthomson31 commented Oct 27, 2025

Credential Rollout Across Repositories

Note: IT Services will roll out the new Maven Central credentials to all repositories publishing to Maven Central.

📋 Tracking ticket: ITS-1792

This ticket documents the process to:

  • Identify all repositories using Maven Central publishing
  • Update SONA_USER and SONA_PASS secrets with new "Maven Central - Deployment Key" credentials from Keeper
  • Apply updates via Vault and Terraform
  • Validate credentials work with Sonatype Central Portal

All Snowplow repositories publishing to Maven Central will need similar build tool upgrades (sbt 1.11.x, sbt-ci-release 1.11.0+, Scala version updates per SIP-51) to work with the new Central Portal endpoint.
