A tool to efficiently migrate from CockroachDB to Spanner

storj/spanner-migration

CockroachDB to Spanner Migration Tool

Note: This tool was developed for Storj Labs' internal migration needs and is being open-sourced in case it benefits others. It successfully migrated eight large-scale databases in late 2024 and early 2025, but it still needs polishing, generalization, additional testing, and documentation before it is suitable for general use. We are sharing it in its current state as a reference or starting point for other migration projects. Because our migration is complete, we don't anticipate significant ongoing development.

A specialized tool developed by Storj Labs for migrating databases from CockroachDB to Google Cloud Spanner with a focus on reliability and data integrity.

Features

  • Data streaming: Captures data changes from CockroachDB using changefeeds
  • Reliable transfer: Processes changes through Google PubSub
  • Parallel processing: Configurable worker pools for efficient batch processing
  • Validation: Tools to verify migration correctness and compare data between sources
  • Metrics: Collection and monitoring of migration progress
  • Graceful shutdown: Support for safely stopping and resuming migrations
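
As an illustration of the stop-and-resume idea, here is a minimal Python sketch. It is not taken from the tool: `run_until_stopped`, the `handle` callback, and the index-based resume point are all hypothetical, and the real tool may track progress differently (for example through PubSub acknowledgements).

```python
import threading
from typing import Callable, Sequence


def run_until_stopped(
    batches: Sequence,
    handle: Callable[[object], None],
    stop: threading.Event,
    start: int = 0,
) -> int:
    """Process batches from index `start`; return the index to resume from.

    The stop flag is only checked between batches, so a batch that has
    started is always finished before the loop exits.
    """
    index = start
    while index < len(batches) and not stop.is_set():
        handle(batches[index])
        index += 1
    return index
```

Calling the function again with `start` set to the returned index continues where the previous run left off.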

Architecture

The migration process works through these main components:

  1. Changefeed: Manages CockroachDB changefeeds to capture data changes
  2. PubSub: Routes change events through Google PubSub
  3. Worker Pool: Processes events in batches
  4. Spanner Writer: Persists changes to the target Spanner database
  5. Validation: Compares source and target data for integrity
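
The flow through components 2 to 4 can be pictured as a batch-and-dispatch loop. The Python sketch below is only an illustration under simplifying assumptions, not the tool's implementation: `batched`, `write_batch`, and `process` are hypothetical names, an iterable of dictionaries stands in for PubSub messages, and a plain dictionary stands in for the Spanner database.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Iterable, Iterator


def batched(events: Iterable, batch_size: int) -> Iterator[list]:
    """Yield consecutive lists of at most `batch_size` events."""
    it = iter(events)
    while batch := list(islice(it, batch_size)):
        yield batch


def write_batch(batch: list, target: dict) -> int:
    """Hypothetical stand-in for the Spanner writer: upsert rows by key."""
    for event in batch:
        target[event["key"]] = event["value"]
    return len(batch)


def process(events: Iterable, target: dict, batch_size: int, workers: int) -> int:
    """Dispatch batches to a worker pool; return the number of events written."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda b: write_batch(b, target), batched(events, batch_size))
        return sum(counts)
```

This sketch ignores ordering: if two batches carry updates for the same key, concurrent workers could apply them out of order, which a real pipeline would have to handle.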

Prerequisites

  • Access to source CockroachDB database
  • Google Cloud project with Spanner instance
  • Google Cloud service account with appropriate permissions
  • Google Cloud PubSub setup

Configuration

Create a YAML configuration file (default: migration.yaml) or set the environment variable STORJ_MIGRATION_CONFIG:

cockroach: postgresql://USER:PASSWORD@your-cockroach-host:26257/{{DB}}?sslmode=verify-full&sslrootcert=/path/to/ca.crt
credential: /path/to/service-account.json
spanner: projects/your-project/instances/your-instance/databases/{{DB}}
project: your-gcp-project
topic: prefix-{{TABLE}}
prefix: your-prefix
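
This README does not explain the `{{DB}}` and `{{TABLE}}` placeholders; presumably they are replaced with the database and table being migrated. A minimal Python sketch of that kind of substitution, under that assumption (the `expand` helper and the value `example_db` are hypothetical and not part of the tool):

```python
def expand(template: str, **values: str) -> str:
    """Replace each {{NAME}} placeholder with the matching keyword value."""
    for name, value in values.items():
        template = template.replace("{{" + name + "}}", value)
    return template


# Hypothetical values, chosen only for illustration.
spanner = expand(
    "projects/your-project/instances/your-instance/databases/{{DB}}",
    DB="example_db",
)
topic = expand("prefix-{{TABLE}}", TABLE="your_table")
print(spanner)
print(topic)
```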

Usage

Process changefeeds

# Process changefeeds from PubSub to Spanner for a specific table
./spanner-migration process --table your_table \
  --batch-size 2500 \
  --workers 24

Manage changefeeds

# Create a changefeed
./spanner-migration changefeed create --table your_table

# List running changefeeds
./spanner-migration changefeed list

# Cancel a changefeed
./spanner-migration changefeed cancel --id your_changefeed_id

Validate data

# Compare data between CockroachDB and Spanner
./spanner-migration validation compare --table your_table
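
Conceptually, a comparison of this kind checks that every row exists on both sides with the same values. The Python sketch below illustrates the idea with in-memory dictionaries keyed by primary key. It is not the tool's algorithm; `Diff`, `compare_rows`, and the field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Diff:
    missing_in_target: list = field(default_factory=list)
    extra_in_target: list = field(default_factory=list)
    mismatched: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        """True when source and target agree on every key and value."""
        return not (self.missing_in_target or self.extra_in_target or self.mismatched)


def compare_rows(source: dict, target: dict) -> Diff:
    """Classify every primary key seen on either side."""
    diff = Diff()
    for key in sorted(source.keys() | target.keys()):
        if key not in target:
            diff.missing_in_target.append(key)
        elif key not in source:
            diff.extra_in_target.append(key)
        elif source[key] != target[key]:
            diff.mismatched.append(key)
    return diff
```

For large tables, holding both sides in memory is unrealistic; a practical variant would stream both sides in primary-key order and compare them incrementally.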

Available Commands

  • process: Process changefeeds from PubSub to Spanner
  • persist: Persist a single changefeed file
  • metrics: Monitor migration progress
  • pubsub: Manage PubSub topics and subscriptions
  • spanner: Manage Spanner database operations
  • changefeed: Create and manage CockroachDB changefeeds
  • validation/validate: Compare and validate data integrity
  • table-list: List supported tables
  • cockroach: Perform CockroachDB operations

Performance Tuning

The migration tool has several parameters that can be tuned:

  • --batch-size: Number of records to batch together (default: 2500)
  • --workers: Number of parallel workers (default: 24)
  • --commit-delay: Delay between commits (default: 300ms)
  • --commit-timeout: Maximum time to wait for a commit (default: 60s)
  • --max-retries: Maximum number of retry attempts (default: 10)
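
To make the effect of a retry limit concrete, here is a minimal Python sketch of a bounded retry loop. It rests on assumptions: `commit_with_retries` is a hypothetical helper, `max_retries` is taken to count retries after the first attempt (the tool may count differently), and the pause between attempts is a choice of this sketch, not necessarily what `--commit-delay` controls.

```python
import time
from typing import Callable


def commit_with_retries(
    commit: Callable[[], None],
    max_retries: int = 10,
    pause_seconds: float = 0.3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `commit` until it succeeds; return the number of attempts used.

    At most max_retries + 1 attempts are made. The last failure is re-raised.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    attempt = 0
    while True:
        attempt += 1
        try:
            commit()
            return attempt
        except Exception:
            if attempt > max_retries:
                raise  # retries exhausted: surface the last error
            sleep(pause_seconds)
```

Passing `sleep` as a parameter lets the loop be exercised in tests without actually waiting.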

License

Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

Copyright

Copyright (C) 2024 Storj Labs, Inc.
