|
15 | 15 |
|
16 | 16 | - `dependency`: These tests are a set of very minimal end-to-end integration tests that ensure basic functionality works to upsert and query vectors from an index. These are rarely run locally; we use them in CI to confirm the client can be used when installed with a large matrix of different python versions and versions of key dependencies. See [`.github/workflows/testing-dependency.yaml`](https://github.com/pinecone-io/pinecone-python-client/blob/main/.github/workflows/testing-dependency.yaml) for more details on how these are run. |
17 | 17 |
|
18 | | -- `integration`: These are a large suite of end-to-end integration tests exercising most of the core functions of the product. They are slow and expensive to run, but they give the greatest confidence the SDK actually works end-to-end. See notes below on how to setup the required configuration and run individual tests if you are iterating on a bug or feature and want to get more rapid feedback than running the entire suite in CI will give you. In CI, these are run using [`.github/workflows/testing-dependency.yaml`](https://github.com/pinecone-io/pinecone-python-client/blob/main/.github/workflows/testing-integration.yaml). |
| 18 | +- `integration`: This is a large suite of end-to-end integration tests exercising most of the core functions of the product. They are slow and expensive to run, but they give the greatest confidence the SDK actually works end-to-end. See notes below on how to set up the required configuration and run individual tests if you are iterating on a bug or feature and want to get more rapid feedback than running the entire suite in CI will give you. In CI, these are run using [`.github/workflows/testing-integration.yaml`](https://github.com/pinecone-io/pinecone-python-client/blob/main/.github/workflows/testing-integration.yaml). |
19 | 19 |
|
20 | 20 | - `integration-manual`: These are integration tests that are not run automatically in CI but can be run manually when needed. These typically include tests for features that are expensive to run (like backups and restores), tests that require special setup (like proxy configuration), or tests that exercise edge cases that don't need to be validated on every PR. To run these manually, use: `poetry run pytest tests/integration-manual` |
21 | 21 |
|
@@ -76,6 +76,47 @@ If I see one or a few tests broken in CI, I will run just those tests locally wh |
76 | 76 | - Run the tests in a single file: `poetry run pytest tests/integration/db/control/sync/resources/index/test_create.py` |
77 | 77 | - Run a single test: `poetry run pytest tests/integration/db/control/sync/resources/index/test_list.py::TestListIndexes::test_list_indexes_includes_ready_indexes` |
78 | 78 |
|
| 79 | +### Test Sharding |
| 80 | + |
| 81 | +To speed up CI runs, we use a custom pytest plugin to shard (split) tests across multiple parallel jobs, so each CI worker runs only a portion of the suite and the overall run finishes sooner. |
| 82 | + |
| 83 | +The sharding plugin is automatically available when running pytest (registered in `tests/conftest.py`). To use it: |
| 84 | + |
| 85 | +**Command-line options:** |
| 86 | +```sh |
| 87 | +# Run shard 1 of 3 |
| 88 | +poetry run pytest tests/integration/rest_sync --splits=3 --group=1 |
| 89 | + |
| 90 | +# Run shard 2 of 3 |
| 91 | +poetry run pytest tests/integration/rest_sync --splits=3 --group=2 |
| 92 | + |
| 93 | +# Run shard 3 of 3 |
| 94 | +poetry run pytest tests/integration/rest_sync --splits=3 --group=3 |
| 95 | +``` |
| 96 | + |
| 97 | +**Environment variables (alternative to command-line options):** |
| 98 | +```sh |
| 99 | +# Set environment variables instead of using --splits and --group |
| 100 | +export PYTEST_SPLITS=3 |
| 101 | +export PYTEST_GROUP=1 |
| 102 | +poetry run pytest tests/integration/rest_sync |
| 103 | +``` |
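
As a rough illustration of how the two configuration routes above might be reconciled, here is a minimal sketch. It is not the plugin's actual code: the function name `resolve_shard` is hypothetical, and the choice that command-line values take precedence over `PYTEST_SPLITS` / `PYTEST_GROUP` is an assumption.

```python
import os
from typing import Mapping, Optional, Tuple


def _int_from_env(env: Mapping[str, str], name: str) -> Optional[int]:
    """Read an integer setting from the environment, or None if unset or empty."""
    raw = env.get(name)
    return int(raw) if raw else None


def resolve_shard(
    cli_splits: Optional[int],
    cli_group: Optional[int],
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[int, int]]:
    """Return (splits, group), or None when sharding is not requested.

    Assumption: a command-line value wins over the matching environment variable.
    """
    splits = cli_splits if cli_splits is not None else _int_from_env(env, "PYTEST_SPLITS")
    group = cli_group if cli_group is not None else _int_from_env(env, "PYTEST_GROUP")

    if splits is None and group is None:
        return None  # neither option given: run the whole suite
    if splits is None or group is None:
        raise ValueError("splits and group must be provided together")
    if splits < 1 or not 1 <= group <= splits:
        # group is 1-indexed, so valid values are 1..splits
        raise ValueError(f"group must be between 1 and {splits}, got {group}")
    return splits, group
```

Passing the environment as a parameter keeps the function easy to exercise in unit tests without mutating `os.environ`.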
| 104 | + |
| 105 | +**How it works:** |
| 106 | +- Tests are distributed across shards using a hash-based algorithm, ensuring deterministic assignment (the same test will always be in the same shard) |
| 107 | +- Tests are distributed roughly evenly across all shards |
| 108 | +- The `--group` parameter is 1-indexed (first shard is 1, not 0) |
| 109 | +- All shards must be run to execute the complete test suite |
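
The properties listed above can be sketched in a few lines of Python. This is an illustrative scheme under stated assumptions, not the plugin's implementation: the helper names `shard_for` and `select_shard` are hypothetical, and SHA-256 modulo the shard count is just one way to get a deterministic assignment. A stable digest is used rather than the built-in `hash()`, because string hashing in Python is randomized per process unless `PYTHONHASHSEED` is fixed.

```python
import hashlib
from typing import Iterable, List


def shard_for(nodeid: str, splits: int) -> int:
    """Map a test ID to a 1-indexed shard number (assumed scheme)."""
    digest = hashlib.sha256(nodeid.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer; the same ID always yields the same shard.
    return int.from_bytes(digest[:8], "big") % splits + 1


def select_shard(nodeids: Iterable[str], splits: int, group: int) -> List[str]:
    """Keep only the test IDs that belong to the requested shard."""
    if not 1 <= group <= splits:
        raise ValueError(f"group must be between 1 and {splits}, got {group}")
    return [nodeid for nodeid in nodeids if shard_for(nodeid, splits) == group]
```

With a scheme like this, every test lands in exactly one shard, so running groups 1 through `splits` covers the full suite with no overlap. Individual shard sizes may differ somewhat.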
| 110 | + |
| 111 | +**In CI:** |
| 112 | +The CI workflow (`.github/workflows/testing-integration.yaml`) uses sharding automatically: each parallel job runs a different shard, so the suite completes faster. Shard counts vary by suite size: |
| 113 | +- `rest_sync` tests: 8 shards |
| 114 | +- `rest_asyncio` tests: 5 shards |
| 115 | +- `grpc` tests: No sharding (runs all tests in a single job, including `tests/integration/rest_sync/db/data` with `USE_GRPC='true'`) |
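
To reproduce the CI layout locally, the shard counts above can be expanded into one command per job. The sketch below is only an illustration: the helper `shard_commands` is hypothetical, and the path `tests/integration/rest_asyncio` is assumed by analogy with `tests/integration/rest_sync`. The `grpc` suite is left out because it is not sharded.

```python
from typing import Dict, List

# Shard counts taken from the list above.
SHARD_COUNTS: Dict[str, int] = {"rest_sync": 8, "rest_asyncio": 5}


def shard_commands(shard_counts: Dict[str, int] = SHARD_COUNTS) -> List[str]:
    """Build one pytest invocation per (suite, shard) pair."""
    commands = []
    for suite, splits in shard_counts.items():
        # --group is 1-indexed, so iterate from 1 through splits inclusive.
        for group in range(1, splits + 1):
            commands.append(
                f"poetry run pytest tests/integration/{suite} "
                f"--splits={splits} --group={group}"
            )
    return commands


if __name__ == "__main__":
    for command in shard_commands():
        print(command)
```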
| 116 | + |
| 117 | +**Local development:** |
| 118 | +When running tests locally, you typically don't need to use sharding unless you want to simulate the CI environment or test the sharding functionality itself. |
| 119 | + |
79 | 120 | ### Fixtures and other test configuration |
80 | 121 |
|
81 | 122 | Many values are read from environment variables (from `.env`) or set in CI workflows such as `.github/workflows/testing-integration.yaml`. |
|