[rule enhancement] partition rules: add a tie-breaker — for read-heavy workloads, query-pattern alignment beats raw cardinality (a bare `/id` is not automatically the best key)

**Type:** enhancement to existing rules (`partition-high-cardinality`, `partition-query-patterns`) — **conflicting guidance with no precedence**
**Category:** Partitioning (`partition-`)
**Severity:** Medium (leads to a defensible-but-suboptimal key; turns the dominant query into a cross-partition scan; higher RU/latency at scale)
**Affected:** data modeling for any SQL→NoSQL migration or new container design, all SDKs
**Doc reference:** [Partitioning overview](https://learn.microsoft.com/azure/cosmos-db/partitioning-overview) · [Choose a partition key](https://learn.microsoft.com/azure/cosmos-db/partitioning-overview#choose-a-partition-key)

---

## Summary

Two CRITICAL rules in the kit point in different directions for a common case, and there is
no guidance on which wins:

- **`partition-high-cardinality`** — "Select partition keys with many unique values." Its
  **✅ GOOD** examples are `CustomerId`, `TenantId`, `DeviceId` — i.e. a bare per-entity id.
  Read literally, a document's own `/id` is the *maximum*-cardinality choice and looks ideal.
- **`partition-query-patterns`** — "Choose a partition key that supports your most frequent
  queries." Its anti-pattern is a product partitioned by one field while most queries filter
  by another.

For a **read-heavy, rarely-written** dataset that is almost always **filtered by a field**
(a product catalog filtered by category/brand; an orders-by-customer read model; a
content library filtered by type), these two rules disagree:

- High-cardinality says: use the unique id (perfect distribution).
- Query-patterns says: use the field you filter on (single-partition reads).

An agent following the high-cardinality rule will choose `/id`, partition every document into
its own logical partition, and turn every `WHERE category = @c` / `WHERE brand = @b` query
into a **fan-out cross-partition scan** — the exact thing query-patterns warns against. The
choice "looks" correct and even cites the right principle ("even distribution, efficient
single-item reads"), so it passes review.

The missing piece is a **precedence/tie-breaker**: when write volume is low and reads are
dominated by a filter, **query alignment should win over raw cardinality.** A bare `/id` is
the right key primarily when the dominant access pattern is a **point read by that id**, or
when write throughput is so high that write distribution is the binding constraint.
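
To make the fan-out concrete, here is a toy model (a sketch only, not the Cosmos DB routing algorithm) that counts how many distinct partition-key values hold the items matching a category filter. The `Product` record, the `PartitionFanOut` helper, and the 30-item sample catalog are all hypothetical names and data introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical catalog item; field names are illustrative only.
public record Product(string Id, string Category);

public static class PartitionFanOut
{
    // Counts the distinct logical partitions (distinct partition-key values)
    // that hold at least one item matching the category filter.
    public static int PartitionsTouched(
        IEnumerable<Product> items,
        Func<Product, string> partitionKey,
        string category) =>
        items.Where(p => p.Category == category)
             .Select(partitionKey)
             .Distinct()
             .Count();

    // Made-up sample data: 30 items spread evenly over three categories.
    public static List<Product> SampleCatalog() =>
        Enumerable.Range(1, 30)
                  .Select(i => new Product(
                      $"p{i}",
                      (i % 3) switch
                      {
                          0 => "Footwear",
                          1 => "Apparel",
                          _ => "Accessories"
                      }))
                  .ToList();
}
```

With this sample, `PartitionsTouched(catalog, p => p.Id, "Footwear")` is 10 and `PartitionsTouched(catalog, p => p.Category, "Footwear")` is 1. The model counts logical partitions only; the real cost of a cross-partition query depends on how many physical partitions the container has, which is why the problem surfaces as the container grows.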

## Scope (important — `/id` is *not* always wrong)

The Microsoft docs are explicit that `/id` is a **great** partition key for two cases, and
this issue is **not** asking to contradict that:

> "For small read-heavy containers or write-heavy containers of any size, the *item ID*
> (`/id`) is naturally a great choice for the partition key."
> — [Partitioning and horizontal scaling › Use item ID as the partition key](https://learn.microsoft.com/azure/cosmos-db/partitioning#use-item-id-as-the-partition-key)

The same page adds the caveat that pins down exactly where `/id` stops being a good fit:

> "If you have a read-heavy container with **many physical partitions**, queries are more
> efficient if they have an equality filter with the *item ID*."

So the gap is narrow and specific: a **read-heavy container that grows past one physical
partition and is filtered by a non-`id` field**. There, `/id` turns the dominant query into a
cross-partition fan-out, and the kit's two CRITICAL rules offer no guidance on which one
wins. The ask is a **tie-breaker for that case**, not a blanket "avoid `/id`."

## Benchmark evidence — reproduced end-to-end with the kit loaded

This is a real agent run (not a synthetic test): the kit was loaded and *read*, the agent
understood the access pattern, and still chose `/id`.

The eShop Catalog SQL→Cosmos migration task was run with `claude-opus-4.7` and the
**cosmosdb-best-practices kit installed** (compiled `AGENTS.md` baked into the working dir;
load verified — hook install lines present, `AGENTS.md` pulled into the session 8×, Azure MCP
connected). The run passed **13 of 14** independent checks; the **only** failure was
`partition_key_grouping`.

The agent's own header comment in the generated `Program.cs` (verbatim) shows it had already
worked out the dominant access pattern — it indexed exactly the filter fields — and *still*
partitioned by `/id`:

```csharp
//   * Container "items":   one document per product, partition key /id.
// Indexing on "items":
//   * Include /name, /catalogTypeId, /catalogBrandId (the filter paths)
```

```csharp
var itemsContainerProps = new ContainerProperties("items", "/id");   // the graded miss
```

So with the kit present, the agent recognized that reads filter by type/brand (it built a
composite index and included `/catalogTypeId` + `/catalogBrandId` as "the filter paths"), yet
chose the per-item `/id` partition key — turning the dominant filtered query into a
cross-partition fan-out. The decision "looks" principled and cites high cardinality / even
distribution, exactly as predicted above.

### Whole-kit rule audit (why a faithful agent lands on `/id`)

Auditing the installed kit (119 rules) for the catalog partition decision: **four rules point
at `/id`, only one ambiguous rule points away, and nothing ranks them.**

| Rule | Impact | Stance on a per-item `/id` key |
|---|---|---|
| `partition-high-cardinality` | **CRITICAL** | **Blesses it** — "thousands to millions of unique values… distribute writes evenly." A per-item `/id` is the *maximum*; no carve-out warns that per-item granularity fragments reads. |
| `partition-key-length` | — | **Endorses it** — "Prefer short GUIDs, **IDs**, or codes … for partition keys." |
| `partition-immutable-key` | — | **Satisfied** — `id` never changes. |
| `partition-avoid-hotspots` | — | **Satisfied** — per-item ⇒ zero hot partitions. |
| `partition-query-patterns` | **CRITICAL** | The lone counter — but its anti-pattern partitions by `Category` (never shows `/id` as wrong), all "correct" examples use an obvious **parent entity** (Seller/Customer/Conversation), and it explicitly permits "for less common queries, accept cross-partition." |

A whole-kit search found **zero** rules that warn against a per-item `/id` partition key and
**zero** rules that set precedence when high cardinality conflicts with query alignment. With
four rules favoring `/id`, one ambiguous counter-rule, and no tie-breaker, an agent optimizing
the stated principles chooses `/id` and passes its own review. This is the precise gap the
precedence note below closes.

## Verified against the live SDK + emulator

The same outcome reproduces deterministically at the SDK level with `Microsoft.Azure.Cosmos`
3.46.1 against the Cosmos DB Linux (vNext) emulator. Two containers, identical 30-item data
set, only the partition key differs:

```
truth = 10 items WHERE c.category = 'Footwear'

/category container, query scoped to PartitionKey('Footwear')  -> 10 items  (single-partition, correct)
/id       container, query scoped to PartitionKey('Footwear')  ->  0 items  (cannot be served from one partition)
/id       container, cross-partition (no PartitionKey)          -> 10 items  (correct ONLY when fanned out)
```

The `/id` container can return the correct result for a category filter *only* by fanning out
across partitions — there is no single logical partition that holds "all Footwear," because
the partition key is the per-item id. That is the cross-partition scan the `partition-query-patterns`
rule warns about, reached by following `partition-high-cardinality` to the letter.
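
The three cases in the listing differ only in whether the query is pinned to a partition key. A minimal sketch of how such a query might be issued with the .NET SDK follows; it is not the script used for the run above. The `CatalogItem` class and the `CategoryQuery.RunAsync` helper are hypothetical, and the code needs a reachable account or emulator to execute.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical document shape; property names mirror the JSON fields.
public class CatalogItem
{
    public string id { get; set; } = "";
    public string category { get; set; } = "";
}

public static class CategoryQuery
{
    // Runs the category filter and returns the item count plus total RU charge.
    // A non-null `scope` pins the query to one logical partition;
    // a null `scope` lets the SDK fan the query out across partitions.
    public static async Task<(int Count, double RequestCharge)> RunAsync(
        Container container, string category, PartitionKey? scope)
    {
        QueryDefinition query = new QueryDefinition(
                "SELECT * FROM c WHERE c.category = @category")
            .WithParameter("@category", category);

        var options = new QueryRequestOptions { PartitionKey = scope };

        int count = 0;
        double charge = 0;
        using FeedIterator<CatalogItem> iterator =
            container.GetItemQueryIterator<CatalogItem>(query, requestOptions: options);
        while (iterator.HasMoreResults)
        {
            FeedResponse<CatalogItem> page = await iterator.ReadNextAsync();
            count += page.Count;
            charge += page.RequestCharge;
        }
        return (count, charge);
    }
}
```

Calling `RunAsync(container, "Footwear", new PartitionKey("Footwear"))` corresponds to the scoped rows of the listing, and `RunAsync(container, "Footwear", null)` to the cross-partition row. Returning the request charge alongside the count makes it possible to compare RU cost between the two containers on a larger data set.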

## Concrete example (read-heavy catalog)

```csharp
// Access patterns:
//   ~85% : "list products in category X" / "list products for brand Y"   (filtered reads)
//   ~10% : "get product by id"                                            (point read)
//   ~5%  : writes (occasional catalog edits)

// ❌ High-cardinality choice: /id  — perfect distribution, but every filtered list is a
//    cross-partition scan (the 85% case is now the slow, RU-expensive path).
new ContainerProperties("items", "/id");

// ✅ Query-aligned choice: /category (or /brandId) — the 85% filtered reads become
//    single-partition queries; the 10% point read still works, but the caller must supply
//    the category value together with the id.
new ContainerProperties("items", "/category");
```

## Recommended guidance to add

Add a short precedence note to **both** rules (cross-linked):

> **Cardinality vs. query alignment.** High cardinality matters most when **write
> distribution** is the binding constraint (write-heavy, high-throughput). For **read-heavy**
> workloads dominated by a **filter on one field**, prefer the **field you filter on** as the
> partition key even though its cardinality is lower than `/id` — single-partition reads beat
> a perfectly even write spread you don't need. Choose a bare `/id` when the dominant access
> is a **point read by that id**, or when write throughput genuinely requires maximal spread.
> If the filter field's cardinality is too low (hot-partition risk), use a **synthetic** or
> **hierarchical** key that leads with the filter field (e.g. `/category` then `/id`).

And soften the `partition-high-cardinality` **GOOD** examples so a unique id is shown as
*one* good option **conditioned on access pattern**, not as unconditionally ideal.
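
The hierarchical fallback mentioned in the note could look roughly like the sketch below. It assumes an SDK version and account (or emulator) that support hierarchical partition keys; the `CatalogContainers` helper and its method names are hypothetical.

```csharp
using System.Collections.Generic;
using Microsoft.Azure.Cosmos;

public static class CatalogContainers
{
    // Container definition whose partition key leads with the filter field
    // and uses the item id as the second level.
    public static ContainerProperties HierarchicalItems() =>
        new ContainerProperties("items", new List<string> { "/category", "/id" });

    // Full key (both levels) for point reads and writes.
    public static PartitionKey FullKey(string category, string id) =>
        new PartitionKeyBuilder().Add(category).Add(id).Build();

    // Prefix key (first level only) for "all items in a category" queries.
    public static PartitionKey CategoryPrefix(string category) =>
        new PartitionKeyBuilder().Add(category).Build();
}
```

A category query would then set `QueryRequestOptions.PartitionKey` to `CategoryPrefix("Footwear")`, and a point read would pass `FullKey(category, id)` to `ReadItemAsync`. With a prefix key the query is routed to the subset of physical partitions that hold that category rather than to a single logical partition, so this trades some read locality for protection against a hot category.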

## Related: SQL → NoSQL migration note

This bites hardest during relational-to-Cosmos migrations. The instinct is to carry the
table's integer **primary key** over as the document `id` and partition by it. Worth a
one-line callout (here or in a `model-` rule):

> When migrating a relational table to Cosmos DB, **partition by the dominant query
> dimension** (the column you filter on most), **not** the surrogate primary key carried over
> as `id`.

## Suggested placement

Precedence note added to **`partition-high-cardinality.md`** and **`partition-query-patterns.md`**
(both impact: CRITICAL); optional migration one-liner in a `model-` rule or
`partition-synthetic-keys`.
