feat: add rule for client-side encryption of sensitive fields #183
ramnnn2006 wants to merge 2 commits into
Conversation
avinashkamat48
left a comment
The new client-side encryption rule is added under skills/cosmosdb-best-practices/rules, but this PR does not update the generated AGENTS.md/packaged guidance that the skill appears to serve from. Other rule PRs in this repo usually add the rule content to AGENTS.md as well, so as-is the rule can exist on disk while the agent never actually sees or applies it. Could you regenerate or update AGENTS.md so this guidance is included in the runnable skill output?
makes sense, the rule was on disk but compile.js skips the security- prefix so it never reached AGENTS.md. added security as a section in compile.js and regenerated. heads up that this also brings in the 5 existing security rules (managed identity, rbac, network restrict, disable local auth, continuous backup) that were sitting unbuilt for the same reason, so the AGENTS.md diff is bigger than just my rule. can take those out if you'd rather keep this PR to the one rule, but figured the skill should actually serve all of them.
TheovanKraay
left a comment
Nice rules!
The client-encryption rule only has .NET examples, but Always Encrypted also supports Java (com.azure:azure-cosmos-encryption) and JavaScript. Please add at least a Java example alongside the .NET one since Java is a primary SDK for Cosmos DB workloads. Docs: https://learn.microsoft.com/azure/cosmos-db/how-to-always-encrypted
More importantly: rules 12.2–12.6 (continuous backup, disable keys, managed identity, network access, RBAC) exist only in AGENTS.md, which is a generated file. There are no corresponding source files in rules/. The next time anyone runs npm run build, those 5 rules will be overwritten and lost. Needless to say, that's a blocking issue. Please create individual rule files (security-continuous-backup.md, security-disable-keys.md, etc.) and regenerate AGENTS.md from them. Our convention is one rule per PR, but I'm fine with keeping them together here since they form a cohesive security category.
thanks for the review. added a java example next to the .net one for the key vault setup, key creation and the encryption policy, using the com.azure:azure-cosmos-encryption package, pulled from the always encrypted docs you linked. on the security rules being only in AGENTS.md: the source files actually already exist in rules/ (security-managed-identity.md, security-rbac-least-privilege.md, security-network-restrict.md, security-disable-local-auth.md, security-continuous-backup.md). they predate this PR so they don't show in the diff. the real issue was the build wasn't emitting them. main's compile.js reads section definitions from _sections.md frontmatter and falls back to a default that stops at vector, and _sections.md only had prose with no frontmatter array, so security and full-text both had rule files but never compiled. i added the sections array to _sections.md matching the existing prose, so all 13 sections now build from source and nothing gets lost on the next npm run build. heads up this also brings full-text search (12) into AGENTS.md for the same reason, can split that out if you'd rather keep this to security. also rebased onto main to clear the compile.js conflict.
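To make the failure mode concrete, here is a minimal sketch of a frontmatter-driven section loader with a fallback, roughly the behavior described above. It is not the actual scripts/compile.js: the frontmatter shape (`sections:` with `prefix` and `title` keys), the `DEFAULT_SECTIONS` contents, and the `parseSections` helper are all assumptions for illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical fallback list. Per the comment above, the real default stops at vector.
const DEFAULT_SECTIONS = [
  { prefix: 'model', title: 'Data Modeling' },
  { prefix: 'vector', title: 'Vector Search' },
];

// Parse a minimal `sections:` array out of YAML-style frontmatter.
// Assumed shape (not confirmed by the PR):
//   ---
//   sections:
//     - prefix: security
//       title: Security
//   ---
function parseSections(markdown) {
  const match = markdown.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!match) return null; // prose only, no frontmatter block
  const sections = [];
  let current = null;
  for (const line of match[1].split(/\r?\n/)) {
    const start = line.match(/^\s*-\s+(\w+):\s*(.+)$/); // "- key: value" opens a new item
    const cont = line.match(/^\s+(\w+):\s*(.+)$/); // "key: value" continues the current item
    if (start) {
      current = { [start[1]]: start[2].trim() };
      sections.push(current);
    } else if (cont && current) {
      current[cont[1]] = cont[2].trim();
    }
  }
  return sections.length > 0 ? sections : null;
}

// Use the frontmatter sections when present, otherwise fall back to the default list.
function loadSections(rulesDir) {
  const file = path.join(rulesDir, '_sections.md');
  if (!fs.existsSync(file)) return DEFAULT_SECTIONS;
  return parseSections(fs.readFileSync(file, 'utf8')) || DEFAULT_SECTIONS;
}
```

With a prose-only _sections.md, `parseSections` returns `null` and the loader silently uses the default list, which is how rule files with a prefix outside that list can exist on disk and never compile.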
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds new Cosmos DB best-practice rules (security + LangChain.js guidance), updates docs/site content, and extends repository tooling to support multiple skills and releases.
Changes:
- Added new rule markdown files (Always Encrypted client-side encryption; LangChain.js vector store setup, search types, semantic cache, managed identity, filter injection, embeddings, chat history).
- Generalized build/validate scripts to operate on one or all skills; added a version bump helper script.
- Updated docs/site + introduced a tag-based GitHub release workflow; bumped manifests to 1.1.0.
Reviewed changes
Copilot reviewed 35 out of 38 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| skills/cosmosdb-best-practices/rules/security-client-encryption.md | New security rule documenting Always Encrypted client-side encryption. |
| skills/cosmosdb-best-practices/rules/sdk-langchain-js-vectorstore-init.md | New rule for correct vector store initialization patterns in JS/TS. |
| skills/cosmosdb-best-practices/rules/sdk-langchain-js-semantic-cache.md | New rule describing semantic cache usage for cost/latency reduction. |
| skills/cosmosdb-best-practices/rules/sdk-langchain-js-search-types.md | New rule explaining searchType choices and prerequisites. |
| skills/cosmosdb-best-practices/rules/sdk-langchain-js-managed-identity.md | New rule recommending managed identity over connection strings. |
| skills/cosmosdb-best-practices/rules/sdk-langchain-js-fulltext-prerequisites.md | New rule documenting full-text + hybrid container prerequisites. |
| skills/cosmosdb-best-practices/rules/sdk-langchain-js-filter-injection.md | New security rule explaining parameterization to prevent filter injection. |
| skills/cosmosdb-best-practices/rules/sdk-langchain-js-embedding-model.md | New rule clarifying Azure OpenAI embedding “deployment name” config. |
| skills/cosmosdb-best-practices/rules/sdk-langchain-js-chat-history.md | New rule for persistent chat history container design and usage. |
| skills/cosmosdb-best-practices/rules/sdk-emulator-ssl.md | Expanded Java guidance for Linux (vNext) emulator HTTPS + added tags. |
| skills/cosmosdb-best-practices/rules/_sections.md | Adds frontmatter-driven section definitions, including Security. |
| skills/cosmosdb-best-practices/metadata.json | Bumps skill version to 1.1.0. |
| skills/cosmosdb-best-practices/README.md | Updates category/rule counts and adds Security category. |
| scripts/version.js | New utility to bump versions across manifests. |
| scripts/validate.js | Adds optional skill scoping and validates multiple skills. |
| scripts/compile.js | Compiles AGENTS.md per-skill; loads sections from rules/_sections.md. |
| package.json | Bumps version; adds scripts for version/skill build+validate. |
| plugin.json | Bumps version to 1.1.0. |
| gemini-extension.json | Bumps version to 1.1.0. |
| apm.yml | Bumps version; changes marketplace source path to repo root. |
| README.md | Adds skill table; adjusts contributing/eval guidance wording. |
| CONTRIBUTING.md | Updates multi-skill instructions; makes eval tasks “optional”. |
| CHANGELOG.md | Adds entry referencing new Java+Linux emulator HTTPS guidance. |
| docs/index.html | Improves SEO metadata + copy; adds marketplace links and demo link. |
| docs/styles.css | Improves integration-card layout so links align consistently. |
| .github/workflows/release.yml | New release workflow that packages skills into zip assets. |
| .github/workflows/eval.yml | Removes PR evaluation workflow. |
| .github/skills/code-review/checklist.md | Adds expanded internal review checklist documentation. |
| .github/instructions/code-review.instructions.md | Adds summarized review gates pointing to the checklist. |
| .github/PULL_REQUEST_TEMPLATE.md | Removes required eval section from PR template. |
| .cursor-plugin/plugin.json | Bumps plugin version to 1.1.0. |
| .codex-plugin/plugin.json | Bumps plugin version to 1.1.0; formats capabilities list. |
| .claude-plugin/plugin.json | Bumps plugin version to 1.1.0. |
| evals/cosmosdb-best-practices/tasks/security-client-encryption.yaml | Adds eval task for new Always Encrypted rule. |
| evals/cosmosdb-best-practices/tasks/sdk-emulator-ssl-java-linux.yaml | Adds eval task for Java + Linux (vNext) emulator HTTPS guidance. |
- name: Validate rules
  run: npm run validate || true

const metadata = JSON.parse(fs.readFileSync(path.join(SKILL_DIR, 'metadata.json'), 'utf8'));
const SECTIONS = loadSections(RULES_DIR);

let output = `# Azure Cosmos DB Best Practices

## Overview

This skill contains 111 rules across 12 categories, ordered by impact:
This skill contains 118 rules across 13 categories, ordered by impact:

| Developer Tooling | MEDIUM | Emulator and extension guidance for day-to-day work |
| Vector Search | HIGH | Semantic search and RAG-related configuration |
| Full-Text Search | HIGH | Keyword matching, BM25 ranking, and hybrid search configuration |
| Security | CRITICAL | Authentication, RBAC, network isolation, and backup configuration |

| Skill | Description | Status |
|-------|-------------|--------|
| [cosmosdb-best-practices](skills/cosmosdb-best-practices/) | Performance optimization (111 rules, 12 categories) | ✅ Stable |

npm run build

# Validate
npm run validate:skill {skill-name}

"build": "node scripts/compile.js",
"validate": "node scripts/validate.js"
"build:skill": "node scripts/compile.js",
"validate": "node scripts/validate.js",
"validate:skill": "node scripts/validate.js",
"version": "node scripts/version.js"

- 🔴 Filename: `{prefix}-{description}.md`
- 🔴 Valid prefixes: `model-`, `partition-`, `query-`, `sdk-`, `index-`, `throughput-`, `global-`, `monitoring-`, `pattern-`, `tooling-`, `vector-`

### Content Quality

- 🟡 Must be generic — applicable to any Cosmos DB app, not tied to a specific scenario
- One rule per PR for new rules (flag PRs bundling unrelated rules)

**Rule files** (`skills/*/rules/*.md`):
- Frontmatter: `title`, `impact` (CRITICAL|HIGH|MEDIUM-HIGH|MEDIUM|LOW-MEDIUM|LOW), `impactDescription`, `tags`
- Body: `**Incorrect` + `**Correct` sections with fenced code blocks
- Filename: `{prefix}-{description}.md` (model-, partition-, query-, sdk-, index-, throughput-, global-, monitoring-, pattern-, tooling-, vector-)
3f9cd85 to b0de6c2
Ah yes, you're right, my bad. The rule files were there, the build just wasn't picking them up. One more thing: we recently merged a skill split (#204) that added 13 topic-specific skills alongside the monolith. There's now a cosmosdb-security directory with its own rules/ folder containing the 5 existing security rules. Could you please: Copy your new security-client-encryption.md rule file into rules
@ramnnn2006 there are some ongoing changes being evaluated to the structure that could require this to be modified. you'll definitely get notice when it's time to make any changes to avoid merge conflicts.
@jaydestro okay pls let me know once I can sync the changes and start working |
Description
Adds a Security rule about protecting sensitive fields with Always Encrypted, Cosmos DB's client side encryption. The default encryption at rest gets mistaken for field level protection a lot, when really anyone with read access still sees plaintext. The rule covers when client side encryption is worth it (PII, healthcare, finance type data), the Key Vault setup with a client encryption policy, the tradeoffs (encrypted fields lose indexing, randomized can't be queried at all, deterministic only does equality), and a plain comparison against encryption at rest. Comes with an eval task.
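The query tradeoff is easier to see with a toy example. The sketch below uses Node's built-in `crypto` module, not the Cosmos DB SDK, and it is not the exact Always Encrypted algorithm. It assumes one common way of getting determinism, deriving the IV from the plaintext with an HMAC, and the function names and the separate `ivKey` are hypothetical.

```javascript
const crypto = require('crypto');

// Illustration only: NOT the Cosmos DB SDK and not the exact Always Encrypted algorithm.

// Randomized: a fresh random IV per call, so equal plaintexts give different ciphertexts.
function encryptRandomized(key, plaintext) {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([iv, cipher.update(plaintext, 'utf8'), cipher.final()]).toString('base64');
}

// Deterministic: the IV is derived from the plaintext (assumed scheme), so equal
// plaintexts always give the same ciphertext.
function encryptDeterministic(key, ivKey, plaintext) {
  const iv = crypto.createHmac('sha256', ivKey).update(plaintext, 'utf8').digest().subarray(0, 16);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([iv, cipher.update(plaintext, 'utf8'), cipher.final()]).toString('base64');
}

// Both variants store the IV in the first 16 bytes, so one decrypt function covers both.
function decrypt(key, encoded) {
  const raw = Buffer.from(encoded, 'base64');
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, raw.subarray(0, 16));
  return Buffer.concat([decipher.update(raw.subarray(16)), decipher.final()]).toString('utf8');
}
```

Because the deterministic ciphertext for a given value is stable, a client can encrypt the filter value and compare ciphertexts, which is why only equality works. Range, ordering, and substring queries need the plaintext, and randomized ciphertexts never match each other, so the service has nothing to compare.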
Type of Change
Checklist
- `npm run validate` and it passed
- `npm run build` to regenerate AGENTS.md (if adding/updating rules)
- `{prefix}-{description}.md`
- `title`, `impact`, `tags`
Tests (Required)
- `evals/cosmosdb-best-practices/tasks/`
- `waza run evals/cosmosdb-best-practices/eval.yaml` and all tasks pass
- `id`, `name`, `description`, `inputs.prompt`, and `expected.outcomes`
Eval task file: `evals/cosmosdb-best-practices/tasks/security-client-encryption.yaml`
For New Rules
Rule file: `skills/cosmosdb-best-practices/rules/security-client-encryption.md`
Category: Security
Impact level: Medium
Why is this rule important?
People see "encrypted at rest by default" and assume their PII is covered, but that encryption is transparent, so every authorized reader (portal included) gets plaintext back. For regulated data the right tool is Always Encrypted, where fields are encrypted in the SDK before leaving the app and the keys sit in your own Key Vault. It also has real tradeoffs around indexing and querying that you have to plan for at container creation, so agents should know both when to suggest it and what it costs.
Agent Testing
Related Issues
Closes #173
Additional Notes
One thing I noticed while building: scripts/compile.js has a hardcoded section list that ends at vector-, so security- rules (this one and the 5 already in the repo) never make it into the compiled AGENTS.md. Didn't touch the script here to keep this PR to one rule, but happy to send a separate fix if you want. Also, the 21 validate errors that show up are pre-existing on main, not from this change.
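A small check along these lines could surface this kind of gap at build or validate time. This is a sketch and not part of the repo's scripts: `findUncompiledRules` is a hypothetical helper, the section list is a shortened stand-in for one that ends at vector, and `vector-index-type.md` is a made-up filename.

```javascript
// Returns rule filenames whose prefix matches none of the configured section prefixes.
// Files starting with "_" (such as _sections.md) are treated as non-rule files, which is an assumption.
function findUncompiledRules(ruleFiles, sectionPrefixes) {
  return ruleFiles
    .filter((name) => name.endsWith('.md') && !name.startsWith('_'))
    .filter((name) => !sectionPrefixes.some((prefix) => name.startsWith(`${prefix}-`)));
}

const orphans = findUncompiledRules(
  [
    'vector-index-type.md', // hypothetical filename
    'security-client-encryption.md',
    'security-rbac-least-privilege.md',
    '_sections.md',
  ],
  ['model', 'query', 'vector'] // hypothetical section list that ends at vector
);

console.log(orphans);
// [ 'security-client-encryption.md', 'security-rbac-least-privilege.md' ]
```

Failing the build when this list is non-empty would turn a silent omission into a visible error.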