@Rishabh4275
Contributor

@Rishabh4275 Rishabh4275 commented Oct 31, 2025

Python Purview: Add Caching and Background Processing

Motivation and Context

This PR enhances the Python Purview integration with performance optimizations and reliability improvements that address follow-up opportunities identified in PR #1142.

Why is this change required?

  • Protection scope computation can be expensive and repetitive, impacting performance
  • Content activities and offline policy evaluations should not block the main execution flow
  • 402 Payment Required errors should be handled gracefully to avoid repeated failed API calls

What problem does it solve?

  1. Performance: Reduces redundant API calls to Microsoft Graph by caching protection scope responses
  2. Responsiveness: Moves operations that do not need to block the caller (content activities, offline evaluations) to background tasks
  3. Resilience: Prevents cascading failures from licensing issues through error caching
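
The error-caching idea in item 3 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `PaymentRequiredError`, `ErrorCachingClient`, the `fetch` callable, and the TTL default are hypothetical names and values chosen for the example.

```python
import time


class PaymentRequiredError(Exception):
    """Stand-in for a 402 Payment Required failure (hypothetical name)."""


class ErrorCachingClient:
    """Remembers a 402 failure per key so the backend is not called again until a TTL expires."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch        # callable that performs the real API call
        self._ttl = ttl_seconds    # assumed default; pick a value that suits the service
        self._clock = clock        # injectable so the TTL can be tested without sleeping
        self._failed_until = {}    # key -> time after which a retry is allowed

    def get(self, key):
        deadline = self._failed_until.get(key)
        if deadline is not None:
            if self._clock() < deadline:
                # Fail fast from the cached error instead of calling the API again.
                raise PaymentRequiredError(f"cached 402 for {key!r}")
            # The cached failure has expired, so allow a fresh attempt.
            del self._failed_until[key]
        try:
            return self._fetch(key)
        except PaymentRequiredError:
            # Record the failure so later calls are short-circuited until the TTL passes.
            self._failed_until[key] = self._clock() + self._ttl
            raise
```

With this shape, a tenant without the required license triggers one real call per TTL window rather than one per request.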

What scenario does it contribute to?
This change improves the production readiness of Purview middleware by making it more performant and resilient for high-throughput conversational AI applications that enforce data governance policies.

Description

This PR adds comprehensive caching and background processing capabilities to the Python Purview integration.

  • Caching System
  • Background Processing:
      ◦ Content activities are submitted asynchronously without blocking agent execution
      ◦ Offline policy evaluations are queued in the background for non-inline execution modes
      ◦ Graceful error handling for background tasks with proper logging
  • Added correlationId in the serviceCalls and responses
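
A rough sketch of the background-processing pattern described above, assuming plain `asyncio`. The `BackgroundRunner` class and the logger name are hypothetical and only illustrate the approach; the PR's actual processor may be structured differently.

```python
import asyncio
import logging

logger = logging.getLogger("purview.background")  # illustrative logger name


class BackgroundRunner:
    """Schedules coroutines without awaiting them and logs any failure."""

    def __init__(self):
        # Hold strong references; the event loop only keeps weak ones to tasks.
        self._tasks = set()

    def submit(self, coro):
        task = asyncio.create_task(coro)
        self._tasks.add(task)
        task.add_done_callback(self._on_done)
        return task

    def _on_done(self, task):
        self._tasks.discard(task)
        if task.cancelled():
            return
        exc = task.exception()
        if exc is not None:
            # Log instead of raising so a failed background call never breaks the caller.
            logger.error("Background task failed: %r", exc)

    async def drain(self):
        """Wait for outstanding tasks, for example during shutdown or in tests."""
        if self._tasks:
            await asyncio.gather(*self._tasks, return_exceptions=True)
```

`submit` returns immediately, so the agent's main flow continues while the content activity or offline evaluation completes in the background. It must be called from within a running event loop.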

Key Implementation Details:

  • Cache keys use SHA256 hashing of normalized request JSON
  • Thread-safe operations for cache size tracking
  • LRU-style eviction when size limits are exceeded
  • Background tasks use Python's asyncio.create_task for non-blocking execution
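
The cache-key and eviction points above can be illustrated with a small sketch. The names `make_cache_key` and `SizeBoundedCache` are hypothetical, and using `threading.Lock` and `sys.getsizeof` for size tracking is an assumption made for this example rather than a description of `_cache.py`.

```python
import hashlib
import json
import sys
import threading
from collections import OrderedDict


def make_cache_key(request: dict) -> str:
    """Hash a normalized JSON form so equal requests map to the same key."""
    # sort_keys and fixed separators make the serialization independent of dict order.
    normalized = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SizeBoundedCache:
    """In-memory cache that evicts least recently used entries once a byte budget is exceeded."""

    def __init__(self, max_bytes: int):
        self._max_bytes = max_bytes
        self._entries = OrderedDict()  # key -> (value, size), oldest first
        self._size = 0
        self._lock = threading.Lock()  # guards both the entries and the size counter

    def set(self, key, value, size=None):
        # sys.getsizeof is a shallow estimate; callers may pass an explicit size.
        size = sys.getsizeof(value) if size is None else size
        with self._lock:
            if key in self._entries:
                self._size -= self._entries.pop(key)[1]
            self._entries[key] = (value, size)
            self._size += size
            # Evict from the least recently used end until the budget is respected.
            while self._size > self._max_bytes and self._entries:
                _, (_, evicted_size) = self._entries.popitem(last=False)
                self._size -= evicted_size

    def get(self, key):
        with self._lock:
            if key not in self._entries:
                return None
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key][0]
```

Note that in this sketch an entry larger than the whole budget is evicted immediately after insertion, which keeps the size invariant simple.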

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? No - all changes are backward compatible with default settings

Copilot AI review requested due to automatic review settings October 31, 2025 20:29
@markwallace-microsoft added the documentation and python labels Oct 31, 2025
@github-actions bot changed the title from "[Python][Purview] Add Caching and background processing in Python Purview Middleware" to "Python: [Python][Purview] Add Caching and background processing in Python Purview Middleware" Oct 31, 2025
@markwallace-microsoft
Member

markwallace-microsoft commented Oct 31, 2025

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| packages/purview/agent_framework_purview | | | | |
| _cache.py | 71 | 5 | 92% | 82, 84–86, 88 |
| _client.py | 120 | 14 | 88% | 110, 129–132, 150, 178–179, 186, 190–191, 195–197 |
| _exceptions.py | 8 | 0 | 100% | |
| _middleware.py | 94 | 11 | 88% | 69–71, 96, 98–100, 104, 164, 188, 192 |
| _models.py | 456 | 96 | 78% | 222–226, 308, 310, 332, 334, 338, 365, 369, 416–421, 424–431, 442–445, 456–459, 489–490, 493, 495, 497–499, 533, 535, 537, 539, 541, 543, 545, 550–552, 554, 557, 563, 566, 605, 607, 609, 611, 613, 621, 623, 625, 627, 661, 665, 700, 702, 704, 708, 714, 716, 738–739, 764, 766, 768, 772, 791–793, 798–799, 832, 834, 846, 909, 912–916, 918, 946, 977, 979 |
| _processor.py | 166 | 22 | 86% | 161, 234, 237–241, 243, 245–246, 248–249, 267–268, 271–274, 276, 282, 284, 337 |
| _settings.py | 38 | 0 | 100% | |
| TOTAL | 13035 | 1939 | 85% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 1740 | 107 💤 | 0 ❌ | 0 🔥 | 33.413s ⏱️ |

Contributor

Copilot AI left a comment

Pull Request Overview

This PR implements caching, background processing, and enhanced exception handling for the Purview integration. The main changes include:

  • Added protection scopes response caching with configurable TTL and size limits
  • Implemented background task processing for content activities and offline policy evaluation
  • Added support for execution modes (inline vs. offline) and payment required error handling
  • Removed the process_inline setting in favor of dynamic execution mode detection
  • Enhanced exception handling with ignore_exceptions and ignore_payment_required configuration options

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 12 comments.

| File | Description |
|---|---|
| agent_framework_purview/_cache.py | New cache provider implementation with in-memory caching |
| agent_framework_purview/_processor.py | Added caching, background processing, and execution mode handling |
| agent_framework_purview/_settings.py | Removed process_inline, added cache settings and exception handling flags |
| agent_framework_purview/_client.py | Added scope identifier and Prefer header handling, 402 error support |
| agent_framework_purview/_middleware.py | Enhanced exception handling for payment required and general errors |
| agent_framework_purview/_exceptions.py | Added PurviewPaymentRequiredError exception |
| agent_framework_purview/_models.py | Added ExecutionMode enum and scope_identifier to requests |
| tests/test_*.py | Updated tests for new functionality and behavior changes |
| README.md | Updated documentation with new features and configuration options |

Comments suppressed due to low confidence (2)

python/packages/purview/agent_framework_purview/_cache.py:13

  • The global variable 'T' is not used.
T = TypeVar("T")

python/packages/purview/agent_framework_purview/_cache.py:83

  • This import of module json is redundant, as it was previously imported on line 7.
            import json

@Rishabh4275 Rishabh4275 marked this pull request as draft October 31, 2025 20:43
@Rishabh4275 force-pushed the users/richawla/psdk branch 2 times, most recently from f0addc7 to 7153e3d November 1, 2025 02:42
@Rishabh4275 Rishabh4275 marked this pull request as ready for review November 1, 2025 02:43
@Rishabh4275 changed the title from "Python: [Python][Purview] Add Caching and background processing in Python Purview Middleware" to "Python: [Purview] Add Caching and background processing in Python Purview Middleware" Nov 1, 2025
Member

@eavanvalkenburg eavanvalkenburg left a comment

Other than some formatting, this looks good. However, I would like to see some samples with caching and a custom cache provider.

@Rishabh4275 force-pushed the users/richawla/psdk branch 3 times, most recently from f27c9c3 to 7c11166 November 7, 2025 01:17
@dmytrostruk dmytrostruk added this pull request to the merge queue Nov 7, 2025
Merged via the queue into microsoft:main with commit 64826b8 Nov 7, 2025
24 checks passed
