alzarei commented Sep 25, 2025

.NET: Add LINQ-based ITextSearch interface and deprecate legacy ITextSearch (#10456)

Summary

This PR implements Option 3 from the architectural decision process for Issue #10456: introduces a new generic ITextSearch<TRecord> interface with type-safe LINQ filtering while maintaining the legacy ITextSearch interface marked as [Obsolete] for backward compatibility.

Zero breaking changes - existing code continues working unchanged.

What Changed

New Generic Interface (Recommended Path)

public interface ITextSearch<TRecord>
{
    Task<KernelSearchResults<string>> SearchAsync(
        string query, 
        TextSearchOptions<TRecord>? searchOptions = null, 
        CancellationToken cancellationToken = default);
    
    // + GetTextSearchResultsAsync and GetSearchResultsAsync methods
}

// Type-safe LINQ filtering with IntelliSense
var options = new TextSearchOptions<CorporateDocument>
{
    Filter = doc => doc.Department == "HR" && 
                   doc.IsActive && 
                   doc.CreatedDate > DateTime.Now.AddYears(-2)
};

Benefits:

  • ✅ Compile-time type safety
  • ✅ IntelliSense support for property names
  • ✅ Full LINQ expression support
  • ✅ No RequiresDynamicCode attributes
  • ✅ AOT-compatible (simple equality/comparison patterns)
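
To make the type-safety point concrete, here is a minimal sketch that does not use the Semantic Kernel types. It assumes a hypothetical CorporateDocument record with the three properties used in the filter above, and the FilterDemo helper is illustrative only. It shows that the filter is an ordinary Expression<Func<T, bool>> whose tree can be inspected, and that a misspelled property name would fail at compile time rather than at run time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical record type; the property names mirror the filter example above.
public sealed record CorporateDocument(string Department, bool IsActive, DateTime CreatedDate);

public static class FilterDemo
{
    // Same shape as the filter in the PR description, with the cutoff passed in.
    public static Expression<Func<CorporateDocument, bool>> BuildFilter(DateTime cutoff) =>
        doc => doc.Department == "HR" && doc.IsActive && doc.CreatedDate > cutoff;

    // Applies the filter to in-memory data by compiling the expression tree.
    // A real connector would translate the tree instead of compiling it.
    public static List<CorporateDocument> Apply(
        IEnumerable<CorporateDocument> docs,
        Expression<Func<CorporateDocument, bool>> filter) =>
        docs.Where(filter.Compile()).ToList();

    public static void Main()
    {
        var now = new DateTime(2025, 9, 25);
        var docs = new[]
        {
            new CorporateDocument("HR", true, now.AddYears(-1)),
            new CorporateDocument("HR", false, now.AddYears(-1)),
            new CorporateDocument("Finance", true, now.AddYears(-1)),
            new CorporateDocument("HR", true, now.AddYears(-3)),
        };

        var filter = BuildFilter(now.AddYears(-2));
        Console.WriteLine(filter.Body.NodeType);       // AndAlso
        Console.WriteLine(Apply(docs, filter).Count);  // 1
    }
}
```

Only the first document satisfies all three conditions; the others fail on IsActive, Department, or CreatedDate respectively.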

Legacy Interface (Deprecated)

[Obsolete("Use ITextSearch<TRecord> with LINQ-based filtering instead. This interface will be removed in a future version.")]
public interface ITextSearch
{
    Task<KernelSearchResults<string>> SearchAsync(
        string query, 
        TextSearchOptions? searchOptions = null, 
        CancellationToken cancellationToken = default);
}

// Legacy clause-based filtering (still works)
var options = new TextSearchOptions 
{ 
    Filter = new TextSearchFilter().Equality("Department", "HR") 
};

Migration Message: Users see a deprecation warning (CS0618) directing them to the modern ITextSearch<TRecord> interface with LINQ filtering.

Implementation Details

Dual-Path Architecture

VectorStoreTextSearch<TRecord> implements both interfaces with independent code paths:

Legacy Path (Non-Generic):

async IAsyncEnumerable<VectorSearchResult<TRecord>> ExecuteVectorSearchAsync(
    string query, TextSearchOptions options)
{
    var vectorOptions = new VectorSearchOptions<TRecord>
    {
        #pragma warning disable CS0618 // VectorSearchFilter is obsolete
        OldFilter = options.Filter?.FilterClauses != null 
            ? new VectorSearchFilter(options.Filter.FilterClauses) 
            : null
        #pragma warning restore CS0618
    };
    // ... execute search
}

Modern Path (Generic):

async IAsyncEnumerable<VectorSearchResult<TRecord>> ExecuteVectorSearchAsync(
    string query, TextSearchOptions<TRecord> options)
{
    var vectorOptions = new VectorSearchOptions<TRecord>
    {
        Filter = options.Filter  // Direct LINQ passthrough
    };
    // ... execute search
}

Key Characteristics:

  • Two independent methods (no translation layer, no conversion overhead)
  • Legacy path uses obsolete VectorSearchFilter with pragma suppressions (temporary during transition)
  • Modern path uses LINQ expressions directly (no obsolete APIs)
  • Both paths are AOT-compatible (no dynamic code generation)
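
The dual-path idea can be sketched end to end with simplified stand-ins. Everything below is hypothetical (ISketchSearch, ISketchSearch<TRecord>, SketchSearch, Doc) and synchronous for brevity; it is not the VectorStoreTextSearch code. The legacy clause is reduced to a single (field, value) pair, and the modern path compiles the expression only because the data is in memory, whereas the PR forwards the expression to VectorSearchOptions<TRecord>.Filter.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public sealed record Doc(string Department, string Text);

// Hypothetical stand-in for the legacy, non-generic interface.
[Obsolete("Use ISketchSearch<TRecord> with LINQ-based filtering instead.")]
public interface ISketchSearch
{
    IReadOnlyList<string> Search(string query, (string Field, string Value)? equality = null);
}

// Hypothetical stand-in for the new generic interface.
public interface ISketchSearch<TRecord>
{
    IReadOnlyList<string> Search(string query, Expression<Func<TRecord, bool>>? filter = null);
}

#pragma warning disable CS0618 // implementing the obsolete interface is intentional
public sealed class SketchSearch : ISketchSearch, ISketchSearch<Doc>
{
    private readonly IReadOnlyList<Doc> _docs;
    public SketchSearch(IReadOnlyList<Doc> docs) => _docs = docs;

    private IEnumerable<Doc> Match(string query) =>
        _docs.Where(d => d.Text.Contains(query, StringComparison.OrdinalIgnoreCase));

    // Legacy path: interprets the clause directly; no expression tree is built.
    IReadOnlyList<string> ISketchSearch.Search(string query, (string Field, string Value)? equality)
    {
        IEnumerable<Doc> hits = Match(query);
        if (equality.HasValue)
        {
            var (field, value) = equality.Value;
            if (field != nameof(Doc.Department))
                throw new NotSupportedException($"Unsupported field '{field}'.");
            hits = hits.Where(d => d.Department == value);
        }
        return hits.Select(d => d.Text).ToList();
    }

    // Modern path: uses the caller's expression as-is, with no translation step.
    IReadOnlyList<string> ISketchSearch<Doc>.Search(string query, Expression<Func<Doc, bool>>? filter)
    {
        IEnumerable<Doc> hits = Match(query);
        if (filter is not null) hits = hits.Where(filter.Compile());
        return hits.Select(d => d.Text).ToList();
    }
}
#pragma warning restore CS0618

public static class Program
{
    public static void Main()
    {
        var search = new SketchSearch(new[] { new Doc("HR", "Leave policy"), new Doc("IT", "Laptop policy") });

        ISketchSearch<Doc> modern = search;
        Console.WriteLine(string.Join(", ", modern.Search("policy", d => d.Department == "HR")));

#pragma warning disable CS0618 // callers of the legacy interface get the deprecation warning
        ISketchSearch legacy = search;
        Console.WriteLine(string.Join(", ", legacy.Search("policy", ("Department", "HR"))));
#pragma warning restore CS0618
    }
}
```

Explicit interface implementations are used here so the two overloads never compete during overload resolution; that is a choice made for the sketch, not a statement about how the PR resolves it.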

Files Changed

Interfaces & Options

  • ITextSearch.cs: Added ITextSearch<TRecord> interface, marked legacy ITextSearch as [Obsolete]
  • TextSearchOptions.cs: Added generic TextSearchOptions<TRecord> class

Implementation

  • VectorStoreTextSearch.cs: Implemented dual interface pattern (~30 lines for both paths)

Backward Compatibility (Pragma Suppressions)

Added #pragma warning disable CS0618 to 27 files that use the obsolete interface:

Production (11 files):

  • Web search connectors (Bing, Google, Brave, Tavily)
  • Extension methods (WebServiceCollectionExtensions, TextSearchExtensions)
  • Core implementations (TextSearchProvider, TextSearchStore, VectorStoreTextSearch)

Tests/Samples (16 files):

  • Integration tests (Agents, AzureAISearch, InMemory, Qdrant, Web plugins)
  • Unit tests (Bing, Brave, Google, Tavily)
  • Sample tutorials (Step1_Web_Search, Step2_Search_For_RAG)
  • Mock implementations

Tests

  • Added 7 new tests for LINQ filtering scenarios
  • Maintained 10 existing legacy tests (unchanged)
  • Added DataModelWithTags to test base for collection filtering

Validation Results

  • Build: 0 errors, 0 warnings with --warnaserror
  • Tests: 1,581/1,581 passed (100%)
  • Format: Clean
  • AOT Compatibility: All checks passed
  • CI/CD: Run #29857 succeeded

Breaking Changes

None. This is a non-breaking addition:

  • Legacy ITextSearch interface continues working (marked [Obsolete])
  • Existing implementations (Bing, Google, Azure AI Search) unchanged
  • Migration to ITextSearch<TRecord> is opt-in; the deprecation warning only points users to the new interface

Multi-PR Context

This is PR 2 of 6 in the structured implementation for Issue #10456:

  • PR1 ✅: Generic interfaces foundation
  • PR2 ← YOU ARE HERE: Dual interface pattern + deprecation
  • PR3-PR6: Connector migrations (Bing, Google, Brave, Azure AI Search)

Architectural Decision

Option 3 Approved by Mark Wallace and Westey-m:

"We typically follow the pattern of obsoleting the old API when we introduce the new pattern. This avoids breaking changes which are very disruptive for projects that have a transient dependency." - Mark Wallace

"I prefer a clean separation between the old and new abstractions. Being able to obsolete the old ones and point users at the new ones is definitely valuable." - Westey-m

Options Considered:

  1. Native LINQ Only: Replace TextSearchFilter entirely (breaking change)
  2. Translation Layer: Convert TextSearchFilter to LINQ internally (RequiresDynamicCode cascade, AOT issues)
  3. Dual Interface ✅: Add ITextSearch<TRecord> + deprecate legacy (no breaking changes, clean separation)

See ADR comments in conversation for detailed architectural analysis.

Migration Guide

Before (Legacy - Now Obsolete):

ITextSearch search = ...;
var options = new TextSearchOptions 
{ 
    Filter = new TextSearchFilter()
        .Equality("Department", "HR")
        .Equality("IsActive", "true")
};
var results = await search.SearchAsync("query", options);

After (Modern - Recommended):

ITextSearch<CorporateDocument> search = ...;
var options = new TextSearchOptions<CorporateDocument> 
{ 
    Filter = doc => doc.Department == "HR" && doc.IsActive
};
var results = await search.SearchAsync("query", options);

Next Steps

PR3-PR6 will migrate connector implementations (Bing, Google, Brave, Azure AI Search) to use ITextSearch<TRecord> with LINQ filtering, demonstrating the modern pattern while maintaining backward compatibility.

…obsolete VectorSearchFilter

- Replace obsolete VectorSearchFilter conversion with direct LINQ filtering for simple equality filters
- Add ConvertTextSearchFilterToLinq() method to handle TextSearchFilter.Equality() cases
- Fall back to legacy approach only for complex filters that cannot be converted
- Eliminates technical debt and performance overhead identified in Issue microsoft#10456
- Maintains 100% backward compatibility - all existing tests pass (1,574/1,574)
- Reduces object allocations and removes obsolete API warnings for common filtering scenarios

Addresses Issue microsoft#10456 - PR 2: VectorStoreTextSearch internal modernization
@moonbox3 moonbox3 added the .NET Issue or Pull requests regarding .NET code label Sep 25, 2025
@alzarei alzarei marked this pull request as ready for review September 25, 2025 09:06
@alzarei alzarei requested a review from a team as a code owner September 25, 2025 09:06
@alzarei alzarei force-pushed the feature-text-search-linq-pr2 branch from 0e78309 to 3c9fc7b Compare September 26, 2025 05:44
@alzarei alzarei closed this Sep 26, 2025
@alzarei alzarei deleted the feature-text-search-linq-pr2 branch September 26, 2025 05:46
@alzarei alzarei restored the feature-text-search-linq-pr2 branch September 26, 2025 05:49
@alzarei alzarei deleted the feature-text-search-linq-pr2 branch September 26, 2025 05:52
@alzarei alzarei restored the feature-text-search-linq-pr2 branch September 26, 2025 05:56
@alzarei alzarei reopened this Sep 26, 2025
…pliance

- Replace broad catch-all exception handling with specific exception types
- Add comprehensive exception handling for reflection operations in CreateEqualityExpression:
  * ArgumentNullException for null parameters
  * ArgumentException for invalid property names or expression parameters
  * InvalidOperationException for invalid property access or operations
  * TargetParameterCountException for lambda expression parameter mismatches
  * MemberAccessException for property access permission issues
  * NotSupportedException for unsupported operations (e.g., byref-like parameters)
- Maintain intentional catch-all Exception handler with #pragma warning disable CA1031
- Preserve backward compatibility by returning null for graceful fallback
- Add clear documentation explaining exception handling rationale
- Addresses CA1031 code analysis warning while maintaining robust error handling
- All tests pass (1,574/1,574) and formatting compliance verified

- Add InvalidPropertyFilterThrowsExpectedExceptionAsync: Validates that new LINQ
  filtering creates expressions correctly and passes them to vector store connectors
- Add ComplexFiltersUseLegacyBehaviorAsync: Tests graceful fallback for complex
  filter scenarios when LINQ conversion returns null
- Add SimpleEqualityFilterUsesModernLinqPathAsync: Confirms end-to-end functionality
  of the new LINQ filtering optimization for simple equality filters

Analysis:
- All 15 VectorStoreTextSearch tests pass (3 new + 12 existing)
- All 85 TextSearch tests pass, confirming no regressions
- Tests prove the new ConvertTextSearchFilterToLinq() and CreateEqualityExpression()
  methods work correctly
- Exception from InMemory connector in invalid property test confirms LINQ path is
  being used instead of fallback behavior
- Improves edge case coverage for the filtering modernization introduced in previous commits
@moonbox3 moonbox3 added the kernel Issues or pull requests impacting the core kernel label Sep 28, 2025
- Add NullFilterReturnsAllResultsAsync test to verify behavior when no filter is applied
- Remove unnecessary Microsoft.Extensions.VectorData using statement
- Enhance test coverage for VectorStoreTextSearch edge cases
…INQ filtering

- Extend ConvertTextSearchFilterToLinq to handle AnyTagEqualToFilterClause
- Add CreateAnyTagEqualToExpression for collection.Contains() operations
- Add CreateMultipleClauseExpression for AND logic with Expression.AndAlso
- Add 4 comprehensive tests for new filtering capabilities
- Add RequiresDynamicCode attributes for AOT compatibility
- Maintain backward compatibility with graceful fallback

Fixes microsoft#10456
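
For readers unfamiliar with the technique in the commit above, here is a minimal sketch of combining equality clauses with Expression.AndAlso. It is not the PR's ConvertTextSearchFilterToLinq code; ClauseTranslator, ToLinq, and Doc are hypothetical names, and clauses are reduced to (property name, value) pairs. Looking up a property by name uses reflection, which is the kind of operation that draws trimming and AOT warnings.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical record type used only for this sketch.
public sealed record Doc(string Department, string Region);

public static class ClauseTranslator
{
    // Builds: record => record.P1 == v1 && record.P2 == v2 && ...
    public static Expression<Func<TRecord, bool>> ToLinq<TRecord>(
        IReadOnlyList<KeyValuePair<string, object>> clauses)
    {
        if (clauses.Count == 0)
            throw new ArgumentException("At least one clause is required.", nameof(clauses));

        ParameterExpression record = Expression.Parameter(typeof(TRecord), "record");
        Expression? body = null;

        foreach (var clause in clauses)
        {
            // Throws ArgumentException if TRecord has no property with this name.
            MemberExpression property = Expression.Property(record, clause.Key);
            BinaryExpression equal = Expression.Equal(
                property, Expression.Constant(clause.Value, property.Type));

            // Chain clauses with a short-circuiting logical AND.
            body = body is null ? equal : Expression.AndAlso(body, equal);
        }

        return Expression.Lambda<Func<TRecord, bool>>(body!, record);
    }
}

public static class Program
{
    public static void Main()
    {
        var filter = ClauseTranslator.ToLinq<Doc>(new[]
        {
            new KeyValuePair<string, object>("Department", "HR"),
            new KeyValuePair<string, object>("Region", "EU"),
        });

        Func<Doc, bool> predicate = filter.Compile();
        Console.WriteLine(predicate(new Doc("HR", "EU"))); // True
        Console.WriteLine(predicate(new Doc("HR", "US"))); // False
    }
}
```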

Fixes IL3051 compilation errors by adding RequiresDynamicCode attributes to:
- SearchAsync(string, TextSearchOptions<TRecord>?, CancellationToken)
- GetTextSearchResultsAsync(string, TextSearchOptions<TRecord>?, CancellationToken)
- GetSearchResultsAsync(string, TextSearchOptions<TRecord>?, CancellationToken)

The generic ITextSearch<TRecord> interface accepts LINQ expressions via
TextSearchOptions<TRecord>.Filter, which requires dynamic code generation
for expression tree processing. This change ensures interface methods
match their implementations' RequiresDynamicCode attributes.

Resolves: Issue microsoft#10456 IL3051 interface mismatch errors
Cherry-pick-safe: Interface-only change, no implementation logic

- Fix CA1859: Use specific return types BinaryExpression? and MethodCallExpression?
  instead of generic Expression? for better performance
- Improve test model: Use IReadOnlyList<string> instead of string[] for Tags property
  to follow .NET collection best practices

These changes address code analyzer warnings and apply applicable reviewer feedback
from other PRs in the Issue microsoft#10456 modernization series.

- Remove LINQ dependency from non-generic ITextSearch interface
- Revert non-generic methods to direct VectorSearchFilter usage
- Eliminates IL3051 warnings by avoiding RequiresDynamicCode on non-generic interface
- Preserves backward compatibility with legacy TextSearchFilter path
- Maintains modern LINQ expressions for generic ITextSearch<TRecord> interface

Architectural separation:
- Non-generic: TextSearchOptions → VectorSearchFilter (legacy path)
- Generic: TextSearchOptions<TRecord> → Expression<Func<TRecord, bool>> (LINQ path)

Resolves remaining IL3051 compilation errors while maintaining Issue microsoft#10456 objectives.
alzarei commented Oct 17, 2025

ADR1 (PR2)

status: superseded-by-option-3 (ADR2)
contact: @alzarei
date: 2025-10-17
deciders: architecture-team
consulted: @westey-m, @roji
informed: @markwallace-microsoft

RequiresDynamicCode on ITextSearch Interface

Context and Problem Statement

The VectorStoreTextSearch implementation of ITextSearch<TRecord> processes LINQ expressions directly, which requires RequiresDynamicCode attributes on its methods. However, the interface definition lacks these attributes, causing 31 IL3051 compilation errors:

IL3051: Member 'VectorStoreTextSearch.SearchAsync(...)' with 'RequiresDynamicCodeAttribute' 
implements interface member 'ITextSearch<TRecord>.SearchAsync(...)' without 'RequiresDynamicCodeAttribute'. 
Annotations must match across all interface implementations or overrides.

Issue #10456 aims to eliminate technical debt by modernizing from legacy TextSearchFilter to direct LINQ processing, but this creates an architectural mismatch between interface contracts and implementation requirements.
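
The mismatch can be reproduced in isolation. The sketch below uses hypothetical IUnannotatedSearch and IAnnotatedSearch interfaces (not the Semantic Kernel ones) and assumes the project has the AOT analyzer enabled. Under that assumption, the implementation's annotation should be reported as IL3051 against the unannotated interface and accepted against the annotated one. Without the analyzer the code compiles and runs without any diagnostic.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;
using System.Linq.Expressions;

// Interface member without the annotation: the implementation below mismatches it.
public interface IUnannotatedSearch
{
    string Search(string query);
}

// Interface member with the annotation: the implementation below matches it.
public interface IAnnotatedSearch
{
    [RequiresDynamicCode("Implementations may build expression trees at run time.")]
    string Search(string query);
}

public sealed class ExpressionBackedSearch : IUnannotatedSearch, IAnnotatedSearch
{
    // The attribute documents that callers should expect an AOT warning.
    [RequiresDynamicCode("Builds an expression tree at run time.")]
    public string Search(string query)
    {
        // Stand-in for run-time expression building: q => q.Length
        ParameterExpression q = Expression.Parameter(typeof(string), "q");
        Func<string, int> length =
            Expression.Lambda<Func<string, int>>(Expression.Property(q, "Length"), q).Compile();
        return $"{query}:{length(query)}";
    }
}

public static class Program
{
    public static void Main()
    {
        // With the analyzer on, calling an annotated member from unannotated
        // code is itself flagged, which is how the requirement cascades upward.
        IAnnotatedSearch search = new ExpressionBackedSearch();
        Console.WriteLine(search.Search("abc")); // abc:3
    }
}
```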

Decision Drivers

Considered Options

Option A: Add RequiresDynamicCode to Interface (Recommended)

public interface ITextSearch<TRecord>
{
    [RequiresDynamicCode("LINQ filtering requires dynamic code generation for expression trees.")]
    Task<KernelSearchResults<string>> SearchAsync(string query, TextSearchOptions<TRecord>? searchOptions = null, CancellationToken cancellationToken = default);
    // ... other methods
}

Advantages:

  • Resolves IL3051 compilation errors
  • Interface contract matches implementation requirements
  • Enables direct LINQ processing in VectorStoreTextSearch
  • Documents AOT constraints accurately

Disadvantages:

  • Interface metadata change (runtime behavior unchanged)
  • Restricts AOT compilation scenarios

Option B: Convert VectorStoreTextSearch to Adapter Pattern

Force VectorStoreTextSearch to use LINQ → Legacy conversion like other implementations.

Advantages:

  • No interface changes required
  • Maintains AOT compatibility

Disadvantages:

Decision

Option A: Add RequiresDynamicCode to ITextSearch interface methods

Rationale

  1. Interface Contract: TextSearchOptions<TRecord>.Filter is Expression<Func<TRecord, bool>>? - dynamic code generation is inherent to the contract
  2. Performance: VectorStoreTextSearch is the primary implementation and benefits from direct LINQ processing
  3. Precedent: Entity Framework uses same pattern - RequiresDynamicCode on interfaces with LINQ contracts
  4. Compatibility: Adapter implementations can continue existing patterns with minimal changes

Implementation Impact

Immediate:

  • Resolves 31 IL3051 compilation errors
  • Enables direct LINQ processing in VectorStoreTextSearch
  • No runtime breaking changes for consumers

Implementation Requirements:

  • Add RequiresDynamicCode attribute to interface methods
  • Existing adapter implementations add attribute (reflects current behavior)
  • Update XML documentation for AOT implications

References

alzarei commented Oct 20, 2025

ADR2 (PR2)

status: accepted (decision makers: @markwallace-microsoft, @westey-m; date: 2025-10-23)
contact: @alzarei
date: 2025-10-17
deciders: architecture-team
consulted: @westey-m, @roji, @markwallace-microsoft
informed:

Dual Interface Pattern for Text Search Filtering

Context and Problem Statement

Issue #10456 aims to modernize text search filtering by adding LINQ-based type safety. However, analysis revealed that forcing both generic and non-generic interfaces through LINQ conversion creates unnecessary technical debt and architectural violations.

Decision

Implement Dual Interface Pattern with Clear Separation:

Generic Interface: Modern LINQ Path

ITextSearch<TRecord> → TextSearchOptions<TRecord> → Expression<Func<TRecord, bool>> → Direct LINQ
  • Uses RequiresDynamicCode (inherent to LINQ expressions)
  • Provides compile-time type safety and IntelliSense
  • Targets modern applications requiring type safety

Non-Generic Interface: Legacy Compatibility Path

ITextSearch → TextSearchOptions → TextSearchFilter → VectorSearchFilter → Legacy Processing
  • No RequiresDynamicCode needed (no LINQ conversion)
  • Maintains backward compatibility for existing applications
  • Direct path to obsolete API for compatibility

Rationale

  1. Technical Debt Clarity: The legacy path is preserved for compatibility while a modern alternative is provided
  2. Performance: Each interface uses its optimal processing path without conversion overhead
  3. Architecture Separation: Clear distinction between the legacy and modern approaches
  4. IL3051 Resolution: The non-generic interface doesn't need LINQ, which eliminates the warnings
  5. Developer Choice: Teams can migrate to the generic interface when ready

Implementation Impact

  • Non-generic methods: Revert to direct VectorSearchFilter usage (original baseline approach)
  • Generic methods: Continue using direct LINQ expressions
  • No breaking changes: Existing code continues working unchanged
  • Clear migration path: Developers can upgrade interface when adopting type safety

This approach eliminates IL3051 warnings correctly while maintaining the intended architectural benefits of Issue #10456.

…arning suppressions around reflection operations in LINQ expression building
- IL2075 (GetMethod) and IL2060 (MakeGenericMethod) warnings are acceptable for dynamic property access
- Fixes GitHub Actions CI/CD pipeline failures where --warnaserror treats warnings as errors
- Targeted suppressions with explanatory comments for maintainability

- Add UnconditionalSuppressMessage attributes for IL2075/IL2060 warnings during AOT analysis
- Expand DynamicallyAccessedMembers to include PublicMethods for GetMethod reflection calls
- Maintains RequiresDynamicCode attribute to properly indicate AOT incompatibility
- Addresses AOT test failures where --warnaserror treats warnings as compilation errors

The reflection-based LINQ expression building is inherently incompatible with AOT compilation,
but these attributes allow the build system to handle this known limitation gracefully instead
of failing with cryptic errors. Regular and AOT compilation phases require different suppression
mechanisms - pragma directives for regular builds, UnconditionalSuppressMessage for AOT analysis.
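
As an illustration of the annotations named in this commit, here is a minimal sketch. ReflectionHelpers and both of its methods are hypothetical and not taken from the PR, and the exact warnings reported depend on the analyzer and SDK configuration. It shows DynamicallyAccessedMembers on a generic parameter so that GetMethod is statically analyzable, and UnconditionalSuppressMessage together with RequiresDynamicCode on a MakeGenericMethod call.

```csharp
#nullable enable
using System;
using System.Diagnostics.CodeAnalysis;
using System.Linq;
using System.Reflection;

public static class ReflectionHelpers
{
    // The annotation asks the trimmer to keep T's public methods, so the
    // GetMethod call below can be analyzed instead of producing a warning.
    public static MethodInfo? FindPublicMethod<
        [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicMethods)] T>(string name) =>
        typeof(T).GetMethod(name, Type.EmptyTypes);

    // The suppression stays in effect during trim/AOT analysis, unlike a pragma,
    // while RequiresDynamicCode still tells callers about the AOT limitation.
    [UnconditionalSuppressMessage("Trimming", "IL2060",
        Justification = "Sketch only: the generic argument is supplied by the caller.")]
    [RequiresDynamicCode("MakeGenericMethod may need code that was not generated ahead of time.")]
    public static MethodInfo CloseOver(MethodInfo openGeneric, Type argument) =>
        openGeneric.MakeGenericMethod(argument);
}

public static class Program
{
    public static void Main()
    {
        MethodInfo upper = ReflectionHelpers.FindPublicMethod<string>("ToUpperInvariant")!;
        Console.WriteLine(upper.Invoke("abc", null)); // ABC

        MethodInfo openEmpty = typeof(Enumerable).GetMethod(nameof(Enumerable.Empty))!;
        MethodInfo closed = ReflectionHelpers.CloseOver(openEmpty, typeof(int));
        Console.WriteLine(closed.ReturnType);
    }
}
```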
alzarei added a commit to alzarei/semantic-kernel that referenced this pull request Oct 22, 2025
…oreTextSearch refactoring

## Changes Made (All Reviewer Feedback Addressed)

### Exception Handling & Error Strategy (@roji, @westey-m)
- Eliminated all try-catch blocks that returned null (exception swallowing anti-pattern)
- Replaced with explicit ArgumentException and NotSupportedException with descriptive messages
- Improved developer experience with clear, actionable error messages
- All exceptions now bubble up properly for debugging visibility

### Code Quality & Duplication Elimination (@roji)
- Removed 5 duplicate methods: CreateSingleClauseExpression, CreateMultipleClauseExpression, CreateClauseBodyExpression
- Consolidated into unified CreateClauseExpression method using modern switch expressions
- Applied expression-bodied methods throughout for consistency with MEVD patterns
- Eliminated VectorSearchFilter.OldFilter fallback mechanism completely
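
A minimal sketch of the pattern described in the two sections above: one switch expression that maps each clause type to an expression node and throws for anything unsupported, instead of catching and returning null. The clause records, TaggedDoc, and ClauseExpressions are hypothetical stand-ins, not the PR's CreateClauseExpression or the real FilterClause types.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public sealed record TaggedDoc(string Department, IReadOnlyList<string> Tags);

// Hypothetical clause types standing in for the real filter clause hierarchy.
public abstract record SketchClause(string Field, string Value);
public sealed record EqualClause(string Field, string Value) : SketchClause(Field, Value);
public sealed record AnyTagClause(string Field, string Value) : SketchClause(Field, Value);
public sealed record RangeClause(string Field, string Value) : SketchClause(Field, Value);

public static class ClauseExpressions
{
    // One place that knows every supported clause; unknown clauses fail loudly.
    public static Expression Create(SketchClause clause, ParameterExpression record) => clause switch
    {
        // record.Field == value
        EqualClause eq => Expression.Equal(
            Expression.Property(record, eq.Field),
            Expression.Constant(eq.Value, typeof(string))),

        // Enumerable.Contains(record.Field, value)
        AnyTagClause tag => Expression.Call(
            typeof(Enumerable), nameof(Enumerable.Contains), new[] { typeof(string) },
            Expression.Property(record, tag.Field),
            Expression.Constant(tag.Value, typeof(string))),

        _ => throw new NotSupportedException(
            $"Filter clause type '{clause.GetType().Name}' is not supported."),
    };

    public static Func<TRecord, bool> Compile<TRecord>(SketchClause clause)
    {
        ParameterExpression record = Expression.Parameter(typeof(TRecord), "record");
        return Expression.Lambda<Func<TRecord, bool>>(Create(clause, record), record).Compile();
    }
}

public static class Program
{
    public static void Main()
    {
        var doc = new TaggedDoc("HR", new[] { "policy", "leave" });

        Console.WriteLine(ClauseExpressions.Compile<TaggedDoc>(new EqualClause("Department", "HR"))(doc)); // True
        Console.WriteLine(ClauseExpressions.Compile<TaggedDoc>(new AnyTagClause("Tags", "leave"))(doc));   // True

        try
        {
            ClauseExpressions.Compile<TaggedDoc>(new RangeClause("Department", "A"));
        }
        catch (NotSupportedException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

An unknown property name surfaces as an ArgumentException from Expression.Property, which is in line with the updated test expectations listed below.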

### AOT Compatibility & Annotations (@roji, @westey-m)
- Fixed IL2091: Updated TextSearchKernelBuilderExtensions DynamicallyAccessedMembers
- Applied RequiresDynamicCode surgically to specific methods, not entire SearchAsync API
- Proper UnconditionalSuppressMessage placement for targeted suppression
- Maintained AOT-friendly public API surface while isolating reflection usage

### Test Updates & Validation
- Updated VectorStoreTextSearchTests to expect ArgumentException instead of InvalidOperationException
- All 20 tests passing with enhanced error handling validation
- Comprehensive build validation with --warnaserror across all affected projects

## Files Modified
- VectorStoreTextSearch.cs: Core refactoring - removed technical debt, unified expression handling
- TextSearchKernelBuilderExtensions.cs: Fixed AOT annotation mismatch (IL2091)
- TextSearchServiceCollectionExtensions.cs: Related AOT annotation updates
- VectorStoreTextSearchTests.cs: Updated test expectations for improved error handling

## Validation Complete
[PASS] SemanticKernel.Abstractions, Core, UnitTests, AotTests, VectorData.InMemory all build clean with --warnaserror
[PASS] All VectorStoreTextSearch tests passing (20/20)
[PASS] Exception handling properly validates ArgumentException with descriptive messages
[PASS] AOT annotations properly scoped and IL warnings resolved

## Architectural Decision Pending
NOTE: This PR addresses all tactical feedback for the current TextSearchFilter -> LINQ translation approach.
However, @roji raised a fundamental strategic question about API design:

Current Approach (Option A, this PR): Keep the TextSearchFilter API and translate it internally to LINQ
Alternative Approach (Option B): Obsolete TextSearchFilter and expose LINQ expressions directly in the ITextSearch API

This implementation is complete for Option A (translation layer). If leadership chooses Option B
(native LINQ API), this work would serve as a reference implementation while the ITextSearch API
is redesigned to accept LINQ expressions directly.

Awaiting architectural direction from @markwallace-microsoft and @westey-m on API strategy before
determining whether this is the final implementation or a transitional solution. I will add my notes in the ADR comments.
@alzarei alzarei force-pushed the feature-text-search-linq-pr2 branch from eaa08ec to 837cf0e Compare October 22, 2025 05:13
alzarei added a commit to alzarei/semantic-kernel that referenced this pull request Oct 22, 2025
…translation layer improvements

Refactored TextSearchFilter to LINQ translation logic in VectorStoreTextSearch per reviewer feedback:

## Exception Handling Modernization
- Eliminated exception swallowing anti-pattern (try-catch returning null)
- Replaced with explicit ArgumentException/NotSupportedException with descriptive messages
- Improved debugging experience with proper error bubbling

## Code Quality & Architecture
- Consolidated 5 duplicate methods into unified CreateClauseExpression using switch expressions
- Applied modern C# patterns (expression bodies, pattern matching)
- Removed VectorSearchFilter.OldFilter legacy fallback mechanism
- Reduced codebase complexity (net -203 lines: +100 insertions, -303 deletions)

## AOT Compatibility
- Fixed IL2091 in TextSearchKernelBuilderExtensions DynamicallyAccessedMembers annotation
- Applied RequiresDynamicCode surgically to reflection-using methods only
- Maintained AOT-friendly public surface while isolating dynamic code requirements

## Validation Results
- All 20 VectorStoreTextSearch tests passing with updated exception expectations
- Build validation passed across all projects with --warnaserror flag
- ArgumentException now thrown instead of InvalidOperationException for better error clarity

## Files Modified
- VectorStoreTextSearch.cs: Core refactoring - removed technical debt, unified expression handling
- TextSearchKernelBuilderExtensions.cs: Fixed AOT annotation mismatch (IL2091)
- TextSearchServiceCollectionExtensions.cs: Related AOT annotation updates
- VectorStoreTextSearchTests.cs: Updated test expectations for improved error handling
alzarei commented Oct 22, 2025

See some comments below, but I think there's a more fundamental design decision to be made here...

The approach in this PR reimplements the existing ITextSearch API (TextSearchOptions.Filter, which relies on TextSearchFilter for expressing filters) via LINQ. The other possibility (which I originally had in mind) is to do the same as we did in Microsoft.Extensions.VectorData (MEVD) itself, i.e. obsolete the current filtering mechanism and introduce a new, LINQ-based one.

Exposing LINQ to the user via ITextSearch would allow them to express any filter, rather than the current highly-restricted mechanism based on FilterClause (this is the original reason we switched to LINQ in MEVD). If we go this way, then for VectorStoreTextSearch we'd simply flow the LINQ expression tree from the ITextSearch implementation to MEVD - no translation/work necessary. For other ITextSearch implementations, we'd need to implement very basic LINQ providers which translate the LINQ expression tree (just like we did in MEVD). This may represent a bit more work, but seems like the right way forward, assuming we want ITextSearch to have a robust filtering story.

(incidentally, this approach would also avoid any NativeAOT/trimming issues, as we wouldn't need to generate any LINQ expression nodes).

@markwallace-microsoft @westey-m what do you think?






@roji @westey-m @markwallace-microsoft

@roji, thank you for raising this architectural question and for adding detail on the original vision behind Issue #10456; it helps frame the strategic decision.

Context: Option 3 Implementation While Awaiting Decision

While awaiting guidance on the architectural direction, I implemented Option 3 to demonstrate how the dual interface approach works end-to-end. The current state of this PR (commit d1f2733) shows a complete working implementation that avoids breaking changes while the team decides on the long-term path.

The Breaking Change Dilemma

// The obsolete class at the center of this:
[Obsolete("Use VectorSearchOptions.Filter instead of VectorSearchOptions.OldFilter")]  
public sealed class VectorSearchFilter

// Our choices:
// 1. Native LINQ: Replace TextSearchFilter entirely → BREAKING CHANGE (API change)
// 2. Dual Pattern + Avoid Obsolete: Keep TextSearchFilter but convert → BREAKING CHANGE (RequiresDynamicCode)  
// 3. Dual Pattern + Use Obsolete: Keep TextSearchFilter, use obsolete API → NO BREAKING CHANGE

The Three Options with Tradeoffs

Option 1: Native LINQ Only:

  • Replace TextSearchFilter entirely with Expression<Func<T, bool>>
  • Remove non-generic ITextSearch interface
  • BREAKING CHANGE: Requires user migration
  • Benefits: Best long-term architecture, unlimited expressiveness, type safety, eliminates obsolete API

Option 2: Dual Interface + Translation Layer (this PR minus its last commit, d1f2733):

  • Keep both ITextSearch and ITextSearch<TRecord>
  • Convert TextSearchFilter to LINQ internally (approximately 150 lines)
  • BREAKING CHANGE: RequiresDynamicCode cascades to all TextSearch APIs, extension methods, plugins
  • Benefits: Avoids obsolete API usage

Option 3: Dual Interface + Use Obsolete (This PR):

  • Keep both ITextSearch and ITextSearch<TRecord>
  • Use obsolete VectorSearchFilter directly with pragma warnings
  • NO BREAKING CHANGE: Existing code unaffected
  • Trade-off: Maintains obsolete API dependency

AOT Connection & Ecosystem Impact

The RequiresDynamicCode issue adds another dimension to this trade-off. Option 2's translation layer forces dynamic code generation for ALL filtering operations AND propagates through the entire API surface:

// The cascading effect:
ITextSearch.SearchAsync()                    // [RequiresDynamicCode] 
→ textSearch.CreateSearch()                  // Must add [RequiresDynamicCode]
→ textSearch.CreatePlugin()                  // Must add [RequiresDynamicCode] 
→ kernel.Plugins.AddFromTextSearch()         // Must add [RequiresDynamicCode]
→ ANY consumer code using TextSearch plugins // Gets AOT warnings

This makes the entire TextSearch plugin ecosystem AOT-incompatible, while Option 1 (native LINQ) would keep simple operations AOT-compatible.

Option 3 Implementation Details (This PR)

To demonstrate how Option 3 works end-to-end, this PR implements:

// Legacy interface (unchanged - backward compatible)
public interface ITextSearch
{
    Task<KernelSearchResults<string>> SearchAsync(string query, TextSearchOptions? options = null);
    // Uses TextSearchFilter (clause-based) → VectorSearchOptions.OldFilter (obsolete)
}

// New generic interface (modern LINQ filtering)
public interface ITextSearch<TRecord>
{
    Task<KernelSearchResults<string>> SearchAsync(string query, TextSearchOptions<TRecord>? options = null);
    // Uses Expression<Func<TRecord, bool>> → VectorSearchOptions.Filter (not obsolete)
}

// Both implemented by VectorStoreTextSearch<TRecord> with separate code paths
public class VectorStoreTextSearch<TRecord> : ITextSearch, ITextSearch<TRecord>
{
    // Legacy path: Direct use of obsolete API with pragma suppression
    async IAsyncEnumerable<VectorSearchResult<TRecord>> ExecuteVectorSearchAsync(
        string query, TextSearchOptions options)
    {
        var vectorOptions = new VectorSearchOptions<TRecord>
        {
            #pragma warning disable CS0618
            OldFilter = options.Filter?.FilterClauses != null 
                ? new VectorSearchFilter(options.Filter.FilterClauses) 
                : null
            #pragma warning restore CS0618
        };
    }

    // Modern path: Direct passthrough of LINQ expression
    async IAsyncEnumerable<VectorSearchResult<TRecord>> ExecuteVectorSearchAsync(
        string query, TextSearchOptions<TRecord> options)
    {
        var vectorOptions = new VectorSearchOptions<TRecord>
        {
            Filter = options.Filter  // No conversion, no RequiresDynamicCode
        };
    }
}

Key aspects:

  • Two independent code paths - no translation layer, no conversion overhead
  • Both paths are AOT-compatible - no RequiresDynamicCode anywhere
  • Existing code continues working unchanged

Analysis

All paths except Option 3 involve breaking changes - the question is which provides the most value:

Option 1 advantages:

  • Better AOT story (simple operations remain AOT-compatible)
  • Aligns with Microsoft.Extensions.VectorData patterns
  • Eliminates technical debt permanently

Option 3 advantages:

  • Unblocks PR3-PR6 (Bing, Google, Azure AI Search connectors)
  • Enables gradual LINQ adoption without forced migration
  • Can migrate to Option 1 later with proper planning

Decision Needed

Which approach should we take:

  1. Native LINQ only (remove legacy interface)
  2. Dual interface + translation layer (avoid obsolete API)
  3. Dual interface + legacy path (use obsolete API) - This PR

Questions:

…translation layer improvements

Refactored TextSearchFilter to LINQ translation logic in VectorStoreTextSearch per reviewer feedback:

- Eliminated exception swallowing anti-pattern (try-catch returning null)
- Replaced with explicit ArgumentException/NotSupportedException with descriptive messages
- Improved debugging experience with proper error bubbling

- Consolidated 5 duplicate methods into unified CreateClauseExpression using switch expressions
- Applied modern C# patterns (expression bodies, pattern matching)
- Removed VectorSearchFilter.OldFilter legacy fallback mechanism
- Reduced codebase complexity (net -203 lines: +100 insertions, -303 deletions)

- Fixed IL2091 in TextSearchKernelBuilderExtensions DynamicallyAccessedMembers annotation
- Applied RequiresDynamicCode surgically to reflection-using methods only
- Maintained AOT-friendly public surface while isolating dynamic code requirements

- All 20 VectorStoreTextSearch tests passing with updated exception expectations
- Build validation passed across all projects with --warnaserror flag
- ArgumentException now thrown instead of InvalidOperationException for better error clarity

- VectorStoreTextSearch.cs: Core refactoring - removed technical debt, unified expression handling
- TextSearchKernelBuilderExtensions.cs: Fixed AOT annotation mismatch (IL2091)
- TextSearchServiceCollectionExtensions.cs: Related AOT annotation updates
- VectorStoreTextSearchTests.cs: Updated test expectations for improved error handling
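
For readers following the refactoring described above, here is a minimal sketch of what a unified clause-to-expression method with explicit exceptions could look like. It is not the code from this PR: the `Clause`, `EqualityClause`, `RangeClause`, `Document`, and `ClauseTranslator` names are hypothetical stand-ins, and only the method name `CreateClauseExpression` is taken from the commit message.

```csharp
#nullable enable
using System;
using System.Linq.Expressions;

// Hypothetical clause types for illustration; not the Semantic Kernel filter clause hierarchy.
public abstract record Clause;
public sealed record EqualityClause(string FieldName, object? Value) : Clause;
public sealed record RangeClause(string FieldName, object Min, object Max) : Clause;

// Hypothetical record type used in the usage note below.
public sealed record Document(string Department, bool IsActive);

public static class ClauseTranslator
{
    // One entry point with a switch expression instead of one method per clause kind.
    public static Expression<Func<TRecord, bool>> CreateClauseExpression<TRecord>(Clause clause)
    {
        ArgumentNullException.ThrowIfNull(clause);
        ParameterExpression recordParam = Expression.Parameter(typeof(TRecord), "r");

        Expression body = clause switch
        {
            EqualityClause equality => BuildEquality(recordParam, equality),
            // Unsupported clauses fail loudly instead of being swallowed and returned as null.
            _ => throw new NotSupportedException(
                $"Filter clause '{clause.GetType().Name}' is not supported."),
        };

        return Expression.Lambda<Func<TRecord, bool>>(body, recordParam);
    }

    private static Expression BuildEquality(ParameterExpression recordParam, EqualityClause clause)
    {
        // Reflection lookup: this is the kind of call that makes a translation layer hard to keep AOT-friendly.
        var property = recordParam.Type.GetProperty(clause.FieldName)
            ?? throw new ArgumentException(
                $"Property '{clause.FieldName}' does not exist on '{recordParam.Type.Name}'.",
                nameof(clause));

        // Expression.Constant throws ArgumentException when the value does not fit the property type.
        return Expression.Equal(
            Expression.Property(recordParam, property),
            Expression.Constant(clause.Value, property.PropertyType));
    }
}
```

Under these assumptions, `ClauseTranslator.CreateClauseExpression<Document>(new EqualityClause("Department", "HR"))` yields the equivalent of `r => r.Department == "HR"`, while a `RangeClause` or an unknown property name surfaces as an exception with a descriptive message rather than a silent null.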

SURGICAL RequiresDynamicCode Implementation:
- Removed RequiresDynamicCode from ITextSearch<TRecord> interface methods
- Removed RequiresDynamicCode from AOT-compatible implementation methods
- Kept RequiresDynamicCode only on methods using reflection/dynamic compilation
- Simple equality filtering now AOT-compatible per @roji feedback
- Contains operations properly marked as requiring dynamic code
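
As a rough illustration of the "surgical" placement described in this commit, the sketch below annotates only the method that closes a generic method at run time and leaves the simple equality filter unannotated. `TaggedDocument` and `TaggedDocumentFilters` are hypothetical names, trimming annotations are omitted for brevity, and the actual methods annotated in the PR may differ.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical record type for illustration.
public sealed class TaggedDocument
{
    public string Department { get; init; } = "";
    public List<string> Tags { get; init; } = new();
}

public static class TaggedDocumentFilters
{
    // No attribute: the expression tree comes from a lambda, so no generic
    // method has to be constructed from a run-time type here.
    public static Expression<Func<TaggedDocument, bool>> DepartmentEquals(string department)
        => doc => doc.Department == department;

    // Annotated: MethodInfo.MakeGenericMethod needs a type known only at run time,
    // so callers in AOT builds are warned about this method and nothing else.
    [RequiresDynamicCode("Closes Enumerable.Contains<T> over a type known only at run time.")]
    public static Expression<Func<TaggedDocument, bool>> TagsContain(object value)
    {
        ArgumentNullException.ThrowIfNull(value);

        var doc = Expression.Parameter(typeof(TaggedDocument), "doc");
        var tags = Expression.Property(doc, nameof(TaggedDocument.Tags));

        // Pick Enumerable.Contains<TSource>(IEnumerable<TSource>, TSource) and close it over the value's type.
        var contains = typeof(Enumerable).GetMethods()
            .Single(m => m.Name == nameof(Enumerable.Contains) && m.GetParameters().Length == 2)
            .MakeGenericMethod(value.GetType());

        var call = Expression.Call(contains, tags, Expression.Constant(value));
        return Expression.Lambda<Func<TaggedDocument, bool>>(call, doc);
    }
}
```

The design point is that the attribute follows the reflection call rather than the interface: callers that only use `DepartmentEquals` get no AOT warning, and only callers of `TagsContain` have to acknowledge the dynamic-code requirement.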

Complete Reviewer Feedback Resolution (7/7):
- Strategic architectural response to @roji concerns
- Exception swallowing elimination with proper error messages
- OldFilter fallback removal - no more legacy VectorSearchFilter usage
- Code duplication consolidation - unified CreateClauseExpression method
- Modern C# syntax - expression-bodied methods and switch expressions
- Surgical RequiresDynamicCode placement per @roji AOT requirements
- Systematic tracking process with comprehensive feedback document

Technical Changes:
- Interface methods (TextSearchOptions<TRecord>) are AOT-compatible
- Legacy methods (TextSearchOptions) properly marked for dynamic code
- Reflection-based Contains operations retain RequiresDynamicCode
- Added AOT compatibility documentation in interface
- All 90 TextSearch unit tests pass
@alzarei alzarei force-pushed the feature-text-search-linq-pr2 branch from 837cf0e to a802eaa Compare October 22, 2025 12:24
@alzarei alzarei requested review from roji and westey-m October 22, 2025 12:25
@alzarei
Author

alzarei commented Oct 23, 2025

@roji @markwallace-microsoft @westey-m while waiting for the architecture decision input from you, I am entertaining and validating option 3.

Add ITextSearch<TRecord> generic interface with LINQ filtering while maintaining
existing ITextSearch non-generic interface for backward compatibility.

## Background: Architectural Decision (Issue microsoft#10456)

Three options considered:

Option 1 (Native LINQ): Replace TextSearchFilter with Expression<Func<T, bool>>
- Breaking change: requires user migration
- Best long-term architecture

Option 2 (Translation Layer): Convert TextSearchFilter to LINQ internally
- Breaking change: RequiresDynamicCode propagates through API surface
- Reflection overhead, AOT incompatible

Option 3 (Dual Interface): Add ITextSearch<TRecord> alongside ITextSearch
- No breaking changes
- Maintains AOT compatibility
- Uses obsolete VectorSearchFilter in legacy path (temporary during transition)

## Implementation

### Generic Interface
- ITextSearch<TRecord> with 3 methods accepting TextSearchOptions<TRecord>
- TextSearchOptions<TRecord> with Expression<Func<TRecord, bool>>? Filter
- Explicit interface implementation in VectorStoreTextSearch<TRecord>

### Dual-Path Architecture
Two independent code paths, no translation layer:

Legacy path (non-generic):
- ITextSearch with TextSearchOptions and TextSearchFilter (clause-based)
- Uses VectorSearchOptions.OldFilter (obsolete) with pragma warning suppression
- No dynamic code, AOT compatible
- 10 existing tests unchanged

Modern path (generic):
- ITextSearch<TRecord> with TextSearchOptions<TRecord> and Expression filter
- Uses VectorSearchOptions.Filter (LINQ native, not obsolete)
- No dynamic code, AOT compatible
- 7 new tests

## Changes

- Added ITextSearch<TRecord> interface and TextSearchOptions<TRecord> class
- Implemented dual interface in VectorStoreTextSearch<TRecord>
- Deleted ~150 lines of Option 2 translation layer code
- Removed all RequiresDynamicCode attributes
- Removed DynamicallyAccessedMemberTypes.PublicMethods from 5 locations:
  - VectorStoreTextSearch.cs
  - TextSearchServiceCollectionExtensions.cs (3 methods)
  - TextSearchKernelBuilderExtensions.cs (1 method)
- Deleted 7 Option 2 translation tests
- Added 7 LINQ filtering tests
- Added DataModelWithTags to test base
- Reverted Program.cs to original state

## Files Changed

8 files, +144 insertions, -395 deletions

- ITextSearch.cs
- TextSearchOptions.cs (added generic class)
- VectorStoreTextSearch.cs (removed translation layer, added dual interface)
- TextSearchServiceCollectionExtensions.cs (removed PublicMethods annotation)
- TextSearchKernelBuilderExtensions.cs (removed PublicMethods annotation)
- VectorStoreTextSearchTestBase.cs (added DataModelWithTags)
- VectorStoreTextSearchTests.cs (removed 7 tests, added 7 tests)
- Program.cs in AotTests (removed suppression, restored tests)
@markwallace-microsoft
Member

@roji @markwallace-microsoft @westey-m while waiting for the architecture decision input from you, I am entertaining and validating option 3.

We typically follow the pattern of obsoleting the old API when we introduce the new pattern. This avoids breaking changes which are very disruptive for projects that have a transitive dependency.

@westey-m
Contributor

@alzarei, I prefer a clean separation between the old and new abstractions. Being able to obsolete the old ones and point users at the new ones is definitely valuable. Then we can remove the obsoleted code after a period of time.

- Add [Obsolete] attribute to ITextSearch interface
- Add pragma suppressions in production classes for backward compatibility
- Add pragma suppressions in test/sample files
- Follows Microsoft pattern: introduce new API, deprecate old
- ITextSearch<TRecord> is the replacement with LINQ filtering

Build: 0 errors, 0 warnings
Tests: 1,581 passed
@alzarei alzarei force-pushed the feature-text-search-linq-pr2 branch from 88244e4 to e333850 Compare October 25, 2025 06:40
@alzarei alzarei changed the title .Net: feat: Eliminate obsolete VectorSearchFilter technical debt in VectorStoreTextSearch (microsoft#10456) .Net: feat: Add ITextSearch<TRecord> with LINQ filtering and deprecate legacy ITextSearch (microsoft#10456) Oct 25, 2025
@alzarei alzarei changed the title .Net: feat: Add ITextSearch<TRecord> with LINQ filtering and deprecate legacy ITextSearch (microsoft#10456) .NET: feat: Add ITextSearch<TRecord> with LINQ filtering and deprecate legacy ITextSearch (microsoft#10456) Oct 25, 2025
@alzarei
Author

alzarei commented Oct 25, 2025

@roji @westey-m @markwallace-microsoft

Thank you all for the excellent feedback! After your review and architectural guidance, we pivoted from Option 2 (translation layer, commits c2c783b - a802eaa) to Option 3 (dual interface pattern) in commit d1f2733.

This removed the ~150 lines of translation layer code that had the issues you identified. Current implementation:

  • Legacy path: Direct OldFilter usage with pragma (lines 284-289)
  • Modern path: Direct LINQ passthrough (line 305)
  • No dynamic code generation, no exception handling, no RequiresDynamicCode

The architectural discussion led to the current solution - Option 3 with [Obsolete] marking (commit e333850). Thank you for the input and guidance!

@alzarei alzarei changed the title .NET: feat: Add ITextSearch<TRecord> with LINQ filtering and deprecate legacy ITextSearch (microsoft#10456) .NET: Add ITextSearch<TRecord> with LINQ filtering and deprecate legacy ITextSearch (#10456) Oct 25, 2025
Member

@roji roji left a comment

@alzarei can you please push your changes into a single PR? I'm getting a bit confused as to which branch corresponds to which PRs (there are quite a few), and this PR represents incremental work on top of something else. This is all making reviewing quite complex.

Just to raise the obvious: this PR adds the new LINQ-based filter support to VectorStoreTextSearch, but all the other implementations are left implementing the now-obsolete non-generic ITextSearch; it's problematic for these implementations to not have a non-obsolete way of being used. I can see that ultimately a feature-text-search-linq is being targeted - it's indeed a good idea to do this work in a feature branch so that everything can be done across all providers and samples before releasing anything.

@alzarei
Author

alzarei commented Oct 26, 2025

@roji @westey-m @markwallace-microsoft

All review feedback has been addressed. PR is ready for final review.

Validation Results

All validation steps completed successfully:

Format Check - PASSED

dotnet format SK-dotnet.slnx --verify-no-changes
  • Status: PASSED (no formatting changes needed)

Build with Warnings as Errors - PASSED

dotnet build SK-dotnet.slnx --configuration Release --warnaserror
  • Errors: 0
  • Warnings: 0
  • Matches GitHub Actions CI/CD requirements

Core Unit Tests - PASSED

dotnet test SemanticKernel.UnitTests.csproj --configuration Release --no-build
  • Failed: 0
  • Passed: 1,581
  • Skipped: 0
  • Total: 1,581
  • Duration: 2 seconds

AOT Compatibility - PASSED

dotnet publish --configuration Release (SemanticKernel.AotTests)
  • Errors: 0
  • Warnings: 0
  • Successfully published to win-x64

- Add using System.Threading and System.Threading.Tasks directives
- Replace fully-qualified type names with short names (Task, CancellationToken)
- Remove repetitive documentation line about LINQ filtering

Addresses inline code feedback from @roji in PR microsoft#13179
@alzarei
Author

alzarei commented Oct 28, 2025

@alzarei can you please push your changes into a single PR? ...

@roji Thanks for the feedback! Let me clarify the multi-PR strategy:

This is Part 2 of 6 in Issue #10456

PR Chain (all target feature-text-search-linq):

PR          Status                Scope
PR #13175   Merged (Oct 7)        Add ITextSearch<TRecord> interface
PR #13179   This PR               VectorStoreTextSearch
PR #13188   Addressing feedback   Migrate BingTextSearch
PR #13190   Validating changes    Migrate GoogleTextSearch
PR #13191   Validating changes    Migrate Tavily & Brave
PR #13194   Validating changes    Update samples & docs

Your Concern About Obsolete Implementations

"all the other implementations are left implementing the now-obsolete non-generic ITextSearch"

You're absolutely right to raise this! PR3-PR6 address exactly this issue.

After all 6 PRs merge, every implementation will have both interfaces:

  • VectorStoreTextSearch: ITextSearch<TRecord> (modern) + ITextSearch (legacy)
  • BingTextSearch: ITextSearch<BingWebPage> (modern) + ITextSearch (legacy)
  • GoogleTextSearch: ITextSearch<GoogleWebPage> (modern) + ITextSearch (legacy)
  • TavilyTextSearch: ITextSearch<TavilyWebPage> (modern) + ITextSearch (legacy)
  • BraveTextSearch: ITextSearch<BraveWebPage> (modern) + ITextSearch (legacy)

Pattern (Option 3 - applied consistently in all PRs):

#pragma warning disable CS0618 // ITextSearch is obsolete - backward compatibility
public sealed class BingTextSearch : ITextSearch, ITextSearch<BingWebPage>
#pragma warning restore CS0618
{
    // Modern LINQ path
    Task<KernelSearchResults<string>> ITextSearch<BingWebPage>.SearchAsync(...) { }
    
    // Legacy path (backward compatible)
    Task<KernelSearchResults<string>> ITextSearch.SearchAsync(...) { }
}
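
To make the explicit-implementation pattern above concrete, here is a small self-contained sketch. `ILegacySearch`, `ISearch<TRecord>`, `Page`, and `InMemoryPageSearch` are hypothetical stand-ins for `ITextSearch`, `ITextSearch<TRecord>`, and a connector; the method shapes are simplified and synchronous, and the warning suppression is widened to the whole class for simplicity.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for the obsolete non-generic interface.
[Obsolete("Use ISearch<TRecord> with a LINQ filter instead.")]
public interface ILegacySearch
{
    IReadOnlyList<string> Search(string query, string? field = null, string? value = null);
}

// Hypothetical stand-in for the generic, LINQ-filtered interface.
public interface ISearch<TRecord>
{
    IReadOnlyList<string> Search(string query, Expression<Func<TRecord, bool>>? filter = null);
}

public sealed record Page(string Title, string Language);

#pragma warning disable CS0618 // The legacy interface is implemented on purpose for backward compatibility.
public sealed class InMemoryPageSearch : ILegacySearch, ISearch<Page>
{
    private readonly IReadOnlyList<Page> _pages;

    public InMemoryPageSearch(IEnumerable<Page> pages) => _pages = pages.ToList();

    // Modern path: the typed filter is applied directly, with no clause translation.
    IReadOnlyList<string> ISearch<Page>.Search(string query, Expression<Func<Page, bool>>? filter)
    {
        IEnumerable<Page> hits = Match(query);
        if (filter is not null)
        {
            // Compile() is only for this in-memory demo; a real connector would
            // translate the expression into its backend's query syntax instead.
            hits = hits.Where(filter.Compile());
        }
        return hits.Select(p => p.Title).ToList();
    }

    // Legacy path: name/value equality, kept independent of the modern path.
    IReadOnlyList<string> ILegacySearch.Search(string query, string? field, string? value)
    {
        IEnumerable<Page> hits = Match(query);
        if (field is not null)
        {
            hits = field switch
            {
                nameof(Page.Language) => hits.Where(p => p.Language == value),
                _ => throw new NotSupportedException($"Filtering on '{field}' is not supported."),
            };
        }
        return hits.Select(p => p.Title).ToList();
    }

    private IEnumerable<Page> Match(string query)
        => _pages.Where(p => p.Title.Contains(query, StringComparison.OrdinalIgnoreCase));
}
#pragma warning restore CS0618
```

Because both methods are explicit implementations, the caller picks a path through the interface it holds: `ISearch<Page> modern = search; modern.Search("linq", p => p.Language == "en");` uses the LINQ path, while code that still references the legacy interface keeps compiling with an obsolete warning.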

Why Separate PRs?

  1. Each connector has unique patterns: Bing REST URLs, Google JSON API, Tavily/Brave web search - distinct conversion logic
  2. Easier review: 5 focused PRs (~15-30 files) vs 1 massive PR (~93 files)
  3. Parallel development: Independent CI/CD validation per connector
  4. Clear separation: Architecture (PR2) vs implementations (PR3-6)

Review Approach

Suggested: Review PR2 for the deprecation pattern architecture, then PR3-6 will apply it consistently across connectors.

Alternative: I can consolidate if you prefer to see everything in one PR.

Does this clarify the strategy? Happy to adjust if you have an approach in mind that makes it easier to review!

Thanks!

@alzarei alzarei requested a review from roji October 28, 2025 08:11
@alzarei
Author

alzarei commented Oct 28, 2025

@westey-m @roji @markwallace-microsoft

The integration deployment went through successfully 🎉 and all feedback has been addressed. Feel free to take a look, and if everything looks good, let's move forward. I'm wrapping up the next PRs in this batch to stay aligned with the ADRs we made here. Thank you!

Labels: kernel.core, kernel (issues or pull requests impacting the core kernel), .NET (issues or pull requests regarding .NET code)


5 participants