
[Draft] Use hybrid C# reference map for post-processing #10976

Draft
live1206 wants to merge 44 commits into microsoft:main from live1206:mtg-hybrid-reference-map

live1206 commented Jun 12, 2026

Summary

Adds a hybrid reference-map replacement for C# generated-code post-processing.

The hybrid path replaces broad Roslyn reference-map construction with:

  • provider metadata for generated-code references
  • explicit provider dependencies for known generated body-only references
  • a small Roslyn scan only for custom/shared code roots
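The three sources above can be pictured as edge sets that are merged into one reference map and then walked from the roots. The following is a minimal sketch in Python, not the generator's C# implementation; the function names, the `<custom>` root marker, and the sample type names are all hypothetical and chosen only for illustration.

```python
from collections import deque

def build_hybrid_reference_map(provider_refs, explicit_body_deps, custom_code_refs):
    """Merge three edge sources into one map: type name -> set of referenced type names."""
    merged = {}
    for source in (provider_refs, explicit_body_deps, custom_code_refs):
        for owner, targets in source.items():
            merged.setdefault(owner, set()).update(targets)
    return merged

def reachable(reference_map, roots):
    """Breadth-first walk over the merged map, starting from the given roots."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for target in reference_map.get(current, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Hypothetical inputs, one dictionary per edge source.
provider_refs = {"PetClient": {"Pet"}, "Pet": {"PetKind"}, "UnusedModel": {"PetKind"}}
explicit_body_deps = {"PetClient": {"PetCollectionResult"}, "PetCollectionResult": {"PetPage"}}
custom_code_refs = {"<custom>": {"PetClient"}}  # stands in for the small Roslyn scan

reference_map = build_hybrid_reference_map(provider_refs, explicit_body_deps, custom_code_refs)
kept = reachable(reference_map, ["<custom>"])
print(sorted(kept))
```

In this toy input, `UnusedModel` has outgoing references but nothing reaches it from the root, so it would be a removal candidate.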

No benchmark measurement/profiling code is included in the production generator path in this PR.

Why

The earlier experimental PR measured full generation and identified Roslyn reference-map construction inside GeneratedCodeWorkspace.PostProcessAsync() as the largest hotspot.

This PR keeps generated output parity with the Roslyn cleanup path while moving generated-code reachability to provider metadata.

Latest Benchmark Data

The numbers below come from benchmark PR #10885 after porting the current hybrid analyzer changes from #10976, including generated body invocation edges and base-preserved reachability.

Benchmark output root: /tmp/typespec-final-hybrid-bench-20260624-0943.

Full-generation BenchmarkDotNet, averaged across 3 local runs:

| Mode | Avg Mean | Avg Allocated |
| --- | --- | --- |
| Roslyn reference maps | 895.8 ms | 67.78 MB |
| Provider reference map | 859.9 ms | 57.75 MB |

Approximate full-generation improvement:

Time:       ~4.0% faster
Allocation: ~14.8% less

Focused profile data, using 72 Roslyn-mode and 72 provider-mode invocations from the same runs (median):

| Path | Median Time | Median Allocated |
| --- | --- | --- |
| Roslyn reference-map construction | 357.2 ms | 23.70 MB |
| Provider map analysis + candidate consumption | 263.2 ms | 17.83 MB |
| Provider candidate consumption only | 0.79 ms | 0.06 MB |

Approximate focused reference-map improvement:

Time:       ~26.3% faster
Allocation: ~24.8% less
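For readers who want to check the percentages, they follow from the reported figures as (baseline - candidate) / baseline. A small sketch of that arithmetic, using only the numbers from the two tables; the helper name `relative_reduction` is an illustrative choice, not something from the benchmark code:

```python
def relative_reduction(baseline, candidate):
    """Percentage by which `candidate` is smaller than `baseline`."""
    return (baseline - candidate) / baseline * 100.0

# Full generation (averages across 3 runs).
full_time = relative_reduction(895.8, 859.9)    # ms
full_alloc = relative_reduction(67.78, 57.75)   # MB

# Focused reference-map path (medians).
focused_time = relative_reduction(357.2, 263.2)   # ms
focused_alloc = relative_reduction(23.70, 17.83)  # MB

print(f"full generation: {full_time:.1f}% time, {full_alloc:.1f}% allocation")
print(f"focused path:    {focused_time:.1f}% time, {focused_alloc:.1f}% allocation")
```

This prints 4.0% / 14.8% for full generation and 26.3% / 24.8% for the focused path, matching the summaries above.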

Profile notes:

  • Data comes from POSTPROCESSING_BENCHMARK_PROFILE_STEPS=true on the full-generation benchmark so both paths run against real TypeProvider output.
  • Roslyn rows are PostProcessor.Internalize.BuildPublicReferenceMapAsync and PostProcessor.Remove.BuildAllReferenceMapAsync.
  • Provider rows are Generation.ProviderReferenceMapShadowAnalysis, PostProcessor.Internalize.UseShadowCandidates, PostProcessor.Internalize.UseShadowPublicizeCandidates, PostProcessor.Remove.UseShadowCandidates, and PostProcessor.Remove.BuildShadowReferencedSet.
  • Runtime: .NET 10.0.9, Ubuntu 26.04, AMD EPYC 7763.

Correctness Notes

The hybrid implementation preserves Roslyn cleanup behavior for generated output parity:

  • model factory signatures and bodies do not keep otherwise-unused models alive
  • MRW context/buildable attributes do not keep buildable-only models alive
  • serialization providers are removable together with their owning model
  • retained serialization providers report explicit helper dependencies such as ChangeTrackingDictionary and Optional
  • collection-result providers report explicit body dependencies instead of relying on Roslyn body scanning
  • client providers report explicit body dependencies for collection results, service method types, operation parameters, and operation response body/header types
  • rest-client providers report explicit helper dependencies for generated collection parameter null checks, including ChangeTrackingList and ChangeTrackingDictionary
  • generated body-only references are still handled for static helpers
  • public discriminator subtypes stay public by matching Roslyn public-reference-map derived-class behavior
  • current union/variant roots participate in internalization, matching Roslyn _typesToKeep behavior
  • remove reachability includes current discriminator derived-model edges after root discovery
  • previous generated files under src/Generated are not scanned; reachability is based on current providers/generated workspace plus custom/API roots, matching Roslyn's workspace inputs
  • internal/non-public discriminator constructor/property references do not make discriminator enum types public
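Two of the rules above, that the model factory does not keep models alive and that discriminator derived models are followed once a base is reached, can be sketched as a small reachability pass. This is a hedged Python illustration only: `compute_kept_models`, its parameters, and the sample names are hypothetical, and the actual C# analyzer may structure this differently.

```python
def compute_kept_models(roots, references, derived_by_base, non_rooting_providers):
    """Return the set of types reachable from the roots.

    Providers listed in `non_rooting_providers` (for example a model factory)
    are skipped as roots, so their references alone keep nothing alive.
    Derived models of a reached discriminated base are treated as reachable too.
    """
    work = [root for root in roots if root not in non_rooting_providers]
    kept = set()
    while work:
        current = work.pop()
        if current in kept:
            continue
        kept.add(current)
        work.extend(references.get(current, ()))       # ordinary reference edges
        work.extend(derived_by_base.get(current, ()))  # discriminator derived-model edges
    return kept

# Hypothetical sample data.
roots = ["AnimalClient", "ModelFactory"]
references = {
    "AnimalClient": ["Animal"],
    "ModelFactory": ["Animal", "OrphanModel"],
}
derived_by_base = {"Animal": ["Dog", "Cat"]}

kept_models = compute_kept_models(roots, references, derived_by_base, {"ModelFactory"})
print(sorted(kept_models))
```

Here `OrphanModel` is referenced only by the factory, so it is not kept, while `Dog` and `Cat` are kept because their base `Animal` is reachable from the client.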

Azure SDK DPG Regen Note

Azure/azure-sdk-for-net#60128 completed a full DPG regen with no sdk/**/api/** changes.

The regen produces a few additional internal model and serialization files. This is expected: the hybrid provider map keeps conservative client body dependencies derived from service metadata, so that it can avoid broad Roslyn body-reference scans while staying correct for body-only generated references. These files are internal implementation details and do not affect the public API.

Validation

Local validation performed while stabilizing the PR included:

  • dotnet build packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/src/Microsoft.TypeSpec.Generator.csproj -c Release
  • focused generator tests covering post-processing/workspace/customization scenarios
  • full generator test assembly: 1530/1530 passed
  • regenerated and built previously failing generated projects/scenarios, including:
    • Sample-TypeSpec
    • Spector/http/authentication/api-key
    • Spector/http/parameters/collection-format
    • Spector/http/documentation
    • Spector/http/special-headers/repeatability
    • discriminator inheritance scenarios used by Spector tests

Implemented Generated Dependency Handling

This PR now avoids Roslyn body scanning for several generated cases:

  • collection-result body dependencies are reported by CollectionResultDefinition
  • client body dependencies are reported by ClientProvider
  • rest-client helper dependencies are reported by RestClientProvider
  • serialization provider helper dependencies are reported by serialization providers
  • model factory is treated specially so unreachable model factory methods do not root models
  • non-root MRW context/buildable attributes are excluded from model reachability

Custom/shared code references still use Roslyn because arbitrary user C# can reference generated types in ways providers cannot reliably describe.

Follow-Up Performance Opportunities

Potential next improvements, ordered by reward/risk:

| Rank | Improvement | Reward | Risk | Notes |
| --- | --- | --- | --- | --- |
| 1 | Precompute name lookup maps for AddMatchingName | Medium | Low | Avoid repeated full-node scans for helper/root matching. |
| 2 | Cache flattened provider lists and provider names | Medium | Low | Avoid repeated lazy provider/name materialization during analysis. |
| 3 | Conservative custom-code syntax prefilter | Medium | Medium/High | Can reduce custom Roslyn semantic work but must not miss arbitrary custom references. |
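The first item amounts to replacing a scan over all nodes per lookup with a dictionary built once. A minimal Python sketch of that idea follows; the real `AddMatchingName` signature is not shown in this PR description, so `build_name_index`, `add_matching_name`, and the node shape used here are assumptions made purely for illustration.

```python
from collections import defaultdict

def build_name_index(nodes):
    """Group nodes by name once, so later lookups avoid scanning every node."""
    index = defaultdict(list)
    for node in nodes:
        index[node["name"]].append(node)
    return index

def add_matching_name(index, name, matches):
    """Append every node with the given name; a missing name adds nothing."""
    matches.extend(index.get(name, ()))

# Hypothetical nodes; a name may map to more than one node.
nodes = [
    {"name": "Optional", "kind": "helper"},
    {"name": "Pet", "kind": "model"},
    {"name": "Optional", "kind": "helper-partial"},
]

name_index = build_name_index(nodes)
matches = []
add_matching_name(name_index, "Optional", matches)
add_matching_name(name_index, "Missing", matches)
print([m["kind"] for m in matches])
```

Building the index costs one pass over the nodes, after which each lookup is a dictionary access rather than a full scan.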

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@microsoft-github-policy-service microsoft-github-policy-service Bot added the emitter:client:csharp Issue for the C# client emitter: @typespec/http-client-csharp label Jun 12, 2026
pkg-pr-new Bot commented Jun 12, 2026

npm i https://pkg.pr.new/@typespec/http-client-csharp@10976

commit: b8a5c3a

@github-actions

No changes needing a change description found.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
azure-sdk-automation Bot commented Jun 19, 2026

You can try these changes here

🛝 Playground 🌐 Website 🛝 VSCode Extension

live1206 and others added 15 commits June 19, 2026 09:38
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>