
FEAT: Adding garak.doctor scenario and PolicyPuppetryConverter #2064

Merged
rlundeen2 merged 4 commits into microsoft:main from rlundeen2:rlundeen2-garak-doctor-scenario
Jul 2, 2026

@rlundeen2

Contributor

Ports garak's doctor probe to PyRIT as a Doctor scenario built from reusable PyRIT concepts. Objectives live in a garak_doctor seed dataset and are wrapped in a new PolicyPuppetryConverter (HiddenLayer's Policy Puppetry universal bypass), then scored for non-refusal. The scenario exposes attacks as attack-technique factories and does not override get_atomic_attacks.

The converter loads its templates from datasets/prompt_converters/policy_puppetry_converter.yaml (dr_house / medical_advisor); callers select via the PolicyPuppetryTemplate enum while __init__ takes a SeedPrompt (default: a random template). Adds unit tests for the converter and scenario, plus a bibliography entry for the Policy Puppetry article.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@rlundeen2 rlundeen2 force-pushed the rlundeen2-garak-doctor-scenario branch from 61f2229 to f0cd484 June 22, 2026 04:07
@jsong468 jsong468 self-assigned this Jul 1, 2026
Comment thread pyrit/prompt_converter/policy_puppetry_converter.py
Comment thread pyrit/scenario/scenarios/garak/doctor.py Outdated
Comment thread pyrit/prompt_converter/policy_puppetry_converter.py
rlundeen2 and others added 3 commits July 2, 2026 13:59
- Use DatasetAttackConfiguration (owns get_attack_groups_by_dataset_async) so Doctor rides the new base _build_atomic_attacks_async/MatrixAtomicAttackBuilder flow

- Drop the non-existent 'single_turn' aggregate tag from technique factories (DoctorStrategy only defines all/default aggregates)

- Regenerate 0_converters modality table to include PolicyPuppetryConverter

- Update test to new DatasetAttackConfiguration + dataset_names API

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…omic_attacks_async

Add a PolicyPuppetryConverter usage example to the text-to-text converter
notebook so test_all_converters_are_documented passes, and migrate the Doctor
scenario off the base-class _get_attack_technique_factories hook (slated for
deletion in the scenario refactor) onto the _build_atomic_attacks_async
extension point, calling MatrixAtomicAttackBuilder directly.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
@rlundeen2 rlundeen2 enabled auto-merge July 2, 2026 21:52
@rlundeen2 rlundeen2 added this pull request to the merge queue Jul 2, 2026
Merged via the queue into microsoft:main with commit 9345f1c Jul 2, 2026
53 checks passed
@rlundeen2 rlundeen2 deleted the rlundeen2-garak-doctor-scenario branch July 2, 2026 22:09
2 participants