diff --git a/skills/.experimental/app-testing/SKILL.md b/skills/.experimental/app-testing/SKILL.md
new file mode 100644
index 00000000..dfa3cc54
--- /dev/null
+++ b/skills/.experimental/app-testing/SKILL.md
@@ -0,0 +1,71 @@
+---
+name: app-testing
+description: Create a tailored app test checklist and then run in-depth testing for web apps, mobile apps, desktop apps, internal tools, and API-backed products. Use when Codex is asked to QA an app, test a feature or release, perform smoke, regression, or end-to-end testing, hunt bugs, validate UX or edge cases, or systematically explore product behavior before sign-off.
+---
+
+# App Testing
+
+## Overview
+
+Create a context-specific checklist before interacting with the app. Then execute deep testing against that checklist, expand coverage when risk appears, and report findings with reproduction steps, impact, and residual risk.
+
+## Workflow
+
+1. Gather context fast
+   - Inspect the repo, run instructions, routes or screens, auth roles, data model, feature flags, recent changes, and existing tests.
+   - Identify what "app" means in context: web UI, mobile app, desktop app, API-backed workflow, or a mixed system.
+   - Note blockers early: missing credentials, unavailable services, absent fixtures, or platform limits.
+
+2. Build the checklist first
+   - Start from `references/checklist-template.md`.
+   - Keep only relevant sections and add app-specific flows.
+   - Include both requested scope and nearby regression risk.
+   - Make the checklist visible in the response or working notes before deep testing.
+   - Mark each item as `pending`, `passed`, `failed`, or `blocked` as testing progresses.
+
+3. Run testing in deliberate passes
+   - Start with a smoke pass to confirm the app boots and the main entry path works.
+   - Cover primary user journeys end to end before spending time on polish issues.
+   - Run negative and edge-case passes after the happy path is stable.
+   - Validate integrations, persistence, permissions, and state transitions.
+   - Finish with broader quality passes such as responsiveness, accessibility, security sanity checks, and performance observations when applicable.
+
+4. Go deep when issues appear
+   - Minimize repro steps.
+   - Check whether the problem is isolated or systemic.
+   - Probe adjacent states, roles, inputs, and recovery paths.
+   - Record the smallest reliable reproduction and the broadest credible impact.
+   - Use `references/test-depth-guide.md` for additional heuristics.
+
+5. Report outcomes clearly
+   - List findings ordered by severity.
+   - For each confirmed issue include: title, severity, setup or account used, steps to reproduce, expected result, actual result, and evidence.
+   - Separate confirmed bugs from weak signals, assumptions, and untested areas.
+   - End with checklist coverage, blocked items, and the highest remaining risks.
+
+## Coverage Priorities
+
+Prefer this order unless the user gives tighter scope:
+
+1. App start-up and environment sanity
+2. Authentication, authorization, and role boundaries
+3. Core value paths
+4. Data creation, editing, deletion, and persistence
+5. Validation, empty states, and error handling
+6. Navigation, back, refresh, retry behavior, and session continuity
+7. Integrations, background jobs, uploads, downloads, webhooks, or payments
+8. Responsive, accessibility, localization, timezone, and browser or device differences
+9. Security and performance sanity checks
+
+## Testing Tactics
+
+- Use the same tools the app uses in real life: local dev servers, seeded data, logs, network panels, CLI scripts, and database inspection.
+- Cross-check UI claims against API responses or stored state when possible.
+- Prefer representative, risk-based coverage over exhaustive but shallow clicking.
+- Do not invent coverage. Call out what you could not test and why.
+- If the user asks for a review, prioritize concrete findings over narrative.
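The status tracking described in workflow step 2 and the coverage summary requested in step 5 could be sketched roughly as follows. This is an illustration only, not part of the skill: the `Checklist` and `ChecklistItem` classes and the sample item titles are hypothetical, and only the four status values (`pending`, `passed`, `failed`, `blocked`) come from the text above.

```python
from collections import Counter
from dataclasses import dataclass, field

# The four statuses named in workflow step 2.
STATUSES = ("pending", "passed", "failed", "blocked")


@dataclass
class ChecklistItem:
    title: str
    status: str = "pending"
    note: str = ""  # reason an item failed or is blocked


@dataclass
class Checklist:
    items: list = field(default_factory=list)

    def add(self, title: str) -> None:
        """Add a new item; every item starts as pending."""
        self.items.append(ChecklistItem(title))

    def mark(self, title: str, status: str, note: str = "") -> None:
        """Update the status of an existing item, rejecting unknown statuses."""
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status}")
        for item in self.items:
            if item.title == title:
                item.status = status
                item.note = note
                return
        raise KeyError(title)

    def summary(self) -> dict:
        """Count items per status, including statuses with zero items."""
        counts = Counter(item.status for item in self.items)
        return {status: counts.get(status, 0) for status in STATUSES}


# Hypothetical checklist items, for illustration only.
checklist = Checklist()
for title in ("App boots", "Login works", "Export to CSV"):
    checklist.add(title)
checklist.mark("App boots", "passed")
checklist.mark("Export to CSV", "blocked", note="no fixture data available")
print(checklist.summary())
```

Reporting zero counts explicitly keeps untested (`pending`) and `blocked` items visible in the closing summary, which matches the instruction not to invent coverage.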
+
+## References
+
+- Read `references/checklist-template.md` when preparing the first-pass checklist.
+- Read `references/test-depth-guide.md` when broadening from smoke testing into deeper exploratory and regression coverage.
diff --git a/skills/.experimental/app-testing/agents/openai.yaml b/skills/.experimental/app-testing/agents/openai.yaml
new file mode 100644
index 00000000..87c6037d
--- /dev/null
+++ b/skills/.experimental/app-testing/agents/openai.yaml
@@ -0,0 +1,6 @@
+interface:
+  display_name: "App Testing"
+  short_description: "Create test checklists and run deep app testing"
+  icon_small: "./assets/favicon.svg"
+  icon_large: "./assets/favicon.png"
+  default_prompt: "Use $app-testing to build a checklist and test this app thoroughly."
diff --git a/skills/.experimental/app-testing/assets/favicon.png b/skills/.experimental/app-testing/assets/favicon.png
new file mode 100644
index 00000000..1cac19ae
Binary files /dev/null and b/skills/.experimental/app-testing/assets/favicon.png differ
diff --git a/skills/.experimental/app-testing/assets/favicon.svg b/skills/.experimental/app-testing/assets/favicon.svg
new file mode 100644
index 00000000..2bd485d9
--- /dev/null
+++ b/skills/.experimental/app-testing/assets/favicon.svg
@@ -0,0 +1,25 @@
diff --git a/skills/.experimental/app-testing/references/checklist-template.md b/skills/.experimental/app-testing/references/checklist-template.md
new file mode 100644
index 00000000..d91739c8
--- /dev/null
+++ b/skills/.experimental/app-testing/references/checklist-template.md
@@ -0,0 +1,65 @@
+# Checklist Template
+
+Copy only the sections that apply to the app under test. Add app-specific journeys, roles, integrations, and release risks.
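One way to picture the "copy only the sections that apply, then add app-specific items" instruction is the sketch below. It is an assumption-laden illustration, not tooling shipped with the skill: the `split_sections` and `tailor` helpers are hypothetical, the `TEMPLATE` string is a shortened excerpt of the template, and the invoice item is an invented app-specific journey.

```python
import re


def split_sections(markdown: str) -> dict:
    """Map each '## ' heading to the unchecked items listed under it."""
    sections = {}
    current = None
    for line in markdown.splitlines():
        heading = re.match(r"^## (.+)$", line)
        if heading:
            current = heading.group(1).strip()
            sections[current] = []
        elif current is not None and line.startswith("- [ ] "):
            sections[current].append(line[len("- [ ] "):].strip())
    return sections


def tailor(markdown: str, keep: set, extra: dict) -> dict:
    """Keep only the named template sections, then append app-specific items."""
    sections = split_sections(markdown)
    tailored = {name: list(items) for name, items in sections.items() if name in keep}
    for name, items in extra.items():
        tailored.setdefault(name, []).extend(items)
    return tailored


# Shortened excerpt of the template, for illustration only.
TEMPLATE = """\
## Smoke pass

- [ ] App loads without fatal errors
- [ ] Basic navigation works

## Permissions and roles

- [ ] Signed-out behavior is correct

## Integrations and assets

- [ ] Background jobs reflect status in the app
"""

plan = tailor(
    TEMPLATE,
    keep={"Smoke pass", "Permissions and roles"},
    # Hypothetical app-specific journey added by the tester.
    extra={"Core user journeys": ["Invoice can be created and emailed"]},
)
for name, items in plan.items():
    print(name, items)
```

Sections left out of `keep` are dropped rather than marked as passed, so the tailored checklist only claims coverage for areas that are actually in scope.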
+
+## Setup and access
+
+- [ ] Install and start the app successfully
+- [ ] Confirm required environment variables, services, and fixtures
+- [ ] Confirm available test accounts, roles, and permissions
+- [ ] Confirm logs or errors are visible somewhere useful
+
+## Smoke pass
+
+- [ ] App loads without fatal errors
+- [ ] Main route or landing screen renders correctly
+- [ ] Basic navigation works
+- [ ] Critical dependencies respond
+
+## Core user journeys
+
+- [ ] Primary persona can complete the main task end to end
+- [ ] Data written in one step appears correctly in later steps
+- [ ] Refresh or reopen does not corrupt the flow
+- [ ] Success states are clear and trustworthy
+
+## Inputs and validation
+
+- [ ] Required fields enforce constraints
+- [ ] Boundary values behave correctly
+- [ ] Invalid formats show useful errors
+- [ ] Duplicate submissions and rapid repeat actions are handled safely
+
+## Permissions and roles
+
+- [ ] Signed-out behavior is correct
+- [ ] Low-privilege users cannot access restricted actions
+- [ ] Role changes take effect correctly
+- [ ] Unauthorized API calls fail safely
+
+## State and resilience
+
+- [ ] Retry, cancel, back, and refresh behave safely
+- [ ] Slow or failed network calls surface actionable feedback
+- [ ] Partial failures do not leave corrupted state
+- [ ] Concurrent edits or repeated requests are handled correctly
+
+## Integrations and assets
+
+- [ ] External services return expected outcomes or fail safely
+- [ ] Uploads, downloads, payments, emails, or webhooks behave correctly
+- [ ] Background jobs reflect status in the app
+- [ ] Audit trails or logs capture important actions when relevant
+
+## Broader quality
+
+- [ ] Layout works on target device sizes and browsers
+- [ ] Keyboard and screen-reader basics are usable
+- [ ] Dates, currency, locale, and timezone behavior are correct
+- [ ] Performance is acceptable on critical flows
+
+## Closeout
+
+- [ ] Findings are documented with repro steps and impact
+- [ ] Failed and blocked items are called out explicitly
+- [ ] Remaining risk areas are listed
diff --git a/skills/.experimental/app-testing/references/test-depth-guide.md b/skills/.experimental/app-testing/references/test-depth-guide.md
new file mode 100644
index 00000000..169d3259
--- /dev/null
+++ b/skills/.experimental/app-testing/references/test-depth-guide.md
@@ -0,0 +1,65 @@
+# Deep Testing Guide
+
+Use this after the checklist exists and the smoke pass is complete.
+
+## Expand around every important flow
+
+For each important journey, probe:
+
+- Entry
+- Success
+- Failure
+- Retry
+- Cancel
+- Refresh or reopen
+- Timeout or slow dependency
+- Duplicate submit
+- Permission change
+- Stale session or expired token
+
+## Web app heuristics
+
+- Check deep links, browser back and forward, reload, and tab restore.
+- Check form preservation, loading states, and optimistic UI.
+- Check upload and download flows, clipboard behavior, and keyboard navigation.
+- Check responsive layouts, overflow, focus handling, and error visibility.
+- Check cache, local storage, and session transitions across tabs when relevant.
+
+## Mobile and desktop heuristics
+
+- Check relaunch, background and foreground transitions, and interrupted flows.
+- Check poor network behavior, offline states, and recovery.
+- Check OS permissions such as files, camera, notifications, or location when relevant.
+- Check device-specific layout, scaling, and input behavior.
+
+## API-backed and data-heavy heuristics
+
+- Check idempotency, retries, pagination, sorting, and filtering.
+- Check stale reads, race conditions, and partial writes.
+- Check webhook retries, duplicate events, and eventual consistency.
+- Check validation on both client and server boundaries.
+
+## Data integrity checks
+
+- Verify user-visible state against persisted state when possible.
+- Check create, edit, delete, rollback, and cross-role visibility paths.
+- Look for duplicate records, orphaned records, mismatched counts, or stuck jobs.
+
+## Severity calibration
+
+- Critical: data loss, auth bypass, payment or security failure, or app unusable.
+- High: main flow blocked, incorrect persistent state, or major integration failure.
+- Medium: degraded flow with workaround, incorrect validation, or notable UX break.
+- Low: copy, layout, or polish issue with minor impact.
+
+## Reporting minimum
+
+Include:
+
+- Environment or build under test
+- Preconditions
+- Steps to reproduce
+- Expected result
+- Actual result
+- Evidence
+- Scope or suspected blast radius
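The severity levels and the reporting minimum above can be combined into a small record type, sketched here under stated assumptions. The `Finding` dataclass, its field names, and the two sample findings are hypothetical; only the four severity levels and the seven reporting items are taken from the guide.

```python
from dataclasses import dataclass, fields

# Lower rank sorts first, so critical findings lead the report.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Finding:
    title: str
    severity: str
    # The seven items from the "Reporting minimum" list.
    environment: str = ""
    preconditions: str = ""
    steps: str = ""
    expected: str = ""
    actual: str = ""
    evidence: str = ""
    blast_radius: str = ""


def missing_fields(finding: Finding) -> list:
    """Return the names of fields that are still empty."""
    return [f.name for f in fields(finding) if not getattr(finding, f.name)]


def order_by_severity(findings: list) -> list:
    """Sort findings from critical to low; ties keep their original order."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])


# Hypothetical findings, for illustration only.
findings = [
    Finding(title="Typo on settings page", severity="low"),
    Finding(
        title="Deleted records reappear after refresh",
        severity="high",
        environment="local build",
        preconditions="signed in as admin",
        steps="1. Delete a record 2. Refresh the list",
        expected="Record stays deleted",
        actual="Record reappears",
        evidence="screenshot and server log excerpt",
        blast_radius="all list views",
    ),
]

for finding in order_by_severity(findings):
    print(finding.severity, finding.title, missing_fields(finding))
```

A non-empty result from `missing_fields` is a cue that the finding is still a weak signal rather than a confirmed, reportable bug.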