A comprehensive end-to-end testing framework built with Stagehand and pytest for automated browser testing using natural language instructions.
This framework enables reliable browser automation by combining the power of AI-driven natural language commands with traditional code-based testing. It's specifically configured for testing the TransGlobal website (https://www.transglobalus.com/) with support for multiple device types, parallel execution, and automatic retry mechanisms.
⚠️ IMPORTANT BROWSER LIMITATION: Stagehand ONLY supports Chromium/Chrome browsers. Firefox and Safari are NOT supported and will not work with this framework.
- Natural Language Testing: Write tests using plain English instructions
- Multi-Device Support: Test on mobile, iPad, and desktop viewports
- Parallel Execution: Run tests in parallel using pytest-xdist
- Automatic Retry: Failed tests automatically retry with configurable attempts
- Flexible Tagging: Organize tests using custom markers (tags)
- Production-Ready: Designed for reliable CI/CD integration
- Prerequisites
- Installation
- Configuration
- Running Tests
- Best Practices
- Coding Standards
- Troubleshooting
Before you begin, ensure you have the following installed:
- Python 3.8+ (Python 3.10+ recommended)
- pip or uv (package manager)
- OpenAI API Key (required for Stagehand)
- Git (for cloning the repository)
- macOS, Linux, or Windows
- At least 2GB of free disk space
- Internet connection for API calls and browser downloads
git clone <repository-url>
cd StageHand-E2E-Automation
Using venv (recommended):
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
Or using uv (faster alternative):
uv venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
Or with uv:
uv pip install -r requirements.txt
⚠️ CRITICAL: Stagehand ONLY supports Chromium/Chrome browsers. Firefox and Safari are NOT supported. Do not attempt to install or use other browsers.
Install Chromium browser for local execution:
python -m playwright install chromium
This will download and install the Chromium browser required by Stagehand for browser automation.
- Copy the example environment file:
cp .env.example .env
- Edit .env and add your OpenAI API key:
OPENAI_API_KEY=your_actual_openai_api_key_here
Important: Never commit the .env file to version control. It's already included in .gitignore.
Run a simple test to verify everything is set up correctly:
pytest tests/pages/homepage/test_homepage.py::test_homepage_loads -v
If the test runs successfully, you're ready to go! 🎉
The framework supports three device configurations:
- mobile: 430x932 (iPhone 15 Pro Max size)
- ipad: 1024x1366 (iPad Pro 12.9" size)
- desktop: 1920x1080 (default)
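As a rough illustration, the three viewports above could be held in a simple lookup table. This is only a sketch and not the project's actual device configuration code; the names `VIEWPORTS`, `DEFAULT_DEVICE`, and `get_viewport` are hypothetical.

```python
from typing import Dict

# Viewport sizes copied from the list above; the structure is illustrative only.
VIEWPORTS: Dict[str, Dict[str, int]] = {
    "mobile": {"width": 430, "height": 932},
    "ipad": {"width": 1024, "height": 1366},
    "desktop": {"width": 1920, "height": 1080},
}

DEFAULT_DEVICE = "desktop"


def get_viewport(device: str = DEFAULT_DEVICE) -> Dict[str, int]:
    """Return a copy of the viewport for a device name, or raise ValueError."""
    try:
        return dict(VIEWPORTS[device.lower()])
    except KeyError:
        supported = ", ".join(sorted(VIEWPORTS))
        raise ValueError(f"Unknown device '{device}'. Supported: {supported}") from None
```

Rejecting unknown names early gives a clearer error than letting a misspelled device value fall through to a default viewport.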
⚠️ BROWSER COMPATIBILITY WARNING
Stagehand ONLY supports Chromium-based browsers:
- ✅ Chrome
- ✅ Chromium
- ✅ Microsoft Edge (Chromium-based)
NOT supported:
- ❌ Firefox
- ❌ Safari
- ❌ Any other non-Chromium browsers
The framework uses Playwright's Chromium browser for all test executions. Attempting to use unsupported browsers will result in errors.
The pytest.ini file contains all test configuration:
- Test Discovery: Automatically finds tests in the tests/ directory
- Markers: Custom tags for organizing tests (configured in pytest.ini)
- Retry Settings: Default 2 retries with 1 second delay
- Logging: Configured for detailed test output
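The project's real pytest.ini is not reproduced here, so the following is an assumed example of what such a file might contain, based only on the settings listed above. The ini text is embedded in a Python string so it can be checked with the standard-library parser; the marker descriptions and the use of `addopts` for the retry defaults are assumptions.

```python
import configparser

# Hypothetical pytest.ini content; the project's actual file may differ.
SAMPLE_PYTEST_INI = """
[pytest]
testpaths = tests
markers =
    smoke: quick checks of core functionality
    critical: tests that must pass before release
    homepage: tests for the homepage
addopts = --reruns 2 --reruns-delay 1
"""


def read_pytest_section(text: str) -> dict:
    """Parse ini text and return the [pytest] section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["pytest"])


settings = read_pytest_section(SAMPLE_PYTEST_INI)

# Each non-empty line of the markers value has the form "name: description".
marker_names = [
    line.split(":")[0].strip()
    for line in settings["markers"].splitlines()
    if line.strip()
]

print(settings["testpaths"])
print(marker_names)
print(settings["addopts"])
```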
Run all tests:
pytest
Run specific test file:
pytest tests/pages/homepage/test_homepage.py
Run specific test function:
pytest tests/pages/homepage/test_homepage.py::test_homepage_loads
Run tests on mobile device:
pytest --device=mobile
Run tests on iPad:
pytest --device=ipad
Run tests on desktop (default):
pytest --device=desktop
Run only smoke tests:
pytest -m smoke
Run critical tests:
pytest -m critical
Run homepage tests:
pytest -m homepage
Combine multiple tags (OR logic):
pytest -m "smoke or critical"
Combine multiple tags (AND logic):
pytest -m "smoke and homepage"
Exclude specific tags:
pytest -m "not regression"
Run tests in headless mode (no browser window):
pytest --headless
Run tests in parallel (faster execution):
pytest -n auto # Automatically detect CPU cores
pytest -n 4 # Use 4 workers
Run smoke tests on mobile in headless mode with parallel execution:
pytest -m smoke --device=mobile --headless -n auto
Use a different Stagehand model:
pytest --stagehand-model=gpt-4o
Get detailed test output:
pytest -v # Verbose
pytest -vv # More verbose
pytest -s # Show print statements
The default retry configuration is set in pytest.ini (2 retries with 1 second delay). You can override it:
pytest --reruns=3 --reruns-delay=2
# Good
async def test_homepage_services_section_displays_correctly(stagehand_on_demand):
    pass
# Bad
async def test1(stagehand_on_demand):
    pass
Always tag your tests appropriately:
@pytest.mark.homepage
@pytest.mark.smoke
async def test_homepage_loads(stagehand_on_demand):
    pass
# Good - specific
await page.act("click the 'Get Started' button in the hero section")
# Bad - vague
await page.act("click button")
try:
    await page.act("click the submit button")
except Exception as e:
    # Log or handle the error appropriately
    print(f"Action failed: {e}")
    raise
Use observe to preview actions and cache them:
# Preview the action
action = await page.observe("click the navigation menu")
# Execute without additional LLM call
await page.act(action[0])
For complex data, use Pydantic schemas:
from pydantic import BaseModel
class ServiceInfo(BaseModel):
    title: str
    description: str
    link: str
services = await page.extract("all services", schema=ServiceInfo)
Each test should be able to run independently:
# Good - each test navigates to the page
async def test_a(stagehand_on_demand):
    await stagehand_on_demand.page.goto("https://www.transglobalus.com/")
    # test code
async def test_b(stagehand_on_demand):
    await stagehand_on_demand.page.goto("https://www.transglobalus.com/")
    # test code
Always use BaseActions for standard Playwright operations to maintain consistency and reusability:
from tests.pages.base.base_action import BaseActions
base_actions = BaseActions(page)
await base_actions.open_url("https://www.transglobalus.com/")
await base_actions.wait_for_page_loaded()
is_visible = await base_actions.verify_element_visible('selector')
Always verify that page content has actually loaded, not just that the URL changed:
# Good - verifies content is loaded
await base_actions.wait_for_page_loaded()
current_url = page.url
assert "contact" in current_url.lower()
body_text = await base_actions.get_element_text("body")
assert len(body_text.strip()) > 0, "Page appears to be blank"
# Bad - only checks URL
current_url = page.url
assert "contact" in current_url.lower()
This project follows Python best practices and coding principles. For detailed coding rules, see .cursor/rules/stagehand_coding_rules.mdc.
- Naming Conventions:
  - Modules: snake_case (e.g., test_header.py)
  - Classes: PascalCase (e.g., BaseActions, Device)
  - Functions/Methods: snake_case (e.g., navigate_homepage)
  - Constants: UPPER_SNAKE_CASE (e.g., DEFAULT_TIMEOUT)
  - Private methods: Prefix with _ (e.g., _resolve_locator)
- Code Formatting:
  - Maximum 100 characters per line (soft limit)
  - Use 4 spaces for indentation (never tabs)
  - 2 blank lines between top-level definitions
  - 1 blank line between methods
  - Group imports: standard library, third-party, local
- Import Organization:
# Standard library
import asyncio
from typing import Union
# Third-party
import pytest
from playwright.async_api import Page
from stagehand import Stagehand
# Local
from tests.pages.base.base_action import BaseActions
from config.devices import get_device_class
- Single Responsibility: Each class/function should have one reason to change
- Open/Closed: Open for extension, closed for modification
- Liskov Substitution: Subtypes must be substitutable for their base types
- Interface Segregation: Keep interfaces focused and minimal
- Dependency Inversion: Depend on abstractions, not concretions
- Extract common functionality into reusable functions/classes
- Use BaseActions for common Playwright operations
- Create helper functions for repeated patterns
- Use fixtures for shared test setup
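One repeated pattern in this document is the "wait for load, then confirm the body is not blank" check. A minimal sketch of extracting it into a helper follows. The helper name `assert_page_has_content`, the `SupportsPageChecks` protocol, and the `FakeActions` stand-in are hypothetical; the two method names are taken from how `BaseActions` is used elsewhere in this README, and their exact signatures are assumed.

```python
import asyncio
from typing import Protocol


class SupportsPageChecks(Protocol):
    """The subset of BaseActions-style methods this helper relies on (assumed signatures)."""

    async def wait_for_page_loaded(self) -> None: ...

    async def get_element_text(self, selector: str) -> str: ...


async def assert_page_has_content(actions: SupportsPageChecks) -> str:
    """Wait for the page to load, then fail if the body text is empty."""
    await actions.wait_for_page_loaded()
    body_text = await actions.get_element_text("body")
    assert len(body_text.strip()) > 0, "Page appears to be blank"
    return body_text


class FakeActions:
    """Stand-in for BaseActions so the sketch runs without a browser."""

    def __init__(self, body: str) -> None:
        self._body = body

    async def wait_for_page_loaded(self) -> None:
        return None

    async def get_element_text(self, selector: str) -> str:
        return self._body


print(asyncio.run(assert_page_has_content(FakeActions("Welcome"))))
```

In a real step definition the call would presumably look like `await assert_page_has_content(BaseActions(page))`, replacing the three duplicated lines.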
- Location: features/{page_name}/{feature_name}.feature
- Use Gherkin syntax (Given-When-Then)
- Tag scenarios appropriately (e.g., @homepage, @header_visibility)
- Location: tests/pages/{page_name}/test_{feature_name}.py
- Use scenarios() function at the top to load feature files
- Write all steps for each scenario together
- Do NOT prefix step definition functions with given_, when_, then_
import pytest
from pytest_bdd import scenarios, given, when, then, parsers
from stagehand import Stagehand
from tests.pages.base.base_action import BaseActions
scenarios('../../../features/homepage/header.feature')
@given("I navigate to the TransGlobal homepage")
async def navigate_homepage_visibility(stagehand_on_demand: Stagehand):
    page = stagehand_on_demand.page
    base_actions = BaseActions(page)
    await base_actions.open_url("https://www.transglobalus.com/")
    await base_actions.wait_for_page_loaded()
@when("I look at the header")
async def look_at_header_visibility(stagehand_on_demand: Stagehand):
    await stagehand_on_demand.page.wait_for_timeout(500)
@then("the TransGlobal logo should be visible")
async def logo_visible_visibility(stagehand_on_demand: Stagehand):
    page = stagehand_on_demand.page
    base_actions = BaseActions(page)
    is_visible = await base_actions.verify_element_visible('a[href*="transglobalus.com"]')
    assert is_visible
- Use BaseActions for standard Playwright operations (click, wait, verify)
- Use page.act() for natural language actions that require AI interpretation
- Use page.observe() only when necessary (preview actions before execution)
- Use page.extract() only when structured data extraction is needed
- Be specific in natural language instructions
- File Structure: Organize tests by page/feature (e.g., tests/pages/homepage/test_header.py)
- Naming: Test files use test_{feature_name}.py, feature files use {feature_name}.feature
- Tags: Use scenario-specific tags (e.g., @header_visibility, @header_click_contact) and page-level tags (e.g., @homepage)
- Independence: Each test should be able to run independently
- Functions: Keep functions small and focused (max 50 lines, prefer shorter)
- Variables: Use descriptive names, avoid abbreviations
- Type Hints: Use type hints for function parameters and return values
- Comments: Comment "why", not "what"
- Constants: Define constants at module level, avoid magic numbers
@given("I navigate to the TransGlobal homepage")
async def navigate_homepage(stagehand_on_demand: Stagehand):
    page = stagehand_on_demand.page
    base_actions = BaseActions(page)
    await base_actions.open_url("https://www.transglobalus.com/")
    await base_actions.wait_for_page_loaded()
@when("I click the menu item")
async def click_menu_item(stagehand_on_demand: Stagehand, menu_item: str):
    page = stagehand_on_demand.page
    await page.act(f'click the "{menu_item}" in the header')
    await page.wait_for_load_state("networkidle")
    await page.wait_for_timeout(2000) # Wait for content to load
@then("I should be navigated to the page")
async def verify_navigation(stagehand_on_demand: Stagehand):
    page = stagehand_on_demand.page
    base_actions = BaseActions(page)
    await base_actions.wait_for_page_loaded()
    current_url = page.url
    assert "expected-path" in current_url.lower()
    # Verify page content is loaded
    body_text = await base_actions.get_element_text("body")
    assert len(body_text.strip()) > 0, "Page appears to be blank"
❌ Don't Do This:
# Magic numbers
await page.wait_for_timeout(2000)
# Generic names
def test1(stagehand):
    pass
# Code duplication
await page.locator('selector').wait_for(state="visible")
# Repeated in multiple places
# No page content verification
current_url = page.url
assert "contact" in current_url.lower()
# Prefixing with given/when/then
async def given_navigate_homepage(...):
    pass
✅ Do This Instead:
# Named constant or variable
PAGE_LOAD_DELAY = 2000
await page.wait_for_timeout(PAGE_LOAD_DELAY)
# Descriptive names
async def test_header_logo_visibility(stagehand):
    pass
# Use BaseActions
base_actions = BaseActions(page)
is_visible = await base_actions.verify_element_visible('selector')
# Verify page content
base_actions = BaseActions(page)
await base_actions.wait_for_page_loaded()
body_text = await base_actions.get_element_text("body")
assert len(body_text.strip()) > 0, "Page appears to be blank"
# No prefix
async def navigate_homepage(...):
    pass
For complete coding rules and detailed guidelines, refer to .cursor/rules/stagehand_coding_rules.mdc.
Problem: Missing or incorrect API key in .env file.
Solution:
- Verify .env file exists in the project root
- Check that OPENAI_API_KEY is set correctly
- Ensure .env file is not committed to git
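The first two checks can be automated with a small script. The sketch below is not part of the framework: `find_api_key_problems` is a hypothetical helper, and it uses a deliberately simple KEY=VALUE parser that ignores comments and strips surrounding quotes, which may differ from how the project actually loads its .env file.

```python
from pathlib import Path
from typing import List

# Placeholder value shown in the setup instructions above.
PLACEHOLDER = "your_actual_openai_api_key_here"


def find_api_key_problems(env_path: Path) -> List[str]:
    """Return human-readable problems with the OPENAI_API_KEY entry, if any."""
    if not env_path.is_file():
        return [f"{env_path} does not exist"]

    value = None
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip comments and anything that is not a KEY=VALUE pair.
        if line.startswith("#") or "=" not in line:
            continue
        name, _, candidate = line.partition("=")
        if name.strip() == "OPENAI_API_KEY":
            value = candidate.strip().strip("'\"")

    if value is None:
        return ["OPENAI_API_KEY is missing from the file"]
    if not value:
        return ["OPENAI_API_KEY is empty"]
    if value == PLACEHOLDER:
        return ["OPENAI_API_KEY still holds the placeholder value"]
    return []


if __name__ == "__main__":
    for problem in find_api_key_problems(Path(".env")):
        print(f"Problem: {problem}")
```

The script only confirms that a key is present; it cannot tell whether the key is valid, which still requires running a test against the API.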
Problem: Playwright browser not installed.
Solution:
python -m playwright install chromium
⚠️ WARNING: Only Chromium browser is supported. Do NOT attempt to install Firefox or Safari browsers as Stagehand does not support them.
Problem: Multiple tests trying to use the same port.
Solution: The framework automatically handles this with random ports. If issues persist, reduce the number of parallel workers:
pytest -n 2 # Instead of -n auto
Problem: Tests taking too long or hanging.
Solution:
- Check your internet connection
- Verify the target website is accessible
- Increase timeout in test code if needed
- Check OpenAI API rate limits
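For the "increase timeout in test code" step, one option is to wrap a slow awaitable in an explicit, named timeout so a hang fails fast with a clear message. This is a sketch using only the standard library; `run_with_timeout` and `ACTION_TIMEOUT_SECONDS` are hypothetical names and the 30-second default is an arbitrary assumption, not a framework setting.

```python
import asyncio
from typing import Awaitable, TypeVar

T = TypeVar("T")

# Hypothetical default; tune it for your network and API latency.
ACTION_TIMEOUT_SECONDS = 30.0


async def run_with_timeout(
    awaitable: Awaitable[T], timeout: float = ACTION_TIMEOUT_SECONDS
) -> T:
    """Await something, raising TimeoutError with a clear message if it hangs."""
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout)
    except asyncio.TimeoutError:
        raise TimeoutError(f"Operation did not finish within {timeout} seconds") from None
```

Inside a test this might be used as `await run_with_timeout(page.act("click the submit button"), timeout=60)`, assuming `page.act` returns an awaitable as in the examples earlier in this document.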
Problem: Dependencies not installed.
Solution:
pip install -r requirements.txt
Problem: Tests fail in headless mode but pass in headed mode.
Solution:
- Some websites behave differently in headless mode
- Try running without --headless flag first
- Check if the website blocks headless browsers
Problem: Attempting to use Firefox or Safari, or getting errors about unsupported browsers.
⚠️ IMPORTANT: Stagehand ONLY supports Chromium/Chrome browsers. Firefox and Safari are NOT supported.
Solution:
- ✅ Ensure you have Chromium installed via python -m playwright install chromium
- ❌ Do NOT configure the framework to use Firefox or Safari
- ❌ Do NOT attempt to install Firefox or Safari browsers
- If you see browser compatibility errors, verify that Chromium is properly installed
- Check the Stagehand documentation
- Review pytest logs with -v or -vv flags
- Run tests with -s flag to see print statements
- Check browser console logs in non-headless mode
Run tests with maximum verbosity:
pytest -vvv -s --tb=long
This will show:
- Very verbose output
- Print statements
- Full traceback for failures
This project is licensed under the MIT License. See the LICENSE file for details.
- Stagehand Python Documentation
- Pytest Documentation
- Pytest-xdist Documentation
- Pytest-rerunfailures Documentation
Happy Testing! 🚀