Support for sending traffic via HTTP proxies #123

Open
kirillklimenko opened this issue Oct 17, 2023 · 8 comments
Labels
enhancement New feature or request

Comments

@kirillklimenko

kirillklimenko commented Oct 17, 2023

Please add a --proxy parameter to the CLI so that traffic can be sent through an HTTP proxy (https://playwright.dev/python/docs/network#http-proxy):

cli.py

def proxy_option(fn):
    # Attach a --proxy option to the command and return the decorated function
    return click.option("--proxy", help="HTTP proxy server to use")(fn)


# other options
@proxy_option
def shot(...):
    ...


def _browser_context(
    p,
    auth,
    interactive=False,
    devtools=False,
    retina=False,
    browser="chromium",
    user_agent=None,
    proxy=None,
    timeout=None,
    reduced_motion=False,
):
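    # Playwright expects proxy settings as a dict such as {"server": "http://host:port"},
    # so wrap a plain --proxy string before passing it on.
    if isinstance(proxy, str):
        proxy = {"server": proxy}
    browser_kwargs = dict(headless=not interactive, devtools=devtools, proxy=proxy)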
    if browser == "chromium":
        browser_obj = p.chromium.launch(**browser_kwargs)
    elif browser == "firefox":
        browser_obj = p.firefox.launch(**browser_kwargs)
    elif browser == "webkit":
        browser_obj = p.webkit.launch(**browser_kwargs)
    else:
        browser_kwargs["channel"] = browser
        browser_obj = p.chromium.launch(**browser_kwargs)
    context_args = {}
    if auth:
        context_args["storage_state"] = json.load(auth)
    if retina:
        context_args["device_scale_factor"] = 2
    if reduced_motion:
        context_args["reduced_motion"] = "reduce"
    if user_agent is not None:
        context_args["user_agent"] = user_agent
    if proxy is not None:
        context_args["proxy"] = proxy
    context = browser_obj.new_context(**context_args)
    if timeout:
        context.set_default_timeout(timeout)
    return context, browser_obj

@simonw simonw added the "enhancement" label Nov 1, 2023
@simonw
Owner

simonw commented Nov 1, 2023

I'm happy to add this but I'm not sure how best to test this. Do you have any guidance on an easy way to test that this proxy option is working? Maybe an existing proxy I can run this through, or a proxy server that's easy to get running on macOS?

@kirillklimenko
Author

Sure, you can find free proxies on https://free-proxy-list.net/ and test the parameter with them.
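As an alternative to a public proxy, a small local proxy that records what passes through it can make the check repeatable. The following is a minimal sketch using only the Python standard library. The names `LoggingProxyHandler` and `start_logging_proxy` are hypothetical and not part of shot-scraper, and the sketch assumes the page under test is served over plain http://, because it does not implement the CONNECT method that HTTPS traffic needs.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class LoggingProxyHandler(BaseHTTPRequestHandler):
    """Forward plain-HTTP GET requests and remember which URLs were requested."""

    seen_urls = []  # shared across requests so a test can inspect it

    def do_GET(self):
        # A client talking to a forward proxy sends the absolute URL as the path.
        self.seen_urls.append(self.path)
        # Use an opener with no proxies so the proxy never routes to itself.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        try:
            with opener.open(self.path, timeout=10) as upstream:
                body = upstream.read()
                status = upstream.status
        except urllib.error.URLError:
            # Covers connection failures and upstream HTTP error statuses.
            self.send_error(502, "Upstream request failed")
            return
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep console output quiet


def start_logging_proxy(port=0):
    """Start the proxy on 127.0.0.1 in a background thread and return the server.

    port=0 asks the operating system for any free port; the chosen port is
    available afterwards as server.server_address[1].
    """
    server = ThreadingHTTPServer(("127.0.0.1", port), LoggingProxyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With the proxy running, the prototype's proxy option could be pointed at `http://127.0.0.1:<port>` while capturing an http:// page, and a test could then assert that the page URL shows up in `LoggingProxyHandler.seen_urls`.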

@manugarri

@simonw is there any plan to add this? It seems like an easy PR that would add a ton of functionality.

@simonw
Owner

simonw commented Sep 14, 2024 via email

@simonw simonw changed the title from "Add --proxy parameter to CLI interface" to "Support for sending traffic via HTTP proxies" Feb 20, 2025
@simonw
Owner

simonw commented Feb 20, 2025

I built a prototype of this in a branch: feaefdf

Used like this:

shot-scraper har https://doge.gov/savings -o savings.zip --wait 2000 \
  --proxy-server $PROXY_SERVER --proxy-username $PROXY_USERNAME --proxy-password $PROXY_PASSWORD

Still needs more testing, and needs to be added to a bunch of other commands, plus documentation.

@simonw
Owner

simonw commented Feb 20, 2025

One thing I haven't quite figured out is the difference between http:// and https:// proxies and if that matters.

I've been testing with a --proxy-server address of 'https://brd.superproxy.io:33335'.

Also I had to do this to get anything to work:

if proxy_server:
    proxy = {"server": proxy_server}
    if proxy_username:
        proxy["username"] = proxy_username
    if proxy_password:
        proxy["password"] = proxy_password
    context_args["proxy"] = proxy
    # Not sure why I needed this, figure that out!
    context_args["ignore_https_errors"] = True

I'd like to understand if that ignore_https_errors thing is needed.
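One way to investigate is to stop tying `ignore_https_errors` to the proxy and make it a separate switch, so the same proxy can be tried with and without it. The sketch below does that with a hypothetical helper, `build_context_args`, which is not part of the prototype; the `ignore_https_errors` parameter on the helper is likewise an assumption made for illustration. If the proxy presents its own TLS certificate, certificate checks would be expected to fail without the flag, but whether that is what happens here is not established in this thread.

```python
def build_context_args(
    proxy_server=None,
    proxy_username=None,
    proxy_password=None,
    ignore_https_errors=False,
):
    """Build keyword arguments for a Playwright browser context.

    Hypothetical helper: the proxy dict has the same shape as in the snippet
    above, but ignoring HTTPS errors is opt-in instead of implied by the proxy.
    """
    context_args = {}
    if proxy_server:
        proxy = {"server": proxy_server}
        # Credentials are only added when they were actually supplied.
        if proxy_username:
            proxy["username"] = proxy_username
        if proxy_password:
            proxy["password"] = proxy_password
        context_args["proxy"] = proxy
    if ignore_https_errors:
        context_args["ignore_https_errors"] = True
    return context_args
```

Running the same capture twice, once with `ignore_https_errors=True` and once without, would show whether the failure is a certificate problem or something else.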

@simonw simonw pinned this issue Feb 20, 2025
@simonw simonw unpinned this issue Feb 20, 2025
@simonw simonw pinned this issue Feb 20, 2025
@simonw
Owner

simonw commented Feb 20, 2025

@simonw
Owner

simonw commented Feb 20, 2025

I am going to add environment variable support here, so you can set SHOT_SCRAPER_PROXY_SERVER once and then forget about it.
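The fallback described here could look roughly like the sketch below, in which an explicit command-line value wins and the environment variable is used only when no value was passed. `resolve_proxy_server` is a hypothetical name, and this is not the actual shot-scraper implementation. Click options also accept an `envvar=` argument, which may be a simpler route in the real CLI.

```python
import os


def resolve_proxy_server(cli_value=None, environ=None):
    """Return the proxy server to use, or None if none is configured.

    Hypothetical helper: a value given on the command line takes precedence,
    otherwise SHOT_SCRAPER_PROXY_SERVER is read from the environment.
    """
    # Allow a custom mapping to be passed in so the function is easy to test.
    environ = os.environ if environ is None else environ
    return cli_value or environ.get("SHOT_SCRAPER_PROXY_SERVER") or None
```

Passing the environment as a parameter keeps the lookup testable without modifying the real process environment.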


3 participants