feat: add new scraping and sitemap extraction tools to ScapeGraphClient #6
Changes:
- `scrape` method for basic scraping of page content.
- `sitemap` method to extract sitemap URLs for a given website.
- `agentic_scrapper` method for running the Agentic Scraper workflow with flexible input handling.

This enhances the functionality of the ScapeGraphClient, allowing for more versatile web scraping capabilities.
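As a rough illustration of the two simpler methods, the sketch below builds the path and JSON body that a `scrape` or `sitemap` call might send. The endpoint paths (`/scrape`, `/sitemap`) come from the summary note in this PR. The helper names and the payload keys (which mirror the parameter names) are assumptions made for illustration, not the code in `server.py`.

```python
from typing import Any, Dict, Optional, Tuple


def build_scrape_request(
    website_url: str, render_heavy_js: Optional[bool] = None
) -> Tuple[str, Dict[str, Any]]:
    """Return a (path, payload) pair for a basic scrape call (hypothetical helper)."""
    # Assumption: the payload keys mirror the method's parameter names.
    payload: Dict[str, Any] = {"website_url": website_url}
    # Only send the optional flag when the caller actually set it.
    if render_heavy_js is not None:
        payload["render_heavy_js"] = render_heavy_js
    return "/scrape", payload


def build_sitemap_request(website_url: str) -> Tuple[str, Dict[str, Any]]:
    """Return a (path, payload) pair for a sitemap extraction call (hypothetical helper)."""
    return "/sitemap", {"website_url": website_url}
```

Leaving unset optional fields out of the payload lets the server apply its own defaults. Whether the real client does this is not stated in the PR.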
Note

Adds `scrape`, `sitemap`, and `agentic_scrapper` client methods and MCP tools with flexible input handling, and increases the HTTP client timeout.

- Client (`src/scrapegraph_mcp/server.py`):
  - `ScapeGraphClient.scrape(website_url, render_heavy_js?)` -> POST `/scrape`.
  - `ScapeGraphClient.sitemap(website_url)` -> POST `/sitemap`.
  - `ScapeGraphClient.agentic_scrapper(url, user_prompt?, output_schema?, steps?, ai_extraction?, persistent_session?, timeout_seconds?)` -> POST `/agentic-scrapper` (supports per-request timeout).
- MCP tools:
  - `scrape(website_url, render_heavy_js?)` and `sitemap(website_url)` wrappers with HTTP error handling.
  - `agentic_scrapper(...)` wrapper with input normalization (accepts `steps` as string/list and `output_schema` as dict/JSON string) and robust error/timeout handling.
- HTTP client timeout raised from `60s` to `httpx.Timeout(120s)`.
- Imports `json` and extended typing (`Optional`, `List`, `Union`) to support the new features.

Written by Cursor Bugbot for commit c595975. This will update automatically on new commits.
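The input normalization described for the `agentic_scrapper` wrapper could look roughly like the sketch below. The note only says that `steps` may be a string or a list and `output_schema` a dict or a JSON string. The function names, the rule for splitting a string into steps (a JSON array if it parses as one, otherwise one step per line), and the choice to raise `ValueError` are all assumptions, not the PR's actual behavior.

```python
import json
from typing import Any, Dict, List, Optional, Union


def normalize_steps(steps: Optional[Union[str, List[str]]]) -> Optional[List[str]]:
    """Coerce `steps` into a list of non-empty strings (hypothetical helper)."""
    if steps is None:
        return None
    if isinstance(steps, list):
        return [str(s).strip() for s in steps if str(s).strip()]
    text = steps.strip()
    if not text:
        return None
    # Assumption: a string that looks like a JSON array is parsed as one.
    if text.startswith("["):
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            return [str(s).strip() for s in parsed if str(s).strip()]
    # Assumption: otherwise each non-empty line is treated as one step.
    return [line.strip() for line in text.splitlines() if line.strip()]


def normalize_output_schema(
    schema: Optional[Union[str, Dict[str, Any]]]
) -> Optional[Dict[str, Any]]:
    """Coerce `output_schema` into a dict, parsing JSON strings (hypothetical helper)."""
    if schema is None:
        return None
    if isinstance(schema, dict):
        return schema
    try:
        parsed = json.loads(schema)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output_schema is not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("output_schema must decode to a JSON object")
    return parsed
```

Normalizing at the tool boundary means the client method only ever sees one shape per argument, which keeps the request-building code simple.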