Return plan for extraction #389
Comments
This is really interesting; definitely something we'll take a look at. Adding @seanmcguire12 here, who's spent considerable time making extract better.
Would be nice if there was some sort of fallback: the first time the LLM is used, the second time it uses the pre-determined path, but if that fails it tries the LLM again. That would protect against DOM changes breaking the pre-determined path.
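A minimal sketch of that fallback, assuming a cached plan keyed by instruction (`extractWithFallback` and `llmExtract` are invented names here, not Stagehand APIs):

```ts
import type { Page } from "playwright";

// Stand-in for the LLM-backed path. Stagehand does not expose this exact
// signature; it is assumed so the fallback logic has something to call.
declare function llmExtract<T>(
  page: Page,
  instruction: string
): Promise<{ result: T; plan: (page: Page) => Promise<T> }>;

// First run: the LLM derives the plan and we cache it. Later runs: replay
// the plan with no LLM call. A replay failure evicts the plan and retries
// via the LLM, protecting against DOM changes breaking the cached path.
async function extractWithFallback<T>(
  page: Page,
  instruction: string,
  cache: Map<string, (page: Page) => Promise<T>>
): Promise<T> {
  const cached = cache.get(instruction);
  if (cached) {
    try {
      return await cached(page); // pre-determined path, no LLM
    } catch {
      cache.delete(instruction); // DOM probably changed; rebuild below
    }
  }
  const { result, plan } = await llmExtract<T>(page, instruction);
  cache.set(instruction, plan);
  return result;
}
```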
@danthegoodman1 Thank you for adding this issue! I think you bring up a great point, & we've been chatting about this internally as of late. I think the tough part is finding a way to balance resilience to website changes with the cost of re-processing the DOM on every extract call.

One possible approach is hashing the state of the DOM, then re-hashing it and comparing it with the most recent hash to detect whether it has changed... If it has changed, we proceed with re-processing the DOM from scratch as usual. Afaik, the core challenge of this approach is generalizing a hashing function that focuses specifically on the parts of the DOM that are relevant to the extraction task. E.g., if a webpage has elements a, b, c, x, y, z, and you need to extract data from elements x, y, and z, we don't really care if a, b, and c change... but how do we build a hashing function such that the hash only changes if the relevant elements/candidate elements change?

Again, thanks for calling this out -- reducing cost & increasing accuracy are two things that are very high on our priority list right now. Would love to hear any additional thoughts/opinions you may have on tackling this issue.
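One way that hashing idea could be sketched -- the selector list would presumably come from the plan the LLM produced, and every name below is illustrative:

```ts
import { createHash } from "node:crypto";
import type { Page } from "playwright";

// Hash only the subtrees the extraction actually depends on, so unrelated
// DOM churn (elements a, b, c above) never invalidates the cached plan.
async function hashRelevantDom(page: Page, selectors: string[]): Promise<string> {
  const hash = createHash("sha256");
  for (const selector of selectors) {
    const html = await page
      .locator(selector)
      .evaluateAll((els) => els.map((el) => el.outerHTML).join(""));
    hash.update(html);
  }
  return hash.digest("hex");
}

// If this digest matches the one stored alongside the cached plan, reuse the
// plan; otherwise re-process the DOM from scratch as usual.
```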
@drewbaker thank you for your comment!! To echo what I mentioned above: definitely keen on the idea of reuse and reducing calls to the LLM. Can you explain more about what you mean by a pre-determined path here? Dynamically evaluating whether extraction has failed is definitely challenging to generalize beyond a specific website where expected values are known... i.e., how do we know when we should re-process the DOM/feed it back into the LLM?
I mean the “static code/parsing” that OP is referring to. Perhaps the “has failed” state can be determined by a function. For example, our use case is to log in, download a statement, and return that statement as JSON. So if the function fails to return JSON of a certain shape, then we know that it failed. For example: extract("find the price of pad thai", (result) => {
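The rest of that snippet was cut off in the page capture; a minimal guess at how the shape check finishes (the callback contract is hypothetical, not an existing Stagehand API):

```ts
declare function extract(
  instruction: string,
  isValid: (result: { price?: number }) => boolean // invented signature
): Promise<unknown>;

// Return true when the result looks valid; returning false signals the
// "has failed" state described above.
await extract("find the price of pad thai", (result) => {
  return typeof result.price === "number" && result.price > 0;
});
```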
@drewbaker do you suggest that a user of Stagehand would themselves define what "a failed state of extract" means for their particular use case?
This is what I would expect. A naive version might be giving it methods:
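Something like this, maybe -- every option name below is invented, none of it is current Stagehand API:

```ts
declare function extract<T>(
  instruction: string,
  opts: { isValid: (r: T) => boolean; onFailure: () => void }
): Promise<T>;

const dish = await extract<{ price: number }>("find the price of pad thai", {
  // The user-defined "failed state" check from the comments above.
  isValid: (r) => typeof r.price === "number",
  onFailure: () => {
    // e.g. drop the cached non-LLM path and retry with the LLM
  },
});
```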
Or they could operate independently, but if you have 1,000 instances running that could get pretty chaotic. Maybe that's a separate problem domain, though.
Yes, in code I'd write something like this:
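The code block that followed was lost when this page was captured; a plausible reconstruction of the statement-as-JSON check described earlier (zod is used purely for illustration):

```ts
import { z } from "zod";

declare function extract(instruction: string): Promise<unknown>;

// Expected shape of the downloaded statement; a parse failure is the
// user-defined "extract failed" signal.
const Statement = z.object({
  accountId: z.string(),
  balance: z.number(),
  transactions: z.array(z.object({ date: z.string(), amount: z.number() })),
});

const raw = await extract("log in and download the latest statement as JSON");
const parsed = Statement.safeParse(raw);
if (!parsed.success) {
  // Failed state: wrong shape, so re-run extract with the LLM enabled
  // instead of replaying the cached pre-determined path.
}
```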
An alternative idea would be to have a new function that returns the non-LLM recipe that the LLM found.
And the returned result would be a Playwright-generated function that could be fed into later runs, so repeat extractions skip the LLM entirely.
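Putting that together as a hypothetical API shape (`extractPlan` and `plan.run` are invented names):

```ts
import type { Page } from "playwright";

// Invented shape for the "return the recipe" idea; not Stagehand today.
interface ExtractPlan<T> {
  run(page: Page): Promise<T>; // pure Playwright replay, no LLM involved
}
declare function extractPlan<T>(
  page: Page,
  instruction: string
): Promise<{ result: T; plan: ExtractPlan<T> }>;
declare const firstPage: Page, nextPage: Page;

// The first call pays the LLM cost once and hands back the recipe...
const { result, plan } = await extractPlan<{ description: string }>(
  firstPage,
  "extract the description of the GitHub repo"
);

// ...then the recipe replays across the other 100 pages without the LLM.
// If plan.run throws (the DOM changed), fall back to a fresh LLM extract.
const next = await plan.run(nextPage);
```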
The extract functionality is really cool, but it would be even cooler if it could export the static code/parsing it performed so it can be reused across many instances without wastefully invoking the LLM again.
For example, in the provided Loom video you extract the description of a GitHub repo. If you had to do that for 100 more repos, it would be wasteful to have the AI work it out 100 times when it could figure it out once and let the developer repeat that action on many pages.