throw Exception if natural results is not found #59

Open
thebennos opened this issue Apr 23, 2017 · 3 comments

@thebennos

The natural results are the essential part of the result page.
I think it would be good to run a basic check on each request to verify that the natural results are found.

If Google changes its HTML and the organic results can no longer be found, an exception would be thrown.

@gsouf
Member

gsouf commented Apr 23, 2017

@thebennos I understand the goal, but that's a very opinionated feature and it might cause undesirable behavior in the library.

I think the simplest solution for the moment would be to do this check yourself when getting the results back from the GoogleClient. You can then handle it the way you want. For instance, you can stop processing the results if you detect that there are fewer than 9 results.
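
A minimal sketch of such a caller-side check, written in Python purely for illustration. It is not the library's API: `require_min_results`, `TooFewResultsError`, and the shape of `results` (any sized collection of already-parsed organic results) are assumptions made for this example.

```python
class TooFewResultsError(Exception):
    """Raised when a parsed page yields fewer organic results than expected."""


def require_min_results(results, minimum=9):
    """Return `results` unchanged, or raise if there are fewer than `minimum`.

    `results` is assumed to be any sized collection of parsed organic results.
    The default of 9 mirrors the threshold mentioned above; it is a judgment
    call, because a page can legitimately contain fewer than 10 results.
    """
    count = len(results)
    if count < minimum:
        raise TooFewResultsError(
            f"expected at least {minimum} organic results, got {count}"
        )
    return results
```

Keeping the check outside the library leaves the threshold, and the decision about what to do on failure, in the caller's hands.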

In any case, this could be an optional feature that can be activated, but several things have to be taken into consideration, because depending on the elements present on the page, Google might return fewer than 10 results. That's worth some thought for the next releases.

In addition, I can let you know that it's on the internal to-do list to create a small application that continuously parses Google SERPs in order to detect a change as soon as it appears. I cannot give a date for this feature because it's only an idea for the moment.

@thebennos
Author

Hey,

That's not exactly what I had in mind.
I don't think a full parsing of the organic results is needed.

What I had in mind is to check an XPath, a CSS ID, or a class to make sure that the organic results can be parsed, and to avoid problems like the one we had in
#56

where a small change caused suspicious problems.

A small check that throws an exception when it fails would make such failures clearer.
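
To illustrate the kind of check I mean, here is a rough sketch in Python using only the standard library. Everything in it is hypothetical: the function and exception names, and the `results` ID and `g` class used in the usage lines, are placeholders rather than the library's code or Google's actual markup.

```python
from html.parser import HTMLParser


class MarkerNotFoundError(Exception):
    """Raised when the expected results marker is missing from the page."""


class _MarkerFinder(HTMLParser):
    """Scans start tags for a given id or CSS class (both hypothetical)."""

    def __init__(self, element_id=None, css_class=None):
        super().__init__()
        self.element_id = element_id
        self.css_class = css_class
        self.found = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.element_id is not None and attributes.get("id") == self.element_id:
            self.found = True
        # A valueless attribute is reported as None, hence the `or ""`.
        classes = (attributes.get("class") or "").split()
        if self.css_class is not None and self.css_class in classes:
            self.found = True


def assert_marker_present(html, element_id=None, css_class=None):
    """Raise MarkerNotFoundError unless the id or class appears in `html`."""
    finder = _MarkerFinder(element_id, css_class)
    finder.feed(html)
    if not finder.found:
        raise MarkerNotFoundError(
            f"marker not found (id={element_id!r}, class={css_class!r})"
        )


# Usage with placeholder markup: passes silently when the marker exists.
assert_marker_present('<div id="results"><div class="g"></div></div>', element_id="results")
```

The point is only that the check is cheap and fails loudly before any result parsing starts.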

@gsouf
Member

gsouf commented Apr 23, 2017

@thebennos, I'm not sure I understand your proposal. Can you elaborate, please?

A few checks are already done to determine whether the page is parseable. The issue in #56 was that, after the Google change, the parser could not parse some results and treated them as non-result items. The point with Google changes is that anything can change: the structure, a class, an ID, really anything...

That being said, I think it's possible to add some rules based on experience with the library.

gsouf added this to the v0.3.0 milestone Apr 25, 2017
gsouf self-assigned this Apr 25, 2017
gsouf modified the milestones: v0.2.1, v0.3.0 May 2, 2017