Restructure crawlers #47

Merged
merged 11 commits into from
Jan 16, 2023

Conversation

tweska
Collaborator

@tweska tweska commented Jun 25, 2022

I wrote a decorator to automatically register crawler functions and slightly simplified the crawler code and data structure. I believe this will make adding new crawlers to Argostimè easier for potential new contributors.

This pull request will also close #7 since you can disable shops using the configuration file.

I am currently writing documentation on how to write a new crawler for this version of Argostimè; once that is done, we can tick off #32 as well.

@tweska tweska requested a review from m-rtijn June 25, 2022 16:33
Owner

@m-rtijn m-rtijn left a comment

Looks good. I haven't run it myself yet, but if it works for you it should be fine. What is a bit unclear to me is when enabled_shops actually gets populated: just once at startup, I assume? I can't really find a clear call that does it.

@tweska
Collaborator Author

tweska commented Jun 26, 2022

> Looks good. I haven't run it myself yet, but if it works for you it should be fine.

Famous last words...

> What is a bit unclear to me is when enabled_shops actually gets populated: just once at startup, I assume? I can't really find a clear call that does it.

Yes, once at startup. More precisely, when you do this import:

from argostime.crawler.shop import *

you import everything listed in __all__, which is defined in argostime.crawler.shop.__init__.py. Executing this import also runs the @register_crawler(name: str, host: str) decorators. That decorator is defined in argostime.crawler.crawl_utils.py and creates the entry in enabled_shops (unless the shop is disabled).
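
For readers unfamiliar with the pattern, here is a minimal sketch of a registering decorator. It is not the code from this pull request: only the names register_crawler and enabled_shops and the (name, host) signature come from the comment above. The DISABLED_SHOPS set (a stand-in for the configuration file), the shape of each enabled_shops entry, and the two example crawlers are assumptions made for illustration.

```python
from typing import Callable, Dict, Set

# Hypothetical stand-in for the configuration file: shop names to skip.
DISABLED_SHOPS: Set[str] = {"steam"}

# Maps a hostname to its registered crawler entry (entry shape is assumed).
enabled_shops: Dict[str, dict] = {}

CrawlerFunc = Callable[[str], dict]


def register_crawler(name: str, host: str) -> Callable[[CrawlerFunc], CrawlerFunc]:
    """Return a decorator that registers a crawler function for `host`."""

    def decorator(func: CrawlerFunc) -> CrawlerFunc:
        # This body runs when the decorated function is defined, i.e. at
        # import time of the module that contains the crawler.
        if name.lower() not in DISABLED_SHOPS:
            enabled_shops[host] = {"name": name, "crawler": func}
        return func  # the function itself is left unchanged

    return decorator


@register_crawler("Hema", "hema.nl")
def crawl_hema(url: str) -> dict:
    # Placeholder result; a real crawler would fetch and parse the page.
    return {"url": url, "price": 1.0}


@register_crawler("Steam", "store.steampowered.com")
def crawl_steam(url: str) -> dict:
    # Defined, but never registered because "steam" is disabled above.
    return {"url": url, "price": 2.0}
```

Because the decorator body runs when each crawler module is imported, importing the shop package is enough to fill the registry; no explicit registration call is needed, which is why no such call shows up in the diff.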

Fix intergamma crawler, could not be called previously

Set sale flag in ah crawler for discounted products

Convert price to float in hema crawler

Set sale flag in steam crawler for discounted products
@tweska tweska force-pushed the crawler-restructure branch from 522e46f to f19d4c6 Compare June 26, 2022 12:48
@tweska tweska requested a review from m-rtijn June 26, 2022 12:49
@tweska tweska requested a review from m-rtijn June 27, 2022 19:08
@tweska tweska requested a review from m-rtijn November 30, 2022 20:00
Owner

@m-rtijn m-rtijn left a comment

I have added one minor nitpick. I think it's OK to merge except for one issue: the logger does not seem to work at all on this branch in my testing environment, which is rather odd since you didn't seem to change anything regarding the logger. Is it working for you @tweska?

Logger must be configured before it is called anywhere in the project.
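
One way a logger can silently stop honouring its configuration is shown in the sketch below. It assumes the standard library logging module and is not taken from this branch; whether this is exactly what happened here is a guess. If a module-level call such as logging.warning() runs before logging.basicConfig(), for example during an import, the root logger receives a default handler and the later basicConfig() call does nothing. The helper reset_root_logger is hypothetical and exists only to make the two cases comparable.

```python
import logging


def reset_root_logger() -> None:
    """Hypothetical helper: return the root logger to an unconfigured state."""
    root = logging.getLogger()
    for handler in list(root.handlers):
        root.removeHandler(handler)
    root.setLevel(logging.WARNING)


# Case 1: something logs first, e.g. code executed at import time.
reset_root_logger()
logging.warning("logged during import")    # implicitly calls basicConfig()
logging.basicConfig(level=logging.DEBUG)   # ignored: root already has a handler
late_level = logging.getLogger().level     # still logging.WARNING

# Case 2: configuration happens before any logging call.
reset_root_logger()
logging.basicConfig(level=logging.DEBUG)   # takes effect
logging.warning("logged after configuration")
early_level = logging.getLogger().level    # logging.DEBUG
```

Passing force=True to basicConfig() (Python 3.8+) replaces existing handlers, but configuring the logger before importing modules that log is usually the simpler fix.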
@m-rtijn m-rtijn merged commit 826a3bc into m-rtijn:master Jan 16, 2023
@tweska tweska deleted the crawler-restructure branch January 16, 2023 22:10
Development

Successfully merging this pull request may close these issues.

Read enabled stores from configuration file
2 participants