After running scrapy crawler woaidu, it hangs and does nothing #14

MRLuowen · 2014-07-10T08:34:43Z

/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:12: ScrapyDeprecationWarning: woaidu_crawler.spiders.woaidu_detail_spider.WoaiduSpider inherits from deprecated class scrapy.spider.BaseSpider, please inherit from scrapy.spider.Spider. (warning only on first subclass, there may be others)
class WoaiduSpider(BaseSpider):
/usr/local/lib/python2.7/dist-packages/scrapy/contrib/pipeline/__init__.py:21: ScrapyDeprecationWarning: ITEM_PIPELINES defined as a list or a set is deprecated, switch to a dict
category=ScrapyDeprecationWarning, stacklevel=1)
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:19: ScrapyDeprecationWarning: scrapy.selector.HtmlXPathSelector is deprecated, instantiate scrapy.Selector instead.
response_selector = HtmlXPathSelector(response)
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:20: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
next_link = list_first_item(response_selector.select(u'//div[@class="k2"]/div/a[text()="下一页"]/@href').extract())
/usr/local/lib/python2.7/dist-packages/scrapy/selector/unified.py:106: ScrapyDeprecationWarning: scrapy.selector.HtmlXPathSelector is deprecated, instantiate scrapy.Selector instead.
for x in result]
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:25: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
for detail_link in response_selector.select(u'//div[contains(@class,"sousuolist")]/a/@href').extract():
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:33: ScrapyDeprecationWarning: scrapy.selector.HtmlXPathSelector is deprecated, instantiate scrapy.Selector instead.
response_selector = HtmlXPathSelector(response)
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:34: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
woaidu_item['book_name'] = list_first_item(response_selector.select('//div[@class="zizida"][1]/text()').extract())
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:35: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
woaidu_item['author'] = [list_first_item(response_selector.select('//div[@class="xiaoxiao"][1]/text()').extract())[5:].strip(),]
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:36: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
woaidu_item['book_description'] = list_first_item(response_selector.select('//div[@class="lili"][1]/text()').extract()).strip()
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:37: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
woaidu_item['book_covor_image_url'] = list_first_item(response_selector.select('//div[@class="hong"][1]/img/@src').extract())
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:40: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
for i in response_selector.select('//div[contains(@class,"xiazai_xiao")]')[1:]:
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:46: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
list_first_item(i.select('./div')[0].select('./a/@href').extract()),
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:47: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
list_first_item(i.select('./div')[1].select('./a/@href').extract())
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:52: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
download_item['progress'] = list_first_item(i.select('./div')[2].select('./text()').extract())
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:53: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
download_item['update_time'] = list_first_item(i.select('./div')[3].select('./text()').extract())
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:56: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
list_first_item(i.select('./div')[4].select('./a/text()').extract()),
/home/lw/distribute_crawler-master/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:57: ScrapyDeprecationWarning: Call to deprecated function select. Use .xpath() instead.
list_first_item(i.select('./div')[4].select('./a/@href').extract())\
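
The messages in the log above are deprecation warnings, so they are probably not the cause of the hang by themselves. For the ITEM_PIPELINES warning, a minimal sketch of switching from the list form to the dict form might look like the following. The pipeline paths and the `pipelines_to_dict` helper are hypothetical, used only for illustration; the project's real pipeline names are not shown in this thread.

```python
def pipelines_to_dict(pipelines, start=100, step=100):
    """Map an ordered list of pipeline class paths to the dict form.

    Scrapy runs pipelines in ascending order of the integer value,
    so increasing values keep the original list order.
    """
    return {path: start + i * step for i, path in enumerate(pipelines)}


# Hypothetical pipeline paths, for illustration only.
OLD_ITEM_PIPELINES = [
    "myproject.pipelines.FirstPipeline",
    "myproject.pipelines.SecondPipeline",
]

# Dict form expected by newer Scrapy versions in settings.py.
ITEM_PIPELINES = pipelines_to_dict(OLD_ITEM_PIPELINES)
```

Writing the dict by hand in settings.py works just as well; the helper only makes the ordering rule explicit.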

TylerzhangZC · 2015-03-15T07:50:10Z

How did you end up solving this? Is there a fix?

zhuang1992 · 2015-03-18T21:22:14Z

I have the same problem.

eyrelzy · 2015-03-18T21:24:22Z

iders.woaidu_detail_spider.WoaiduSpider inherits from deprecated class scrapy.spider.BaseSpider, please inherit from scrapy.spider.Spider. (warning only on first subclass, there may be others)
class WoaiduSpider(BaseSpider):
It gets stuck here and stops executing. Is there a solution?

TylerzhangZC · 2015-03-20T00:15:14Z

Follow this changelist and sync the code; it should then work normally:
https://github.com/gnemoug/distribute_crawler/pull/5/files

georgezouq · 2016-07-21T07:03:19Z

@TylerzhangZC I switched to branch pr/5 and ran it, but it still fails with this error:

Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/Library/Python/2.7/site-packages/scrapy/commands/crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 153, in crawl
    d = crawler.crawl(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/twisted/internet/defer.py", line 1274, in unwindGenerator
    return _inlineCallbacks(None, gen, Deferred())
--- <exception caught here> ---
  File "/Library/Python/2.7/site-packages/twisted/internet/defer.py", line 1128, in _inlineCallbacks
    result = g.send(result)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 71, in crawl
    self.engine = self._create_engine()
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 83, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 69, in __init__
    self.scraper = Scraper(crawler)
  File "/Library/Python/2.7/site-packages/scrapy/core/scraper.py", line 70, in __init__
    self.itemproc = itemproc_cls.from_crawler(crawler)
  File "/Library/Python/2.7/site-packages/scrapy/middleware.py", line 56, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/Library/Python/2.7/site-packages/scrapy/middleware.py", line 32, in from_settings
    mwcls = load_object(clspath)
  File "/Library/Python/2.7/site-packages/scrapy/utils/misc.py", line 44, in load_object
    mod = import_module(module)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/Users/georgezou/Documents/Coding/github/distribute_crawler/woaidu_crawler/woaidu_crawler/pipelines/cover_image.py", line 7, in <module>
    from scrapy.contrib.pipeline.images import ImagesPipeline
  File "/Library/Python/2.7/site-packages/scrapy/contrib/pipeline/images.py", line 7, in <module>
    from scrapy.pipelines.images import *
  File "/Library/Python/2.7/site-packages/scrapy/pipelines/images.py", line 15, in <module>
    from PIL import Image
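
The traceback above is cut off right after `from PIL import Image`, so the actual exception is not shown. Assuming the failure is an ImportError because PIL cannot be imported, a quick check like this sketch can confirm or rule that out. `module_available` is a hypothetical helper, not part of Scrapy.

```python
import importlib


def module_available(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Scrapy's images pipeline imports PIL at module load time.
    print("PIL importable: %s" % module_available("PIL"))
```

If this reports that PIL is not importable, installing the Pillow package (`pip install Pillow`) provides the `PIL` module. If PIL is importable, the cause lies elsewhere and the full exception message is needed.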
