
Newbie here again: can arguments be passed to a Scrapy spider at startup?

  •   wersonliu9527 · Sep 16, 2019 · 3436 views
    import scrapy
    
    
    class ExampleSpider(scrapy.Spider):
        name = 'baidu.com'
        allowed_domains = ['www.baidu.com']
    
        # start_urls = ['https://www.baidu.com/']
    
        def __init__(self, key):
            super(ExampleSpider, self).__init__()
            self.key = key
    
        def start_requests(self):
            url = f'https://www.baidu.com/s?wd={self.key}'
    
            yield scrapy.Request(url=url, callback=self.mparse)
    
        def mparse(self, response):
            yield {
                'title': response.xpath('//title/text()').extract_first()
            }
    
    

    Passing the argument this way doesn't seem to work:

    from scrapy.crawler import CrawlerProcess
    
    from test_spider.spiders.example import ExampleSpider
    
    process = CrawlerProcess() 
    
    process.crawl(ExampleSpider(key='ip'))
    process.start()
    
    
    2 replies    2019-09-17 09:30:19 +08:00
    IanPeverell (#1) · Sep 16, 2019 · ❤️ 1
    In this case you can simply use
    process = CrawlerProcess(settings={"key":"ip"})
    and then read the value inside the spider with self.settings.get("key")
    wersonliu9527 (OP, #2) · Sep 17, 2019
    @IanPeverell Thanks