
How do I install Scrapy on Ubuntu 16.04?

I followed the official guide, but got this error message:

The following packages have unmet dependencies: 
scrapy : Depends: python-support (>= 0.90.0) but it is not installable 
     Recommends: python-setuptools but it is not going to be installed 
E: Unable to correct problems, you have held broken packages. 

I then tried sudo apt-get install python-support, only to find that python-support has been removed in Ubuntu 16.04.

Finally, I tried installing python-setuptools, but it seems that would only pull in Python 2:

The following additional packages will be installed: 
libpython-stdlib libpython2.7-minimal libpython2.7-stdlib python 
python-minimal python-pkg-resources python2.7 python2.7-minimal 
Suggested packages: 
python-doc python-tk python-setuptools-doc python2.7-doc binutils 
binfmt-support 
The following NEW packages will be installed: 
libpython-stdlib libpython2.7-minimal libpython2.7-stdlib python 
python-minimal python-pkg-resources python-setuptools python2.7 
python2.7-minimal 

What should I do to get Scrapy working with Python 3 on Ubuntu 16.04? Thanks.

Answer


You should be good with:

apt-get install -y \ 
    python3 \ 
    python-dev \ 
    python3-dev 

# for cryptography 
apt-get install -y \ 
    build-essential \ 
    libssl-dev \ 
    libffi-dev 

# for lxml 
apt-get install -y \ 
    libxml2-dev \ 
    libxslt-dev 

# install pip 
apt-get install -y python-pip 
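
Note that python-pip above is Python 2's pip. For a system-wide, Python 3-only setup, a sketch using Xenial's python3-pip package instead (this variant is not part of the recipe above):

apt-get install -y python3-pip
pip3 install scrapy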

Here's an example Dockerfile to test installing Scrapy for Python 3 on Ubuntu 16.04/Xenial:

$ cat Dockerfile 
FROM ubuntu:xenial 

ENV DEBIAN_FRONTEND noninteractive 

RUN apt-get update 

# Install Python3 and dev headers 
RUN apt-get install -y \ 
    python3 \ 
    python-dev \ 
    python3-dev 

# Install cryptography 
RUN apt-get install -y \ 
    build-essential \ 
    libssl-dev \ 
    libffi-dev 

# install lxml 
RUN apt-get install -y \ 
    libxml2-dev \ 
    libxslt-dev 

# install pip 
RUN apt-get install -y python-pip 

RUN useradd --create-home --shell /bin/bash scrapyuser 

USER scrapyuser 
WORKDIR /home/scrapyuser 

Then, after building the Docker image and running a container with:

$ sudo docker build -t redapple/scrapy-ubuntu-xenial . 
$ sudo docker run -t -i redapple/scrapy-ubuntu-xenial 

you can run pip install scrapy.
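
Since the container runs as the non-root scrapyuser, installing outside a virtualenv needs pip's --user switch; a minimal sketch (the virtualenv route shown below avoids this):

# --user installs into ~/.local instead of the system site-packages
pip install --user scrapy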

Below I'm using virtualenvwrapper to create a Python 3 virtualenv:

scrapyuser@container:~$ pip install --user virtualenvwrapper
Collecting virtualenvwrapper 
    Downloading virtualenvwrapper-4.7.1-py2.py3-none-any.whl 
Collecting virtualenv-clone (from virtualenvwrapper) 
    Downloading virtualenv-clone-0.2.6.tar.gz 
Collecting stevedore (from virtualenvwrapper) 
    Downloading stevedore-1.14.0-py2.py3-none-any.whl 
Collecting virtualenv (from virtualenvwrapper) 
    Downloading virtualenv-15.0.2-py2.py3-none-any.whl (1.8MB) 
    100% |################################| 1.8MB 320kB/s 
Collecting pbr>=1.6 (from stevedore->virtualenvwrapper) 
    Downloading pbr-1.10.0-py2.py3-none-any.whl (96kB) 
    100% |################################| 102kB 1.5MB/s 
Collecting six>=1.9.0 (from stevedore->virtualenvwrapper) 
    Downloading six-1.10.0-py2.py3-none-any.whl 
Building wheels for collected packages: virtualenv-clone 
    Running setup.py bdist_wheel for virtualenv-clone ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/24/51/ef/93120d304d240b4b6c2066454250a1626e04f73d34417b956d 
Successfully built virtualenv-clone 
Installing collected packages: virtualenv-clone, pbr, six, stevedore, virtualenv, virtualenvwrapper 
Successfully installed pbr six stevedore virtualenv virtualenv-clone virtualenvwrapper 
You are using pip version 8.1.1, however version 8.1.2 is available. 
You should consider upgrading via the 'pip install --upgrade pip' command. 
scrapyuser@container:~$ source ~/.local/bin/virtualenvwrapper.sh
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/premkproject 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postmkproject 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/initialize 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/premkvirtualenv 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postmkvirtualenv 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/prermvirtualenv 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postrmvirtualenv 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/predeactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postdeactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/preactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/get_env_details 
scrapyuser@container:~$ export PATH=$PATH:/home/scrapyuser/.local/bin
scrapyuser@container:~$ mkvirtualenv --python=/usr/bin/python3 scrapy11.py3
Running virtualenv with interpreter /usr/bin/python3 
Using base prefix '/usr' 
New python executable in /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/python3 
Also creating executable in /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/python 
Installing setuptools, pip, wheel...done. 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/predeactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/postdeactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/preactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/postactivate 
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/get_env_details 

and installing Scrapy 1.1 is simply pip install scrapy:

(scrapy11.py3) scrapyuser@container:~$ pip install scrapy
Collecting scrapy 
    Downloading Scrapy-1.1.0-py2.py3-none-any.whl (294kB) 
    100% |################################| 296kB 1.0MB/s 
Collecting PyDispatcher>=2.0.5 (from scrapy) 
    Downloading PyDispatcher-2.0.5.tar.gz 
Collecting pyOpenSSL (from scrapy) 
    Downloading pyOpenSSL-16.0.0-py2.py3-none-any.whl (45kB) 
    100% |################################| 51kB 1.8MB/s 
Collecting lxml (from scrapy) 
    Downloading lxml-3.6.0.tar.gz (3.7MB) 
    100% |################################| 3.7MB 312kB/s 
Collecting parsel>=0.9.3 (from scrapy) 
    Downloading parsel-1.0.2-py2.py3-none-any.whl 
Collecting six>=1.5.2 (from scrapy) 
    Using cached six-1.10.0-py2.py3-none-any.whl 
Collecting Twisted>=10.0.0 (from scrapy) 
    Downloading Twisted-16.2.0.tar.bz2 (2.9MB) 
    100% |################################| 2.9MB 307kB/s 
Collecting queuelib (from scrapy) 
    Downloading queuelib-1.4.2-py2.py3-none-any.whl 
Collecting cssselect>=0.9 (from scrapy) 
    Downloading cssselect-0.9.1.tar.gz 
Collecting w3lib>=1.14.2 (from scrapy) 
    Downloading w3lib-1.14.2-py2.py3-none-any.whl 
Collecting service-identity (from scrapy) 
    Downloading service_identity-16.0.0-py2.py3-none-any.whl 
Collecting cryptography>=1.3 (from pyOpenSSL->scrapy) 
    Downloading cryptography-1.4.tar.gz (399kB) 
    100% |################################| 409kB 1.1MB/s 
Collecting zope.interface>=4.0.2 (from Twisted>=10.0.0->scrapy) 
    Downloading zope.interface-4.1.3.tar.gz (141kB) 
    100% |################################| 143kB 1.3MB/s 
Collecting attrs (from service-identity->scrapy) 
    Downloading attrs-16.0.0-py2.py3-none-any.whl 
Collecting pyasn1 (from service-identity->scrapy) 
    Downloading pyasn1-0.1.9-py2.py3-none-any.whl 
Collecting pyasn1-modules (from service-identity->scrapy) 
    Downloading pyasn1_modules-0.0.8-py2.py3-none-any.whl 
Collecting idna>=2.0 (from cryptography>=1.3->pyOpenSSL->scrapy) 
    Downloading idna-2.1-py2.py3-none-any.whl (54kB) 
    100% |################################| 61kB 2.0MB/s 
Requirement already satisfied (use --upgrade to upgrade): setuptools>=11.3 in ./.virtualenvs/scrapy11.py3/lib/python3.5/site-packages (from cryptography>=1.3->pyOpenSSL->scrapy) 
Collecting cffi>=1.4.1 (from cryptography>=1.3->pyOpenSSL->scrapy) 
    Downloading cffi-1.6.0.tar.gz (397kB) 
    100% |################################| 399kB 1.1MB/s 
Collecting pycparser (from cffi>=1.4.1->cryptography>=1.3->pyOpenSSL->scrapy) 
    Downloading pycparser-2.14.tar.gz (223kB) 
    100% |################################| 225kB 1.2MB/s 
Building wheels for collected packages: PyDispatcher, lxml, Twisted, cssselect, cryptography, zope.interface, cffi, pycparser 
    Running setup.py bdist_wheel for PyDispatcher ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/86/02/a1/5857c77600a28813aaf0f66d4e4568f50c9f133277a4122411 
    Running setup.py bdist_wheel for lxml ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/6c/eb/a1/e4ff54c99630e3cc6ec659287c4fd88345cd78199923544412 
    Running setup.py bdist_wheel for Twisted ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/fe/9d/3f/9f7b1c768889796c01929abb7cdfa2a9cdd32bae64eb7aa239 
    Running setup.py bdist_wheel for cssselect ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/1b/41/70/480fa9516ccc4853a474faf7a9fb3638338fc99a9255456dd0 
    Running setup.py bdist_wheel for cryptography ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/f6/6c/21/11ec069285a52d7fa8c735be5fc2edfb8b24012c0f78f93d20 
    Running setup.py bdist_wheel for zope.interface ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/52/04/ad/12c971c57ca6ee5e6d77019c7a1b93105b1460d8c2db6e4ef1 
    Running setup.py bdist_wheel for cffi ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/8f/00/29/553c1b1db38bbeec3fec428ae4e400cd8349ecd99fe86edea1 
    Running setup.py bdist_wheel for pycparser ... done 
    Stored in directory: /home/scrapyuser/.cache/pip/wheels/9b/f4/2e/d03e949a551719a1ffcb659f2c63d8444f4df12e994ce52112 
Successfully built PyDispatcher lxml Twisted cssselect cryptography zope.interface cffi pycparser 
Installing collected packages: PyDispatcher, idna, pyasn1, six, pycparser, cffi, cryptography, pyOpenSSL, lxml, w3lib, cssselect, parsel, zope.interface, Twisted, queuelib, attrs, pyasn1-modules, service-identity, scrapy 
Successfully installed PyDispatcher-2.0.5 Twisted-16.2.0 attrs-16.0.0 cffi-1.6.0 cryptography-1.4 cssselect-0.9.1 idna-2.1 lxml-3.6.0 parsel-1.0.2 pyOpenSSL-16.0.0 pyasn1-0.1.9 pyasn1-modules-0.0.8 pycparser-2.14 queuelib-1.4.2 scrapy-1.1.0 service-identity-16.0.0 six-1.10.0 w3lib-1.14.2 zope.interface-4.1.3 
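
At this point a quick sanity check confirms the install; scrapy version -v is a standard Scrapy subcommand that reports the versions of Scrapy and its main dependencies (output omitted here):

(scrapy11.py3) scrapyuser@container:~$ scrapy version -v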

And finally, testing with an example project:

(scrapy11.py3) scrapyuser@container:~$ scrapy startproject tutorial
New Scrapy project 'tutorial', using template directory '/home/scrapyuser/.virtualenvs/scrapy11.py3/lib/python3.5/site-packages/scrapy/templates/project', created in: 
    /home/scrapyuser/tutorial 

You can start your first spider with: 
    cd tutorial 
    scrapy genspider example example.com 
(scrapy11.py3) scrapyuser@container:~$ cd tutorial
(scrapy11.py3) scrapyuser@container:~/tutorial$ scrapy genspider example example.com
Created spider 'example' using template 'basic' in module: 
    tutorial.spiders.example 
(scrapy11.py3) scrapyuser@container:~/tutorial$ cat tutorial/spiders/example.py
# -*- coding: utf-8 -*- 
import scrapy 


class ExampleSpider(scrapy.Spider): 
    name = "example" 
    allowed_domains = ["example.com"] 
    start_urls = (
        'http://www.example.com/',
    )

    def parse(self, response):
        pass
(scrapy11.py3) scrapyuser@container:~/tutorial$ scrapy crawl example
2016-06-07 11:08:27 [scrapy] INFO: Scrapy 1.1.0 started (bot: tutorial) 
2016-06-07 11:08:27 [scrapy] INFO: Overridden settings: {'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial', 'ROBOTSTXT_OBEY': True, 'NEWSPIDER_MODULE': 'tutorial.spiders'} 
2016-06-07 11:08:27 [scrapy] INFO: Enabled extensions: 
['scrapy.extensions.logstats.LogStats', 'scrapy.extensions.corestats.CoreStats'] 
2016-06-07 11:08:27 [scrapy] INFO: Enabled downloader middlewares: 
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware', 
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware', 
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware', 
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware', 
'scrapy.downloadermiddlewares.retry.RetryMiddleware', 
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware', 
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware', 
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware', 
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware', 
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware', 
'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware', 
'scrapy.downloadermiddlewares.stats.DownloaderStats'] 
2016-06-07 11:08:27 [scrapy] INFO: Enabled spider middlewares: 
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware', 
'scrapy.spidermiddlewares.referer.RefererMiddleware', 
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 
'scrapy.spidermiddlewares.depth.DepthMiddleware'] 
2016-06-07 11:08:27 [scrapy] INFO: Enabled item pipelines: 
[] 
2016-06-07 11:08:27 [scrapy] INFO: Spider opened 
2016-06-07 11:08:28 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) 
2016-06-07 11:08:28 [scrapy] DEBUG: Crawled (404) <GET http://www.example.com/robots.txt> (referer: None) 
2016-06-07 11:08:28 [scrapy] DEBUG: Crawled (200) <GET http://www.example.com/> (referer: None) 
2016-06-07 11:08:28 [scrapy] INFO: Closing spider (finished) 
2016-06-07 11:08:28 [scrapy] INFO: Dumping Scrapy stats: 
{'downloader/request_bytes': 436, 
'downloader/request_count': 2, 
'downloader/request_method_count/GET': 2, 
'downloader/response_bytes': 1921, 
'downloader/response_count': 2, 
'downloader/response_status_count/200': 1, 
'downloader/response_status_count/404': 1, 
'finish_reason': 'finished', 
'finish_time': datetime.datetime(2016, 6, 7, 11, 8, 28, 614605), 
'log_count/DEBUG': 2, 
'log_count/INFO': 7, 
'response_received_count': 2, 
'scheduler/dequeued': 1, 
'scheduler/dequeued/memory': 1, 
'scheduler/enqueued': 1, 
'scheduler/enqueued/memory': 1, 
'start_time': datetime.datetime(2016, 6, 7, 11, 8, 28, 24624)} 
2016-06-07 11:08:28 [scrapy] INFO: Spider closed (finished) 
(scrapy11.py3) scrapyuser@container:~/tutorial$
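
The crawl finishes cleanly even though the stub spider scrapes nothing yet. Once parse yields items, the same crawl can write them out with Scrapy's built-in feed exports; the -o switch is standard, the filename just an example:

(scrapy11.py3) scrapyuser@container:~/tutorial$ scrapy crawl example -o items.json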

Thanks Paul. I followed your steps (but without Docker and virtualenv) and the installation succeeded. However, my default python has become 2.7.11, where it used to be 3.5.1. Can this be fixed? – Harrison
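
This is most likely because python-dev pulls in Python 2, and on Ubuntu /usr/bin/python always points at a 2.x interpreter; Python 3.5 remains available as python3. A quick check (plain shell, nothing Scrapy-specific):

which python python3
python --version && python3 --version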


I did get the 'Successfully installed ... scrapy ...' message, but when I run scrapy startproject myProject I get an error saying 'The program 'scrapy' is currently not installed. You can install it by typing: sudo apt install python-scrapy'. – Harrison


The second problem was solved by sudo pip install scrapy. – Harrison
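
For reference: if scrapy was first installed with pip install --user, the console script lands in ~/.local/bin, which is often not on PATH (hence the 'currently not installed' message above). A sketch of a quick check before resorting to sudo pip:

# see whether the script exists but simply isn't on PATH
ls ~/.local/bin/scrapy
export PATH="$HOME/.local/bin:$PATH"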