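Displaying scraped results in a Django template

I'm building a scraping site with Django. For some reason the code below only gives me a single image; I'd like it to output every image, every link, and every price. Any help? (Also, if you know how to put this data into a database model so I don't have to scrape the site every time, I'm all ears, but that may be a separate question.) Cheers!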
Here is the template file:
{% extends "base.html" %}
{% block title %}Boats{% endblock %}
{% block content %}
<img src="{{ fetch_boats }}"/>
{% endblock %}
Here is the views.py file:
#views.py
from django.shortcuts import render_to_response
from django.template.loader import get_template
from django.template import Context
from django.http import Http404, HttpResponse
from fetch_images import fetch_imagery
def fetch_it(request):
    fi = fetch_imagery()
    return render_to_response('fetch_image.html', {'fetch_boats' : fi})
Here is the fetch_images module:
#fetch_images.py
from BeautifulSoup import BeautifulSoup
import re
import urllib2
def fetch_imagery():
    response = urllib2.urlopen("http://www.boattrader.com/search-results/Type")
    html = response.read()
    #create a beautiful soup object
    soup = BeautifulSoup(html)
    #all boat images have attribute height=165
    images = soup.findAll("img",height="165")
    for image in images:
        return image['src'] #print the url of the image only
    # all links to detailed boat information have class lfloat
    links = soup.findAll("a", {"class" : "lfloat"})
    for link in links:
        return link['href']
        #print link.string
    # all prices are spans and have the class rfloat
    prices = soup.findAll("span", { "class" : "rfloat" })
    for price in prices:
        return price
        #print price.string
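A likely reason only one image shows up is that `return` inside the first `for` loop ends the function on its first iteration, so the link and price loops never run. Below is a minimal sketch of one way around that: collect everything into a list and return once at the end. The helper name `collect_boats` and the sample values are hypothetical, plain strings stand in for the values pulled out of the BeautifulSoup results, and it assumes the three lists line up one-to-one (`zip` stops at the shortest).

```python
def collect_boats(image_srcs, link_hrefs, price_texts):
    """Pair up images, links and prices into one list of dicts.

    Hypothetical helper: the inputs are plain lists of strings standing in
    for values extracted from the soup.findAll(...) results.
    """
    boats = []
    # zip stops at the shortest list, so this assumes the lists line up.
    for src, href, price in zip(image_srcs, link_hrefs, price_texts):
        boats.append({"image": src, "link": href, "price": price})
    return boats  # return once, after the loop has finished


if __name__ == "__main__":
    # Made-up sample data for illustration, not real scraped values.
    boats = collect_boats(
        ["/img/boat1.jpg", "/img/boat2.jpg"],
        ["/listing/1", "/listing/2"],
        ["$10,000", "$25,000"],
    )
    for boat in boats:
        print(boat["image"], boat["link"], boat["price"])
```

Inside `fetch_imagery`, the inputs could be built with something like `[image['src'] for image in images]`. The template would then need to loop over the list, for example with `{% for boat in fetch_boats %}`, rather than printing `fetch_boats` as a single value.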
Finally, in case it's needed, the URL mapping in the urlconf is:
from django.conf.urls.defaults import *
from mysite.views import fetch_it
urlpatterns = patterns('', ('^fetch_image/$', fetch_it))
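Thanks RISHABH, I hadn't seen that before (still pretty new at this)... for anyone else unfamiliar with the yield statement, here is a great answer about it: http://stackoverflow.com/questions/231767/can-somebody-explain-me-the-python-yield-statement – Diego 2010-06-04 11:39:42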
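For readers following the comment above, here is a minimal sketch of what a `yield`-based variant might look like. The generator name `iter_srcs` is hypothetical, and plain dicts stand in for BeautifulSoup tags (which also support `tag['src']` lookups), so this runs without any scraping.

```python
def iter_srcs(images):
    """Yield the src of every image instead of returning only the first."""
    for image in images:
        # yield hands back one value and resumes here on the next request,
        # whereas return would end the function immediately.
        yield image["src"]


if __name__ == "__main__":
    # Made-up stand-ins for the tags found by soup.findAll("img", ...).
    sample_images = [{"src": "/img/boat1.jpg"}, {"src": "/img/boat2.jpg"}]
    print(list(iter_srcs(sample_images)))
```

A generator can be passed to the template context directly, though wrapping it in `list(...)` first is the safer choice if the values need to be iterated more than once.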