2015-08-14 37 views
0

How do I write scraped output to a CSV file with separate columns?

import requests 
from bs4 import BeautifulSoup 
import csv 

user_agent = {'User-agent': 'Chrome/43.0.2357.124'} 

output_file= open("City.csv", "w") 

r = requests.get("http://www.bla/paris/", headers=user_agent)
soup = BeautifulSoup(r.content, "html.parser")

g_data = soup.find_all("div", {"class": "itemsContent clearafter"}) 
for item in g_data: 
    Header = item.find_all("div", {"class": "InnprodInfos"}) 
    Header_final = (Header[0].contents[0].text.strip()) 
    price = item.find_all("div", {"class": "prodPrice"}) 
    Price_final = (price[0].contents[0].text.strip()) 
    Deeplink = item.find_all("a") 
    for t in Deeplink: 
        Deeplink_final = (t.get("href"))

    print("Header: " + Header_final + " | " + "Price: " + Price_final + " | " + "Deeplink: " + Deeplink_final) 
    output_file.write("Header: " + Header_final + " | " + "Price: " + Price_final + " | " + "Deeplink: " + Deeplink_final + "\n") 

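I'm able to get my data into a CSV file, but I don't know how to create three dedicated columns for my results. "Header: " + Header_final should be the first column, "Price: " + Price_final the second, and "Deeplink: " + Deeplink_final the last.

Can you help me?

Answers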

0

Just use the csv module. You import it, but you never use it. Its documentation explains how to write rows.
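To illustrate what the csv module does that string concatenation does not, here is a minimal sketch. It writes to an in-memory buffer rather than a file, and the helper name rows_to_csv and the sample values are made up for this example; they are not from the question.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a list of rows to CSV text using csv.writer."""
    # newline="" keeps the buffer from translating the writer's line endings.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data: note the comma inside the first field.
sample = rows_to_csv([
    ["Header", "Price", "Deeplink"],
    ["Hotel A, Paris", "120 EUR", "/paris/hotel-a"],
])

# Reading the text back yields the original three columns per row.
parsed = list(csv.reader(io.StringIO(sample, newline="")))
print(parsed)
```

Each list element becomes its own column, and a field that contains a comma is quoted automatically, so it cannot spill into a neighbouring column the way a hand-built string can.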

0

Before the for loop, add the following to create a CSV writer and write the header row:

writer = csv.writer(output_file)
csv_fields = ['Header', 'Price', 'Deeplink']
# Only write the header row if the page yielded any items.
if g_data:
    writer.writerow(csv_fields)

Then, in the loop body, replace your write statement with this:

writer.writerow([Header_final, Price_final, Deeplink_final]) 
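Putting the two steps together, a sketch of the file-handling side might look like the following. It assumes Python 3, and the scraped values are replaced by hard-coded placeholder rows standing in for Header_final, Price_final and Deeplink_final. The function names, the temporary directory and the sample rows are all assumptions made for this illustration. Unlike the snippet above, it always writes the header row, even when there are no items.

```python
import csv
import os
import tempfile

CSV_FIELDS = ["Header", "Price", "Deeplink"]


def write_city_csv(path, rows):
    """Write the header row followed by one CSV row per scraped item."""
    # newline="" is what the csv docs recommend for files passed to csv.writer;
    # the with block also closes the file, which the original script never did.
    with open(path, "w", newline="", encoding="utf-8") as output_file:
        writer = csv.writer(output_file)
        writer.writerow(CSV_FIELDS)
        writer.writerows(rows)


def read_city_csv(path):
    """Read the file back as a list of rows, to check the column layout."""
    with open(path, newline="", encoding="utf-8") as input_file:
        return list(csv.reader(input_file))


# Placeholder rows in the order (Header_final, Price_final, Deeplink_final).
example_rows = [
    ["Eiffel Tower Tour", "25,00 EUR", "/paris/eiffel-tower-tour"],
    ["Louvre Ticket", "17,00 EUR", "/paris/louvre-ticket"],
]

# A temporary directory is used here so the sketch does not overwrite a real City.csv.
csv_path = os.path.join(tempfile.mkdtemp(), "City.csv")
write_city_csv(csv_path, example_rows)
loaded = read_city_csv(csv_path)
print(loaded)
```

In the real script, the rows would be collected inside the for loop over g_data instead of being hard-coded.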
+0

Thanks a lot for your feedback. Appreciate it :) It works now.