UnicodeEncodeError: 'ascii' codec can't encode character u'\u2730' in position 1: ordinal not in range(128)

Any idea how to fix this? UnicodeEncodeError: 'ascii' codec can't encode character u'\u2730' in position 1: ordinal not in range(128)

import csv 
import re 
import time 
import urllib2 
from urlparse import urljoin 
from bs4 import BeautifulSoup 

BASE_URL = 'http://omaha.craigslist.org/sys/' 
URL = 'http://omaha.craigslist.org/sya/' 
FILENAME = '/Users/mona/python/craigstvs.txt' 

opener = urllib2.build_opener() 
opener.addheaders = [('User-agent', 'Mozilla/5.0')] 
soup = BeautifulSoup(opener.open(URL)) 

with open(FILENAME, 'a') as f: 
    writer = csv.writer(f, delimiter=';') 
    for link in soup.find_all('a', class_=re.compile("hdrlnk")):
        timeset = time.strftime("%m-%d %H:%M")

        item_url = urljoin(BASE_URL, link['href'])
        item_soup = BeautifulSoup(opener.open(item_url))

        # do something with item_soup? or why did you need to follow this link?

        writer.writerow([timeset, link.text, item_url])

Source

2014-09-06 Mona Jalal

Speaking from experience, Python 2's csv module does not fully support Unicode, but you may find it useful to open the file this way:

import codecs 
... 
codecs.open('file.csv', 'r', 'UTF-8')

Alternatively, you may want to write the file yourself instead of using the csv module.
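A minimal sketch of the "handle it yourself" route, assuming each row is a list of unicode strings that contain neither the delimiter nor newlines. `append_row`, the demo path, and the sample values are hypothetical, and `io.open` is used here in place of `codecs.open` (both take an encoding and accept unicode text):

```python
import io
import os
import tempfile


def append_row(path, fields, delimiter=u';'):
    """Append one delimiter-separated line of unicode fields, stored as UTF-8.

    Hypothetical helper: it does no quoting or escaping, so it only suits
    fields that never contain the delimiter or a newline.
    """
    with io.open(path, 'a', encoding='utf-8') as f:
        f.write(delimiter.join(fields) + u'\n')


# Demo with made-up values; the title contains U+2730, the character from the error.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
append_row(demo_path, [u'09-06 10:18', u'\u2730 star in title', u'http://example.com/1'])

# Read the file back as text to confirm the character survived the round trip.
with io.open(demo_path, 'r', encoding='utf-8') as f:
    line = f.read()
```

The trade-off is that you lose the csv module's quoting rules, so this only makes sense for simple, well-behaved fields.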

Source

2014-09-06 10:18:19 mehdy

You just need to encode the text:

link.text.encode("utf-8")
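To see why this helps, here is a small sketch that reproduces the failure without any scraping. The `title` value is a made-up stand-in for `link.text`; the assumption is that Python 2 implicitly falls back to the ASCII codec when a unicode value is written through the csv writer, which is what the traceback in the question suggests:

```python
# Hypothetical listing title with U+2730 at index 1, as in the error message.
title = u' \u2730 Dell server'

try:
    # The ASCII codec cannot represent U+2730, so this raises UnicodeEncodeError.
    title.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly to UTF-8 succeeds and yields a byte string.
encoded = title.encode('utf-8')
```

Passing the UTF-8 byte string to `writer.writerow` avoids the implicit ASCII conversion.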

You can also use requests instead of urllib2:

import csv
import re
import time
from urlparse import urljoin

import requests
from bs4 import BeautifulSoup

BASE_URL = 'http://omaha.craigslist.org/sys/'
URL = 'http://omaha.craigslist.org/sya/'
FILENAME = 'craigstvs.txt'

soup = BeautifulSoup(requests.get(URL).content)

with open(FILENAME, 'a') as f:
    writer = csv.writer(f, delimiter=';')
    for link in soup.find_all('a', class_=re.compile("hdrlnk")):
        timeset = time.strftime("%m-%d %H:%M")
        item_url = urljoin(BASE_URL, link['href'])
        item_soup = BeautifulSoup(requests.get(item_url).content)
        # do something with item_soup? or why did you need to follow this link?
        # Encode the unicode title to UTF-8 bytes before handing it to the csv writer.
        writer.writerow([timeset, link.text.encode("utf-8"), item_url])
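The code in this thread targets Python 2. As a hedged aside, on Python 3 the csv module works with text directly, so the manual `.encode("utf-8")` is not needed if the file is opened with an explicit encoding. The sketch below is Python 3 only and uses made-up row data and a temporary file instead of the real scrape:

```python
import csv
import os
import tempfile

# Hypothetical rows standing in for [timeset, link.text, item_url].
rows = [['09-06 10:30', '\u2730 Dell server', 'http://example.com/1']]

path = os.path.join(tempfile.mkdtemp(), 'craigs_demo.txt')

# Text mode with an explicit encoding; newline='' is what the csv docs recommend.
with open(path, 'a', encoding='utf-8', newline='') as f:
    writer = csv.writer(f, delimiter=';')
    writer.writerows(rows)

# Read the rows back with the same encoding and delimiter.
with open(path, encoding='utf-8', newline='') as f:
    read_back = list(csv.reader(f, delimiter=';'))
```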

Source

2014-09-06 10:30:13
