Retrieving information in Python using urllib and BeautifulSoup

I can fetch an HTML page with urllib and parse HTML with BeautifulSoup, but it looks like I have to write the page to a file first so that BeautifulSoup can read it.

import urllib  # Python 2; in Python 3, urlopen lives in urllib.request

sock = urllib.urlopen("http://SOMEWHERE")
htmlSource = sock.read()
sock.close()
# --> write htmlSource to a file

Is there a way to call BeautifulSoup on what urllib returns without generating a file?

2010-04-15 prosseek

from BeautifulSoup import BeautifulSoup 

soup = BeautifulSoup(htmlSource)

No file writing is needed: just pass in the HTML string. You can also pass the object returned by urlopen directly:

f = urllib.urlopen("http://SOMEWHERE") 
soup = BeautifulSoup(f)
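
For readers on Python 3, the same idea can be sketched roughly as follows. This assumes the BeautifulSoup 4 package (imported as bs4) is installed; the helper names soup_from_url and soup_from_markup and the sample markup are made up for illustration and are not part of either library. An in-memory BytesIO object stands in for the response so the sketch runs without network access.

```python
import io
from urllib.request import urlopen  # Python 3 location of urlopen

from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed


def soup_from_url(url):
    """Hypothetical helper: fetch url and parse it without a temporary file."""
    with urlopen(url) as response:
        # BeautifulSoup accepts an open file-like object as well as a string.
        return BeautifulSoup(response, "html.parser")


def soup_from_markup(markup):
    """Hypothetical helper: parse markup already in memory (str, bytes, or file-like)."""
    return BeautifulSoup(markup, "html.parser")


# Offline demonstration: BytesIO plays the role of the object urlopen returns.
fake_response = io.BytesIO(b"<html><head><title>Example</title></head></html>")
soup = soup_from_markup(fake_response)
print(soup.title.string)
```

Naming the parser explicitly ("html.parser" here) keeps the result from depending on which optional parsers happen to be installed.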

2010-04-15 16:36:10 interjay
