2014-11-06 84 views
2

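Scraping the Dow Jones index from Yahoo Finance with Python

I am trying to scrape the Dow Jones stock index from Yahoo Finance with Beautiful Soup in Python.

This is what I have tried: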

from bs4 import BeautifulSoup 

myurl = "http://finance.yahoo.com/q/cp?s=^DJI" 
soup = BeautifulSoup(html) 

for item in soup: 
    date = row.find('span', 'time_rtq_ticker').text.strip() 
    print date 

Here is the element as shown in the Google Chrome inspector: [screenshot]

How can I scrape just the number 17,555.47 from the span tag?

+1

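Do you want to learn how to scrape with BeautifulSoup, or are you only interested in the data? If you just need the data, I believe their [api](https://code.google.com/p/yahoo-finance-managed/wiki/YahooFinanceAPIs) may be a better source. – nerdwaller 2014-11-06 23:01:22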

+0

I simply want to scrape this one number from Yahoo Finance. Cheers – 2014-11-06 23:03:24

+0

Since you never define the variable 'html', your call "soup = BeautifulSoup(html)" will probably raise an error. Should it read 'myurl' instead of 'html'? – thefragileomen 2014-11-06 23:06:22
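
A note on the comment above: swapping in 'myurl' would silence the NameError but probably would not help, because BeautifulSoup parses whatever string it is handed and does not download anything. The sketch below illustrates this under that assumption; the name soup_from_url is hypothetical and used only for illustration.

```python
import warnings

from bs4 import BeautifulSoup

myurl = "http://finance.yahoo.com/q/cp?s=^DJI"

# Recent bs4 versions warn when the markup looks like a URL;
# silence warnings so the demonstration output stays clean.
warnings.simplefilter("ignore")

# The URL is treated as literal text, not fetched over the network.
soup_from_url = BeautifulSoup(myurl, "html.parser")

# No HTML was parsed, so there is no span to find.
print(soup_from_url.find("span", "time_rtq_ticker"))
```

The page has to be downloaded first (for example with requests or urllib) and the response body passed to BeautifulSoup.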

Answers

3

Just use find; it really is easy, like this:

from bs4 import BeautifulSoup 
import requests 

myurl = "http://finance.yahoo.com/q/cp?s=^DJI" 
# use requests to download the HTML content
html = requests.get(myurl).content
# name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(html, "html.parser")

# you don't need to iterate over the children, just use find;
# attrs={key: value} makes the class match explicit
soup.find('span', attrs={'class': 'time_rtq_ticker'}).text
# result in the answerer's session: u'17,554.47'
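
The same lookup can be tried offline and turned into a number. The following is a minimal sketch, not the answerer's code: SAMPLE_HTML is a made-up stand-in for the downloaded page (the real markup may differ), and extract_quote is a hypothetical helper name. It also guards against find returning None when the span is missing, which would otherwise raise an AttributeError on .text.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the fetched page; the real markup may differ.
SAMPLE_HTML = """
<div>
  <span class="time_rtq_ticker">17,554.47</span>
</div>
"""


def extract_quote(html, css_class="time_rtq_ticker"):
    """Return the quote as a float, or None if no matching span exists."""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find("span", attrs={"class": css_class})
    if span is None:
        return None
    # Drop surrounding whitespace and the thousands separator before converting.
    return float(span.text.strip().replace(",", ""))


print(extract_quote(SAMPLE_HTML))
print(extract_quote("<p>no quote here</p>"))
```

Returning None instead of raising is a design choice for this sketch; a scraper that must not silently miss data might prefer to raise an exception.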
+0

Exactly what I was looking for. Thank you very much! – 2014-11-06 23:12:47

+0

No problem :) – Anzel 2014-11-06 23:13:21