2017-06-12 244 views

wget can't download Yahoo Finance data any more

Since a few weeks ago, I have been unable to download Yahoo Finance data with wget:

$ wget -O GLD.USA_20170612.txt --no-check-certificate http://chart.finance.yahoo.com/table.csv?s=GLD&a=2&b=1&c=2017&d=11&e=30&f=2017&ignore=.csv 
--2017-06-12 12:21:28-- http://gld.usa_20170612.txt/ 
Resolving gld.usa_20170612.txt (gld.usa_20170612.txt)... failed: No address associated with hostname. 
wget: unable to resolve host address ‘gld.usa_20170612.txt’ 
--2017-06-12 12:21:28-- http://chart.finance.yahoo.com/table.csv?s=GLD&a=2&b=1&c=2017&d=11&e=30&f=2017 
Resolving chart.finance.yahoo.com (chart.finance.yahoo.com)... 87.248.114.12, 87.248.116.11, 87.248.116.12, ... 
Connecting to chart.finance.yahoo.com (chart.finance.yahoo.com)|87.248.114.12|:80... connected. 
HTTP request sent, awaiting response... 301 Moved Permanently 
Location: https://chart.finance.yahoo.com/table.csv?s=GLD&a=2&b=1&c=2017&d=11&e=30&f=2017 [following] 
--2017-06-12 12:21:28-- https://chart.finance.yahoo.com/table.csv?s=GLD&a=2&b=1&c=2017&d=11&e=30&f=2017 
Connecting to chart.finance.yahoo.com (chart.finance.yahoo.com)|87.248.114.12|:443... connected. 
HTTP request sent, awaiting response... 404 Not Found 
2017-06-12 12:21:28 ERROR 404: Not Found. 

They have changed something. I picked up a new URL from the website, but it still doesn't work:

$ wget https://query1.finance.yahoo.com/v7/finance/download/GLD?period1=1494584558&period2=1497262958&interval=1d&events=history&crumb=GGHpj6ucgIy 
--2017-06-12 12:27:24-- https://query1.finance.yahoo.com/v7/finance/download/GLD?period1=1494584558 
Resolving query1.finance.yahoo.com (query1.finance.yahoo.com)... 87.248.116.11, 87.248.116.12, 87.248.114.11, ... 
Connecting to query1.finance.yahoo.com (query1.finance.yahoo.com)|87.248.116.11|:443... connected. 
HTTP request sent, awaiting response... 401 Unauthorized 

Username/Password Authentication Failed. 

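I don't understand why it mentions a username and password. When I click the download button on https://uk.finance.yahoo.com/quote/GLD/history?p=GLD, it doesn't ask for a username and password. So it seems possible to download the data without a subscription.

If anyone knows a correct wget invocation for downloading Yahoo Finance data, please share it here.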

...

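Update:

Thanks to the current replies, I gather that cookies may be involved here. Searching for similar questions with the keyword "cookie", I found the following threads: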

Yahoo Finance Historical data downloader url is not working

Yahoo Finance URL not working

Unfortunately, this is too complicated for me... I would appreciate a little help to make this work.


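_"I don't understand why it mentions a username and password. When I click the download button on https://uk.finance.yahoo.com/quote/GLD/history?p=GLD, it doesn't ask for a username and password"_ - then I suggest you open that download link in a _private_ browser tab for a change, and find the explanation... – CBroe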


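You are missing a cookie; you can check which cookies are sent when you click the download button. I suspect Yahoo doesn't want you to access their financial data without using their own API. When you do this, you are bypassing their ads. – matt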


If I can identify which cookie it is and save it offline... how do I instruct 'wget' to use the cookie? –

Answer


To deal with Yahoo's cookies I wrote the following code:

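#!/bin/sh
# Usage: sh stockdownload_yahoo.sh SYMBOL START_DATE
# Requires wget and GNU date.

symbol=$1
today=$(date +%Y%m%d)

# Convert the start date and today's date to Unix timestamps.
first_date=$(date -d "$2" '+%s')
last_date=$(date -d "$today" '+%s')

# Fetch the quote page, saving Yahoo's cookies and the HTML that embeds the crumb.
wget --no-check-certificate --save-cookies=cookie.txt "https://finance.yahoo.com/quote/$symbol/?p=$symbol" -O crumb.store

# Extract the crumb token from the JSON embedded in the page.
crumb=$(grep 'root.*App' crumb.store | sed 's/,/\n/g' | grep CrumbStore | sed 's/"CrumbStore":{"crumb":"\(.*\)"}/\1/')

# Request the CSV, sending the saved cookies together with the matching crumb.
wget --no-check-certificate --load-cookies=cookie.txt "https://query1.finance.yahoo.com/v7/finance/download/$symbol?period1=$first_date&period2=$last_date&interval=1d&events=history&crumb=$crumb" -O "$symbol.csv"

# Clean up the temporary files.
rm cookie.txt crumb.store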
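
The crumb extraction in the script above can be tried without network access. The following is only a sketch under stated assumptions: the function name `extract_crumb` and the sample JSON are made up for illustration, and the `\u002F` handling assumes that a crumb may arrive with a JSON-escaped slash, which would otherwise break the download URL.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): read saved page
# HTML on stdin and print the crumb value found in the CrumbStore entry.
extract_crumb() {
    # Keep only the first CrumbStore fragment, then strip everything
    # around the value. The second sed expression turns a JSON-escaped
    # slash (\u002F) back into a plain "/" (assumption: crumbs may
    # contain one).
    grep -o '"CrumbStore":{"crumb":"[^"]*"}' \
        | head -n 1 \
        | sed -e 's/.*"crumb":"\([^"]*\)".*/\1/' -e 's|\\u002F|/|g'
}

# Made-up sample input imitating the embedded JSON of the quote page.
sample='{"stores":{"CrumbStore":{"crumb":"abc\u002Fdef"},"Other":{}}}'
printf '%s\n' "$sample" | extract_crumb
```

In the script, this would stand in for the `grep | sed | grep | sed` pipeline, for example `crumb=$(extract_crumb < crumb.store)`. Matching `[^"]*` rather than `.*` keeps the capture from running past the closing quote if more JSON follows on the same line.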

Usage:

$ sh stockdownload_yahoo.sh QQQ 20170508 

Examples of supported date formats:

$ date -d 20170328 +%s 
1490652000 
$ date -d 2017-03-28 +%s 
1490652000 
$ date -d "Mar 28 2017" +%s 
1490652000
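
Note that these timestamps depend on the time zone of the machine running `date`: 1490652000 is midnight on 2017-03-28 at UTC+2. As a minimal sketch, assuming GNU date, the conversion can be pinned to UTC so that the same input gives the same timestamp everywhere. The function name `to_epoch_utc` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a date string to the Unix timestamp of
# midnight UTC on that day. Assumes GNU date; the -d option used here
# is not available in BSD/macOS date.
to_epoch_utc() {
    date -u -d "$1" '+%s'
}

# All three input formats name the same day, so each call prints 1490659200.
to_epoch_utc 20170328
to_epoch_utc 2017-03-28
to_epoch_utc "Mar 28 2017"
```

Whether local time or UTC is the better choice for `period1` and `period2` is not something the source settles; the sketch only makes the result reproducible.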