How do I request (GET) and read an XML file with Python?

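I am trying to request an RSS feed with Python. In the past I have used urllib or the requests library for this, and it worked well. This time, however, I keep getting a 406 status error. As I understand it, the page is telling me that it does not accept the header details in my request. I tried changing them, but to no avail.
This is what I tried: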

import requests
url = 'https://www.treasurydirect.gov/TA_WS/securities/announced/rss'
user_agent = {'User-agent': 'Mozilla/5.0'}
response = requests.get(url, headers=user_agent)
print(response.text)  # the print() form works on both Python 2.7 and 3.4

Environment: Python 2.7 and 3.4. I also tried accessing the URL with curl and got exactly the same error.

I think this is specific to the page, but I cannot figure out how to build the request properly in order to read it.
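One way to narrow down a page-specific 406 is to send the same GET request with several different Accept values and compare the status codes. The following is only a sketch of that idea: the helper names `http_status` and `probe_accept_values` are hypothetical, the candidate values are up to the caller, and it uses the standard library's `urllib.request` (Python 3) rather than requests so that it has no third-party dependency.

```python
import urllib.error
import urllib.request


def http_status(url, headers):
    """Return the HTTP status code of a GET request sent with the given headers."""
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises for 4xx/5xx responses; the code (e.g. 406) is on the exception
        return error.code


def probe_accept_values(url, candidates, get_status=http_status):
    """Map each candidate Accept value to the status code it produces."""
    return {accept: get_status(url, {'Accept': accept}) for accept in candidates}
```

Calling `probe_accept_values(url, ['*/*', 'application/xml', 'text/html'])` would then show which of those values, if any, the server is willing to answer. The `get_status` parameter exists only so the probing logic can be exercised without a network connection.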

I found an API on the site that lets me read the same data as JSON, so this question is now more a matter of curiosity than a real problem.

Any answer would be much appreciated!

Response header details:

{'surrogate-control': 'content="ESI/1.0",no-store', 'content-language': 'en-US', 'x-content-type-options': 'nosniff', 'x-powered-by': 'Servlet/3.0', 'transfer-encoding': 'chunked', 'set-cookie': 'BIGipServerpl_www.treasurydirect.gov_443=3221581322.47873.0000; path=/; Httponly; Secure, TS01598982=016b0e6f4634928e3e7e689fa438848df043a46cb4aa96f235b0190439b1d07550484963354d8ef442c9a3eb647175602535b52f3823e209341b1cba0236e4845955f0cdcf; Path=/', 'strict-transport-security': 'max-age=31536000; includeSubDomains', 'keep-alive': 'timeout=10, max=100', 'connection': 'Keep-Alive', 'cache-control': 'no-store', 'date': 'Sun, 23 Apr 2017 04:13:00 GMT', 'x-frame-options': 'SAMEORIGIN', '$wsep': '', 'content-type': 'text/html;charset=ISO-8859-1'}

Source

2017-04-23 Dom

You need to add an Accept header to the request:

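import requests

url = 'https://www.treasurydirect.gov/TA_WS/securities/announced/rss'
# Tell the server explicitly which content types are acceptable
headers = {'accept': 'application/xml;q=0.9, */*;q=0.8'}
response = requests.get(url, headers=headers)

print(response.text)  # the print() form works on both Python 2.7 and 3.4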
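Once the request succeeds, the body still has to be read as XML. The following is a minimal sketch, assuming the feed follows the usual RSS 2.0 layout (`item` elements with `title` and `link` children); the actual structure of this feed is not shown in the thread, so those element names are assumptions. The helper names `fetch_feed` and `parse_items` are hypothetical, and the sketch uses the standard library's `urllib.request` (Python 3) instead of requests so that it runs without third-party packages.

```python
import urllib.request
import xml.etree.ElementTree as ET

URL = 'https://www.treasurydirect.gov/TA_WS/securities/announced/rss'


def fetch_feed(url=URL):
    """Download the feed as bytes, sending the same Accept header as above."""
    request = urllib.request.Request(
        url,
        headers={'Accept': 'application/xml;q=0.9, */*;q=0.8'},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def parse_items(xml_data):
    """Return (title, link) pairs, assuming an RSS 2.0 style document."""
    root = ET.fromstring(xml_data)
    items = []
    for item in root.iter('item'):
        # findtext returns the default when the child element is missing
        title = item.findtext('title', default='')
        link = item.findtext('link', default='')
        items.append((title, link))
    return items


# Usage (performs a network request):
#     for title, link in parse_items(fetch_feed()):
#         print(title, link)
```

Passing the raw bytes to `ET.fromstring` lets the parser use the encoding named in the XML declaration, if there is one, instead of relying on a guessed text encoding.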

Source

2017-04-23 06:30:25 vold

Perfect. Thank you! – Dom

You're welcome! Glad I could help. – vold
