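Parsing XBRL with Python and regular expressions to find TextBlocks

I am using Python and ElementTree to access a list of .xml files scraped from EDGAR. I have read and re-read the ElementTree page on python.org, but I still do not understand how to drill down into the data. How do I use ElementTree to get to the first TextBlock in each of the .xml files listed below?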
import re
from urllib.request import urlopen  # Python 3; the Python 2 equivalent is urllib2.urlopen
import requests
import xml.etree.ElementTree as ET

full_xml = [
    'https://www.sec.gov/Archives/edgar/data/1593001/000121390017010242/ngtf-20170630.xml',
    'https://www.sec.gov/Archives/edgar/data/13573/000143774917016692/bwla-20170702.xml',
    'https://www.sec.gov/Archives/edgar/data/1652871/000165287117000030/none-20170630.xml',
    'https://www.sec.gov/Archives/edgar/data/1434674/000154972717000042/chnd-20170630_cal.xml',
    'https://www.sec.gov/Archives/edgar/data/1083922/000130841117000030/arao-20170331.xml',
]

for url in full_xml:
    # Download each filing and parse it into an ElementTree
    response = urlopen(url)
    tree = ET.parse(response)
    root = tree.getroot()
    print(root)
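
One possible way to get from `root` to the first TextBlock is to walk every element and keep those whose tag name ends in `TextBlock`. The following is only a minimal sketch, assuming Python 3 and assuming that the TextBlock facts appear as elements with such names. `SAMPLE_XBRL`, `local_name`, and `find_text_blocks` are hypothetical names introduced for illustration, and the sample document is made up so that the snippet runs without a network request.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed XBRL instance used only for illustration.
SAMPLE_XBRL = """<xbrli:xbrl xmlns:xbrli="http://www.xbrl.org/2003/instance"
            xmlns:us-gaap="http://fasb.org/us-gaap/2017-01-31">
  <us-gaap:Assets contextRef="c1" unitRef="usd" decimals="0">1000</us-gaap:Assets>
  <us-gaap:BasisOfAccountingPolicyPolicyTextBlock contextRef="c1">
    &lt;p&gt;The statements are prepared under US GAAP.&lt;/p&gt;
  </us-gaap:BasisOfAccountingPolicyPolicyTextBlock>
  <us-gaap:InventoryPolicyTextBlock contextRef="c1">&lt;p&gt;FIFO.&lt;/p&gt;</us-gaap:InventoryPolicyTextBlock>
</xbrli:xbrl>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def find_text_blocks(root):
    """Return (local tag name, text) pairs for elements whose name ends in
    'TextBlock', in document order."""
    return [
        (local_name(el.tag), (el.text or "").strip())
        for el in root.iter()
        if isinstance(el.tag, str) and local_name(el.tag).endswith("TextBlock")
    ]


root = ET.fromstring(SAMPLE_XBRL)
blocks = find_text_blocks(root)

# The first TextBlock in document order, if the file has any.
first_name, first_text = blocks[0]
print(first_name)
print(first_text)
```

In the download loop, the same helper could be called as `find_text_blocks(root)` for each parsed file. The text of a TextBlock is usually escaped HTML, so after parsing it arrives as a plain string of markup. Note also that a file ending in `_cal.xml` is a calculation linkbase rather than an instance document, so it would likely yield an empty list.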
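吉谢兰, thank you for your in-depth response. Do you happen to have a favorite XBRL processor, or could you recommend an easily accessible open-source one?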