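I am writing a script to do some exploratory analysis. The script references an API to get IDs, and the API returns its response as XML output (with no child objects). How can I capture the output of a function inside a 'for' loop and use it to build a DataFrame?
Script: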
import requests
import xml.etree.ElementTree as et
xml ='''
<?xml version="1.0" encoding="UTF-8"?>
<YM>
<Version>xxx</Version>
<ApiKey>xxx</ApiKey>
<CallID>xxx</CallID>
<></>
<SaPasscode>xxxx</SaPasscode>
<Call Method = "GetIDs">
</Call>
</YM>
'''
headers = {'Content-Type': 'application/x-www-form-urlencoded'}
r = requests.post('url', data=xml, headers=headers)
Example output:
<YourMembership_Response>
  <Sa.Members.All.GetIDs>
    <Members>
      <ID>1234</ID>
      <ID>4321</ID>
    </Members>
  </Sa.Members.All.GetIDs>
</YourMembership_Response>
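For illustration, here is a minimal sketch of pulling the ID values out of a response like the one above. The sample string is copied from the example output, with the nesting order assumed, and `extract_ids` is a hypothetical helper name, not part of the API.

```python
import xml.etree.ElementTree as et

# Sample response based on the example output (nesting order is an assumption).
sample_response = """<YourMembership_Response>
  <Sa.Members.All.GetIDs>
    <Members>
      <ID>1234</ID>
      <ID>4321</ID>
    </Members>
  </Sa.Members.All.GetIDs>
</YourMembership_Response>"""


def extract_ids(response_text):
    """Hypothetical helper: return the text of every <ID> element as a list."""
    root = et.fromstring(response_text)
    return [node.text for node in root.iterfind('.//ID')]


ids = extract_ids(sample_response)
print(ids)
```

Collecting the IDs into a list first keeps the parsing step separate from the follow-up API calls, which makes each part easier to check on its own.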
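I take these IDs and insert them into another API call to get more information about each ID. In the same script, a loop passes each ID parsed from the API call above to a function that makes a second API call and retrieves the information for that ID:
Script: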
def xml_event_info(eventID):
    xml ='''
<?xml version="1.0" encoding="UTF-8"?>
<YourMembership>
<Version>xxx</Version>
<ApiKey>xxx</ApiKey>
<CallID>xxx</CallID>
<></>
<SaPasscode>xxx</SaPasscode>
<Call Method = "Profile.Get">
<ID>{}</ID>
</Call>
</YourMembership>
'''
    headers = {'Content-Type': 'application/x-www-form-urlencoded'}
    r = requests.post('url',
                      data=xml.format(eventID), headers=headers)
    print(r.text)

# BUILD XML TREE OBJECT
dom = et.fromstring(r.text)
# PARSE EVENT ID TEXT AND PASS INTO FUNCTION
for i in dom.iterfind('.//ID'):
    xml_event_info(i.text)
Example output (there are more XML elements than shown):
<?xml version="1.0" encoding="utf-8" ?>
<Response>
  <ErrCode>xxx</ErrCode>
  <ExtendedErrorInfo>xxx</ExtendedErrorInfo>
  <Profile.Get>
    <ID>xxxx</ID>
    <WebsiteID>xxxx</WebsiteID>
    <EmailBounced>xxx</EmailBounced>
    <NamePrefix>xxx</NamePrefix>
    <FirstName>xxx</FirstName>
  </Profile.Get>
</Response>
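I want to take the XML returned by the second API call (the example above, with its many elements) and map those elements into a pandas DataFrame. The problem I am having is capturing the output of the second API call when I call the function (xml_event_info(i.text)) from inside the for loop shown here: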
# PARSE EVENT ID TEXT AND PASS INTO FUNCTION
for i in dom.iterfind('.//ID'):
    xml_event_info(i.text)
When I try to map the XML to a DataFrame, I keep getting the error "TypeError: Parse() argument 1 must be string or read-only buffer, not None".
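One likely cause, judging only from the code shown: xml_event_info prints r.text but has no return statement, so calling it yields None, and et.fromstring(None) raises a TypeError. The sketch below reproduces this without any network call. The `SAMPLE` string and both function names are hypothetical stand-ins for the real request.

```python
import xml.etree.ElementTree as et

# Hypothetical stand-in for the text of an API response.
SAMPLE = "<Response><ErrCode>xxx</ErrCode></Response>"


def prints_only(event_id):
    """Mimics a function that prints the response but returns nothing."""
    print(SAMPLE)  # no return statement, so the caller receives None


def returns_text(event_id):
    """Same idea, but hands the XML text back to the caller."""
    return SAMPLE


# Passing None to the parser raises a TypeError.
try:
    et.fromstring(prints_only("1234"))
    error = None
except TypeError as exc:
    error = exc

# With a return value, the parser receives a string and succeeds.
root = et.fromstring(returns_text("1234"))
print(type(error).__name__, root.find('ErrCode').text)
```

Under that assumption, adding `return r.text` at the end of the function body would give the caller something to parse.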
How can I parse the XML output from multiple API calls into a pandas DataFrame, where each XML tag becomes a column header of the DataFrame?
Example:
---|ErrCode|ExtendedInfo|ID|FirstName----
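As a rough sketch of the layout above, each response can be flattened into a dict of tag to text, and a list of those dicts can be passed to pd.DataFrame so that the tags become column headers. `flatten_response` is a hypothetical helper, and the two sample responses are trimmed, made-up versions of the example output.

```python
import xml.etree.ElementTree as et

import pandas as pd


def flatten_response(response_text):
    """Hypothetical helper: map each leaf element's tag to its text."""
    root = et.fromstring(response_text)
    # Leaf elements have no children; containers such as <Profile.Get> are skipped.
    return {el.tag: el.text for el in root.iter() if len(el) == 0}


# Trimmed, made-up responses shaped like the example output.
responses = [
    "<Response><ErrCode>xxx</ErrCode><ExtendedErrorInfo>xxx</ExtendedErrorInfo>"
    "<Profile.Get><ID>1234</ID><FirstName>xxx</FirstName></Profile.Get></Response>",
    "<Response><ErrCode>xxx</ErrCode><ExtendedErrorInfo>xxx</ExtendedErrorInfo>"
    "<Profile.Get><ID>4321</ID><FirstName>xxx</FirstName></Profile.Get></Response>",
]

# One dict per response gives one row per response.
df = pd.DataFrame([flatten_response(text) for text in responses])
print(df)
```

Using dicts rather than parallel lists of values and headers means a response that lacks a tag simply gets a missing value in that column instead of shifting the other values.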
The script and the site I am using as a reference to get this done can be found here (http://www.austintaylor.io/lxml/python/pandas/xml/dataframe/2016/07/08/convert-xml-to-pandas-dataframe/)
Script:
def xml2df():
    tree = et.fromstring(xml_event_info(i.text))
    root = tree.getroot()
    all_records = []
    headers = []
    for i, child in enumerate(root):
        record = []
        for subchild in child:
            record.append(subchild.text)
            if subchild.tag not in headers:
                headers.append(subchild.tag)
        all_records.append(record)
    return pd.DataFrame(all_records, columns=headers)
Full script:
import requests
import xml.etree.ElementTree as et
import pandas as pd
from lxml import etree
xml ='''
<?xml version="1.0" encoding="UTF-8"?>
<YourMembership>
<Version>xxx</Version>
<ApiKey>xxxx</ApiKey>
<CallID>xxx</CallID>
<></>
<SaPasscode>xxx</SaPasscode>
<Call Method = "Events.All.GetIDs">
<StartDate>2017/01/1</StartDate>
<EndDate>2017/01/31</EndDate>
</Call>
</YourMembership>
'''
headers = {'Content-Type': 'application/x-www-form-urlencoded'}
r = requests.post('url', data=xml, headers=headers)
def xml_event_info(eventID):
    xml ='''
<?xml version="1.0" encoding="UTF-8"?>
<YourMembership>
<Version>xxx</Version>
<ApiKey>xxx</ApiKey>
<CallID>xxx</CallID>
<></>
<SaPasscode>xxx</SaPasscode>
<Call Method = "Event.Get">
<EventID>{}</EventID>
</Call>
</YourMembership>
'''
    headers = {'Content-Type': 'application/x-www-form-urlencoded'}
    r = requests.post('url',
                      data=xml.format(eventID), headers=headers)
    print(r.text)
    return r.text
# BUILD XML TREE OBJECT
dom = et.fromstring(r.text)
# PARSE EVENT ID TEXT AND PASS INTO FUNCTION
for i in dom.iterfind('.//EventID'):
    y = xml_event_info(i.text)
    for xml in y:
        tree = et.fromstring(y)
        root = tree.getchildren()
        all_records = []
        headers = []
        for i, child in enumerate(root):
            record = []
            for subchild in child:
                record.append(subchild.text)
                if subchild.tag not in headers:
                    headers.append(subchild.tag)
            all_records.append(record)
        #print all_records
        print(pd.DataFrame(all_records, columns=headers))
Edit:
TL;DR:
How can the output of the function below be mapped to a DataFrame, with the XML elements as the DataFrame's column headers?
import requests
import xml.etree.ElementTree as et
import pandas as pd
xml ='''
<?xml version="1.0" encoding="UTF-8"?>
<YourMembership>
<Version>xxx</Version>
<ApiKey>xxxx</ApiKey>
<CallID>xxx</CallID>
<></>
<SaPasscode>xxxx</SaPasscode>
<Call Method = "GetIDs">
</Call>
</YourMembership>
'''
headers = {'Content-Type': 'application/x-www-form-urlencoded'}
r = requests.post('url', data=xml, headers=headers)
def xml_event_info(eventID):
    xml ='''
<?xml version="1.0" encoding="UTF-8"?>
<YourMembership>
<Version>xxx</Version>
<ApiKey>xxx</ApiKey>
<CallID>xxx</CallID>
<></>
<SaPasscode>xxx</SaPasscode>
<Call Method = "Profile.Get">
<ID>{}</ID>
</Call>
</YourMembership>
'''
    headers = {'Content-Type': 'application/x-www-form-urlencoded'}
    r = requests.post('url',
                      data=xml.format(eventID), headers=headers)
    print(r.text)
Output:
<?xml version="1.0" encoding="utf-8" ?>
<Response>
  <ErrCode>xxx</ErrCode>
  <ExtendedErrorInfo>xxx</ExtendedErrorInfo>
  <Profile.Get>
    <ID>xxxx</ID>
    <WebsiteID>xxxx</WebsiteID>
    <EmailBounced>xxx</EmailBounced>
    <NamePrefix>xxx</NamePrefix>
    <FirstName>xxx</FirstName>
  </Profile.Get>
</Response>
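A possible shape for the whole flow, sketched under assumptions: the per-ID function returns its XML text instead of only printing it, rows are accumulated in a list inside the for loop, and the DataFrame is built once after the loop. `fake_profile_get` is a hypothetical stand-in for the real request, and `profiles_to_frame` is an illustrative name.

```python
import xml.etree.ElementTree as et

import pandas as pd


def fake_profile_get(member_id):
    """Hypothetical stand-in for xml_event_info(); returns XML text, no network."""
    return (
        "<Response><ErrCode>xxx</ErrCode>"
        "<Profile.Get><ID>{}</ID><FirstName>xxx</FirstName></Profile.Get>"
        "</Response>"
    ).format(member_id)


def profiles_to_frame(id_list_xml, fetch):
    """Call fetch() for every <ID> in id_list_xml and collect one row per call."""
    rows = []
    for node in et.fromstring(id_list_xml).iterfind('.//ID'):
        profile = et.fromstring(fetch(node.text))
        # Keep only leaf elements so each tag maps to a single text value.
        rows.append({el.tag: el.text for el in profile.iter() if len(el) == 0})
    # Build the DataFrame once, after the loop has gathered every row.
    return pd.DataFrame(rows)


ids_xml = "<Members><ID>1234</ID><ID>4321</ID></Members>"
df = profiles_to_frame(ids_xml, fake_profile_get)
print(df)
```

In the real script, `fetch` would be a version of xml_event_info that ends with `return r.text`, and `ids_xml` would be the text of the first response.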
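IMO, your question is too detailed. Can you give a TL;DR version? I am having a hard time understanding the problem you are trying to solve. – EyuelDK
You are missing an [MCVE](https://stackoverflow.com/help/mcve) –
@EyuelDK added a TL;DR – RustyShackleford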