How to vectorize with XML data?
Let's say I have this XML file:
<?xml version="1.0" encoding="UTF-8" ?>
<TimeSeries>
<timeZone>1.0</timeZone>
<series>
<header/>
<event date="2009-09-30" time="10:00:00" value="0.0" flag="2"></event>
<event date="2009-09-30" time="10:15:00" value="0.0" flag="2"></event>
<event date="2009-09-30" time="10:30:00" value="0.0" flag="2"></event>
<event date="2009-09-30" time="10:45:00" value="0.0" flag="2"></event>
<event date="2009-09-30" time="11:00:00" value="0.0" flag="2"></event>
<event date="2009-09-30" time="11:15:00" value="0.0" flag="2"></event>
</series>
<series>
<header/>
<event date="2009-09-30" time="08:00:00" value="1.0" flag="2"></event>
<event date="2009-09-30" time="08:15:00" value="2.6" flag="2"></event>
<event date="2009-09-30" time="09:00:00" value="6.3" flag="2"></event>
<event date="2009-09-30" time="09:15:00" value="4.4" flag="2"></event>
<event date="2009-09-30" time="09:30:00" value="3.9" flag="2"></event>
<event date="2009-09-30" time="09:45:00" value="2.0" flag="2"></event>
<event date="2009-09-30" time="10:00:00" value="1.7" flag="2"></event>
<event date="2009-09-30" time="10:15:00" value="2.3" flag="2"></event>
<event date="2009-09-30" time="10:30:00" value="2.0" flag="2"></event>
</series>
<series>
<header/>
<event date="2009-09-30" time="10:00:00" value="0.0" flag="2"></event>
<event date="2009-09-30" time="10:15:00" value="0.0" flag="2"></event>
<event date="2009-09-30" time="10:30:00" value="0.0" flag="2"></event>
<event date="2009-09-30" time="10:45:00" value="0.0" flag="2"></event>
<event date="2009-09-30" time="11:00:00" value="0.0" flag="2"></event>
</series>
</TimeSeries>
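And let's say I want to do something with its series elements, and that I want to put into practice the advice "vectorize the vectorizable"... I load the XML library and do the following: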
R> library("XML")
R> doc <- xmlTreeParse('/home/mario/Desktop/sample.xml')
R> TimeSeriesNode <- xmlRoot(doc)
R> seriesNodes <- xmlElementsByTagName(TimeSeriesNode, "series")
R> length(seriesNodes)
[1] 3
R> (function(x){length(xmlElementsByTagName(x[['series']], 'event'))}
+)(seriesNodes)
[1] 6
R>
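I don't understand why I get only the result of applying the function to the first element. I expected three values, one per element of seriesNodes, like this: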
R> mapply(length, seriesNodes)
series series series
7 10 6
Oops! I've already figured out the answer: "use mapply":
R> mapply(function(x){length(xmlElementsByTagName(x, 'event'))}, seriesNodes)
series series series
6 9 5
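But then I see the following problem: the R Inferno tells me that I'm "loop hiding", not "vectorizing"! Can I avoid looping at all? ...
The help for xpathSApply is also particularly enlightening (I'm still using the XML package!). – mariotomo 2009-12-03 07:51:44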
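Following up on the xpathSApply hint, here is a minimal sketch of how the counts might be obtained with XPath queries instead of mapply. It assumes the XML package is installed and uses a shortened inline copy of the sample file; the names xml_text, event_counts and values are made up for illustration. Note that the per-series count still applies a function to each node, so in the R Inferno's terms it is arguably still a hidden loop; only the attribute extraction is a single query followed by one vectorized conversion.

```r
library(XML)

# Shortened inline copy of the sample file, so the sketch is self-contained
xml_text <- '<?xml version="1.0" encoding="UTF-8" ?>
<TimeSeries>
<timeZone>1.0</timeZone>
<series>
<header/>
<event date="2009-09-30" time="10:00:00" value="0.0" flag="2"></event>
<event date="2009-09-30" time="10:15:00" value="0.0" flag="2"></event>
</series>
<series>
<header/>
<event date="2009-09-30" time="08:00:00" value="1.0" flag="2"></event>
<event date="2009-09-30" time="08:15:00" value="2.6" flag="2"></event>
<event date="2009-09-30" time="09:00:00" value="6.3" flag="2"></event>
</series>
</TimeSeries>'

# xmlParse builds an internal (C-level) document, which XPath queries require
doc <- xmlParse(xml_text, asText = TRUE)

# One query selects every <series> node; the function counts its <event> children
event_counts <- xpathSApply(doc, "//series",
                            function(s) length(getNodeSet(s, "./event")))

# One query collects every value attribute; as.numeric converts them all at once
values <- as.numeric(xpathSApply(doc, "//event", xmlGetAttr, "value"))
```

With the real file, xmlParse('/home/mario/Desktop/sample.xml') would take the place of the inline string. xmlTreeParse as called in the question builds an R-level tree, which the XPath functions do not accept.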