How do I parse XML with Nokogiri CSS selectors using a loop?
I'm trying to parse this sample XML file:
<Collection version="2.0" id="74j5hc4je3b9">
<Name>A Funfair in Bangkok</Name>
<PermaLink>Funfair in Bangkok</PermaLink>
<PermaLinkIsName>True</PermaLinkIsName>
<Description>A small funfair near On Nut in Bangkok.</Description>
<Date>2009-08-03T00:00:00</Date>
<IsHidden>False</IsHidden>
<Items>
<Item filename="AGC_1998.jpg">
<Title>Funfair in Bangkok</Title>
<Caption>A small funfair near On Nut in Bangkok.</Caption>
<Authors>Anthony Bouch</Authors>
<Copyright>Copyright © Anthony Bouch</Copyright>
<CreatedDate>2009-08-07T19:22:08</CreatedDate>
<Keywords>
<Keyword>Funfair</Keyword>
<Keyword>Bangkok</Keyword>
<Keyword>Thailand</Keyword>
</Keywords>
<ThumbnailSize width="133" height="200" />
<PreviewSize width="532" height="800" />
<OriginalSize width="2279" height="3425" />
</Item>
<Item filename="AGC_1164.jpg" iscover="True">
<Title>Bumper Cars at a Funfair in Bangkok</Title>
<Caption>Bumper cars at a small funfair near On Nut in Bangkok.</Caption>
<Authors>Anthony Bouch</Authors>
<Copyright>Copyright © Anthony Bouch</Copyright>
<CreatedDate>2009-08-03T22:08:24</CreatedDate>
<Keywords>
<Keyword>Bumper Cars</Keyword>
<Keyword>Funfair</Keyword>
<Keyword>Bangkok</Keyword>
<Keyword>Thailand</Keyword>
</Keywords>
<ThumbnailSize width="200" height="133" />
<PreviewSize width="800" height="532" />
<OriginalSize width="3725" height="2479" />
</Item>
</Items>
</Collection>
Here is my current code:
require 'nokogiri'
doc = Nokogiri::XML(File.open("sample.xml"))
somevar = doc.css("collection")
#create loop
somevar.each do |item|
puts "Item "
puts item['Title']
puts "\n"
end#items
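Starting at the root of the XML document, I'm trying to walk from the root "Collection" element down through each new level.
I start with a node set and get information from a node, and each node contains elements. How do I assign a node to a variable and extract the text from each layer beneath it?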
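I can do something like the code below, but I'd like to know how to move systematically through every nested element of the XML using a loop and output the data for each line. Once I've finished displaying the text, how do I move back up to the previous element/node, whatever it is (that is, traverse the nodes of the tree)?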
puts somevar.css("Keywords Keyword").text
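So what do you want to capture when you parse the XML? Parsing it and walking through it is fine, but we need to know what you are actually trying to accomplish.
Check out this SAX parsing option: http://amolnpujari.wordpress.com/2012/03/31/reading_huge_xml-rb/ The new Ox Ruby parser seems to be 5 times faster than Nokogiri: https://gist.github.com/amolpujari/5966431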