2014-10-17 148 views
Plist XPath query with dict elements

I'm trying to load song names from an iTunes library export (a plist) using Nokogiri:

doc = Nokogiri::XML(open(file.path))

The plist begins like this:

<plist version="1.0"> 
<dict> 
    <key>Major Version</key><integer>1</integer> 
    <key>Minor Version</key><integer>1</integer> 
    <key>Date</key><date>2014-10-15T22:52:19Z</date> 
    <key>Application Version</key><string>11.4</string> 
    <key>Features</key><integer>5</integer> 
    <key>Show Content Ratings</key><true/> 
    <key>Music Folder</key><string>file://localhost/Users/mike/Music/iTunes/iTunes%20Media/</string> 
    <key>Library Persistent ID</key><string>280B84572DDCF406</string> 
    <key>Tracks</key> 
    <dict> 
    <key>96</key> 
    <dict> 
     <key>Track ID</key><integer>96</integer> 
     <key>Name</key><string>Get Lucky (Daft Punk cover)</string> 
     <key>Artist</key><string>Daughter</string> 
     <key>Kind</key><string>MPEG audio file</string> 
     <key>Size</key><integer>4716638</integer> 
     <key>Total Time</key><integer>294112</integer> 
     <key>Date Modified</key><date>2013-11-12T20:54:14Z</date> 
     <key>Date Added</key><date>2013-12-18T17:56:09Z</date> 
     <key>Bit Rate</key><integer>128</integer> 
     <key>Sample Rate</key><integer>44100</integer> 
     <key>Persistent ID</key><string>C3B1B6F26134C9C1</string> 
     <key>Track Type</key><string>File</string> 
     <key>Location</key><string>file://localhost/Users/mike/Music/iTunes/iTunes%20Media/Music/Daughter/Unknown%20Album/Get%20Lucky%20(Daft%20Punk%20cover).mp3</string> 
     <key>File Folder Count</key><integer>5</integer> 
     <key>Library Folder Count</key><integer>1</integer> 
    </dict> 
    <key>98</key> 
    <dict> 
     <key>Track ID</key><integer>98</integer> 
     <key>Name</key><string>Swimming in Solace (DJ Fergie Ferg Remash)</string> 
     <key>Kind</key><string>MPEG audio file</string> 

What I want to load from each track is the track-name string that comes right after the Name key. The XPath I thought should work is

/plist/dict[key[. = 'Tracks']/following-sibling::*[1]]/dict[key/following-sibling::*[1]]/dict[key[. = 'Name']/following-sibling::*[1]]/string 

That XPath returns:

<string>Get Lucky (Daft Punk cover)</string> 
<string>Daughter</string> 
<string>MPEG audio file</string> 
<string>C3B1B6F26134C9C1</string> 
<string>File</string> 
<string>file://localhost/Users/mike/Music/iTunes/iTunes%20Media/Music/Daughter/Unknown%20Album/Get%20Lucky%20(Daft%20Punk%20cover).mp3</string> 
<string>Swimming in Solace (DJ Fergie Ferg Remash)</string> 
<string>MPEG audio file</string> 

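It looks like my XPath does identify the key for each string, but it is actually taking the 'following-sibling' of every key regardless.

What can I do to make the query more specific, so that this portion of the plist returns: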

Get Lucky (Daft Punk cover) 

Swimming in Solace (DJ Fergie Ferg Remash) 

Answers

Here is one possible XPath:

/plist/dict[key='Tracks']/dict/dict/key[.='Name']/following-sibling::string[1] 

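The beginning of the XPath may vary, but I think the most important part is the last two path steps (key[.='Name']/following-sibling::string[1]). They say: after each <key>Name</key> element, get the nearest <string> element.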

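Thanks a lot! I thought I'd go crazy trying different permutations. Just as a note, adding '/node()' to the end of the XPath you provided also strips the '<string>' tags and returns only the string value. – muzicmike 2014-10-17 17:11:25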

I'd do something like this:

require 'nokogiri' 

doc = Nokogiri::XML(<<EOT) 
    <plist version="1.0"> 
    <dict> 
     <key>Major Version</key><integer>1</integer> 
     <key>Minor Version</key><integer>1</integer> 
     <key>Date</key><date>2014-10-15T22:52:19Z</date> 
     <key>Application Version</key><string>11.4</string> 
     <key>Features</key><integer>5</integer> 
     <key>Show Content Ratings</key><true/> 
     <key>Music Folder</key><string>file://localhost/Users/mike/Music/iTunes/iTunes%20Media/</string> 
     <key>Library Persistent ID</key><string>280B84572DDCF406</string> 
     <key>Tracks</key> 
     <dict> 
     <key>96</key> 
     <dict> 
      <key>Track ID</key><integer>96</integer> 
      <key>Name</key><string>Get Lucky (Daft Punk cover)</string> 
      <key>Artist</key><string>Daughter</string> 
      <key>Kind</key><string>MPEG audio file</string> 
      <key>Size</key><integer>4716638</integer> 
      <key>Total Time</key><integer>294112</integer> 
      <key>Date Modified</key><date>2013-11-12T20:54:14Z</date> 
      <key>Date Added</key><date>2013-12-18T17:56:09Z</date> 
      <key>Bit Rate</key><integer>128</integer> 
      <key>Sample Rate</key><integer>44100</integer> 
      <key>Persistent ID</key><string>C3B1B6F26134C9C1</string> 
      <key>Track Type</key><string>File</string> 
      <key>Location</key><string>file://localhost/Users/mike/Music/iTunes/iTunes%20Media/Music/Daughter/Unknown%20Album/Get%20Lucky%20(Daft%20Punk%20cover).mp3</string> 
      <key>File Folder Count</key><integer>5</integer> 
      <key>Library Folder Count</key><integer>1</integer> 
     </dict> 
     <key>98</key> 
     <dict> 
      <key>Track ID</key><integer>98</integer> 
      <key>Name</key><string>Swimming in Solace (DJ Fergie Ferg Remash)</string> 
      <key>Kind</key><string>MPEG audio file</string> 
EOT 

With that setup, this code:

doc.search('dict dict dict').map{ |d| d.at('./key[2]').next_sibling.text } 
# => ["Get Lucky (Daft Punk cover)", 
#  "Swimming in Solace (DJ Fergie Ferg Remash)"] 

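I prefer using CSS selectors where possible, and Nokogiri doesn't care whether we use them or XPath for XML content, hence search('dict dict dict'). XPath, in turn, makes it easy to grab the nth element, which is why at('./key[2]') is used to grab the second <key> node, the Name key in this sample. next_sibling then returns the node that follows it.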

It can be done in pure XPath, but I find that looks like line noise and prefer this hybrid approach. Pure XPath might run faster, but I can maintain my way faster.