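I have loosely structured XHTML data that I need to convert into better-structured XML, which makes for a tricky XSLT transformation.
Here is an example: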
<tbody>
<tr>
<td class="header"><img src="http://www.abc.com/images/icon_apples.gif"/><img src="http://www.abc.com/images/flag/portugal.gif" alt="Portugal"/> First Grade</td>
</tr>
<tr>
<td>Green</td>
<td>Round shaped</td>
<td>Tasty</td>
</tr>
<tr>
<td>Red</td>
<td>Round shaped</td>
<td>Bitter</td>
</tr>
<tr>
<td>Pink</td>
<td>Round shaped</td>
<td>Tasty</td>
</tr>
<tr>
<td class="header"><img src="http://www.abc.com/images/icon_strawberries.gif"/><img src="http://www.abc.com/images/flag/usa.gif" alt="USA"/> Fifth Grade</td>
</tr>
<tr>
<td>Red</td>
<td>Heart shaped</td>
<td>Super tasty</td>
</tr>
<tr>
<td class="header"><img src="http://www.abc.com/images/icon_bananas.gif"/><img src="http://www.abc.com/images/flag/congo.gif" alt="Congo"/> Third Grade</td>
</tr>
<tr>
<td>Yellow</td>
<td>Smile shaped</td>
<td>Fairly tasty</td>
</tr>
<tr>
<td>Brown</td>
<td>Smile shaped</td>
<td>Too sweet</td>
</tr>
</tbody>
I would like to achieve the following structure:
<data>
<entry>
<type>Apples</type>
<country>Portugal</country>
<rank>First Grade</rank>
<color>Green</color>
<shape>Round shaped</shape>
<taste>Tasty</taste>
</entry>
<entry>
<type>Apples</type>
<country>Portugal</country>
<rank>First Grade</rank>
<color>Red</color>
<shape>Round shaped</shape>
<taste>Bitter</taste>
</entry>
<entry>
<type>Apples</type>
<country>Portugal</country>
<rank>First Grade</rank>
<color>Pink</color>
<shape>Round shaped</shape>
<taste>Tasty</taste>
</entry>
<entry>
<type>Strawberries</type>
<country>USA</country>
<rank>Fifth Grade</rank>
<color>Red</color>
<shape>Heart shaped</shape>
<taste>Super tasty</taste>
</entry>
<entry>
<type>Bananas</type>
<country>Congo</country>
<rank>Third Grade</rank>
<color>Yellow</color>
<shape>Smile shaped</shape>
<taste>Fairly tasty</taste>
</entry>
<entry>
<type>Bananas</type>
<country>Congo</country>
<rank>Third Grade</rank>
<color>Brown</color>
<shape>Smile shaped</shape>
<taste>Too sweet</taste>
</entry>
</data>
First, I need to extract the fruit type from tbody/tr/td/img[1]/@src, then the country from the tbody/tr/td/img[2]/@alt attribute, and finally the grade from the text of tbody/tr/td itself.
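To make the extraction step concrete, here is a minimal XSLT 1.0 sketch that handles only the header cells. It assumes the input elements are in no namespace (as in the sample), that the icon URL always follows the pattern `.../icon_<name>.gif`, and that the type should be the file name with its first letter capitalized. The `categories` and `category` element names are hypothetical and exist only for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Alphabets for translate(); XSLT 1.0 has no upper-case() function. -->
  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <!-- Hypothetical wrapper element, used only for this illustration. -->
  <xsl:template match="/">
    <categories>
      <xsl:apply-templates select="//tbody/tr/td[@class = 'header']"/>
    </categories>
  </xsl:template>

  <xsl:template match="td[@class = 'header']">
    <!-- Assumed URL pattern: the text between 'icon_' and '.gif' is the fruit name. -->
    <xsl:variable name="name"
        select="substring-before(substring-after(img[1]/@src, 'icon_'), '.gif')"/>
    <category>
      <type>
        <!-- Upper-case the first character and append the rest unchanged. -->
        <xsl:value-of select="concat(translate(substring($name, 1, 1), $lower, $upper),
                                     substring($name, 2))"/>
      </type>
      <country><xsl:value-of select="img[2]/@alt"/></country>
      <!-- The img elements contribute no text, so the cell's string value is the grade. -->
      <rank><xsl:value-of select="normalize-space(.)"/></rank>
    </category>
  </xsl:template>

</xsl:stylesheet>
```

Run against the sample table, this should yield three `category` elements, one per header row. If the real data declares the XHTML namespace, the match patterns would need a prefix bound to that namespace.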
Next, I need to emit every entry under each category, repeating these three values in each one (as shown above).
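But... as you can see, the data I have been given is very loosely structured. A category is just a single td, followed by all the items that belong to that category. To make matters worse, in my dataset the number of items under each category varies between 1 and 100...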
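Because every item row follows its category row as a sibling, one possible approach is to process the item rows and let each one look backwards for its nearest preceding header. The sketch below does that with the `preceding-sibling` axis. It assumes no namespace on the input, exactly three cells per item row in the order color, shape, taste, and the `icon_<name>.gif` URL pattern seen in the sample.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <xsl:template match="/">
    <data>
      <!-- Select only the item rows; header rows are looked up from each item. -->
      <xsl:apply-templates select="//tbody/tr[not(td/@class = 'header')]"/>
    </data>
  </xsl:template>

  <xsl:template match="tr">
    <!-- preceding-sibling is a reverse axis, so [1] is the closest header row above. -->
    <xsl:variable name="header"
        select="preceding-sibling::tr[td/@class = 'header'][1]/td"/>
    <xsl:variable name="name"
        select="substring-before(substring-after($header/img[1]/@src, 'icon_'), '.gif')"/>
    <entry>
      <type>
        <xsl:value-of select="concat(translate(substring($name, 1, 1), $lower, $upper),
                                     substring($name, 2))"/>
      </type>
      <country><xsl:value-of select="$header/img[2]/@alt"/></country>
      <rank><xsl:value-of select="normalize-space($header)"/></rank>
      <!-- Assumed cell order: color, shape, taste. -->
      <color><xsl:value-of select="td[1]"/></color>
      <shape><xsl:value-of select="td[2]"/></shape>
      <taste><xsl:value-of select="td[3]"/></taste>
    </entry>
  </xsl:template>

</xsl:stylesheet>
```

Each item row walks back over its siblings to find its header, which is simple to read and should be adequate for groups of up to about 100 rows. An item row that appears before any header would get empty type, country, and rank values.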
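I have tried several approaches, but I can't seem to get it to work. Any help is greatly appreciated. I know that XSLT 2.0 introduces xsl:for-each-group, but I am limited to XSLT 1.0.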
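In XSLT 1.0, a common substitute for xsl:for-each-group in this "group starting with" situation is an xsl:key that indexes each item row by the generated id of its nearest preceding header row. The following is a sketch under the same assumptions as the sample data (no namespace, three cells per item row, icon URLs of the form `icon_<name>.gif`). The key name `rows-by-header` is made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <!-- Index every item row by the id of the closest header row above it. -->
  <xsl:key name="rows-by-header"
      match="tr[not(td/@class = 'header')]"
      use="generate-id(preceding-sibling::tr[td/@class = 'header'][1])"/>

  <xsl:template match="/">
    <data>
      <!-- Outer loop: one iteration per category (header row). -->
      <xsl:for-each select="//tbody/tr[td/@class = 'header']">
        <xsl:variable name="header" select="td"/>
        <xsl:variable name="name"
            select="substring-before(substring-after($header/img[1]/@src, 'icon_'), '.gif')"/>
        <xsl:variable name="type"
            select="concat(translate(substring($name, 1, 1), $lower, $upper),
                           substring($name, 2))"/>
        <!-- Inner loop: all item rows whose nearest header is the current row. -->
        <xsl:for-each select="key('rows-by-header', generate-id())">
          <entry>
            <type><xsl:value-of select="$type"/></type>
            <country><xsl:value-of select="$header/img[2]/@alt"/></country>
            <rank><xsl:value-of select="normalize-space($header)"/></rank>
            <color><xsl:value-of select="td[1]"/></color>
            <shape><xsl:value-of select="td[2]"/></shape>
            <taste><xsl:value-of select="td[3]"/></taste>
          </entry>
        </xsl:for-each>
      </xsl:for-each>
    </data>
  </xsl:template>

</xsl:stylesheet>
```

Driving the transformation from the header rows means the type, country, and rank are computed once per category, and the number of items per category does not matter. A header with no item rows simply produces no entry elements.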
+1 for a great answer.