Inserting based on word length

I have a huge XML file made up of entries like this:

<term> 
<termId>MANUAL000399</termId> 
<termUpdate>Add</termUpdate> 
<termName>care</termName> 
<termType>Pt</termType> 
<termStatus>Active</termStatus> 
<termApproval>Approved</termApproval> 
<termCreatedDate>20120618T14:38:20</termCreatedDate> 
<termCreatedBy>admin</termCreatedBy> 
<termModifiedDate>20120618T14:40:41</termModifiedDate> 
<termModifiedBy>admin</termModifiedBy> 
</term>

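In this file, each term's <termType> is either Pt or ND, and I want this to apply to both. What I want to do is go through the file, look at the length of each termName, and if it has fewer than 5 characters, add another element, a <termNote>, after the <termModifiedBy> element: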

<term> 
<termId>MANUAL000399</termId> 
<termUpdate>Add</termUpdate> 
<termName>care</termName> 
<termType>Pt</termType> 
<termStatus>Active</termStatus> 
<termApproval>Approved</termApproval> 
<termCreatedDate>20120618T14:38:20</termCreatedDate> 
<termCreatedBy>admin</termCreatedBy> 
<termModifiedDate>20120618T14:40:41</termModifiedDate> 
<termModifiedBy>admin</termModifiedBy> 
<termNote label="Short">Short</termNote> 
</term>

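Can anyone advise on the best way to do this? I have found regular expressions on here, but my problem is applying them. I found someone suggesting /\b[a-zA-Z]{5,}\b/, but I don't know how to write a script that would then insert the termNote when it matches.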

Source

2012-09-11 lobe

It's hard not to just link to this: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 –

What should I use instead of regular expressions? As I said, I'm not a programmer and don't know about these things. Thanks – lobe

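I'm sorry, I'm not going to answer your question, but I can offer a few remarks. First, if as a non-programmer you need to do the kind of thing you show here, then you need to become a programmer. Pick Python or Ruby and learn it. Second, your question isn't clear. Improve the wording and I'm sure the XML people here will answer. Third, don't parse XML with regexes unless you have a specific, known set of documents that happen to be parseable with regular expressions. Regular expressions are not a golden hammer. –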

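This transformation can be done with a simple XSLT stylesheet. (XSLT is a language that non-programmers often get on with better than programmers do. A stylesheet is basically a set of transformation rules: when you see something that matches X, replace it with Y. Of course, once you've mastered XSLT, you will be able to call yourself a programmer.)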

First, some boilerplate:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
<xsl:strip-space elements="*"/> <!-- removes whitespace from the input --> 
<xsl:output indent="yes"/>  <!-- adds whitespace to the output -->

Then a default template rule that copies things unchanged when there is no more specific rule:

<xsl:template match="*">
    <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
    </xsl:copy>
</xsl:template>

Then a template rule that matches the "short" condition:

<xsl:template match="term[string-length(termName) &lt; 5]">
    <term>
        <xsl:copy-of select="*"/>
        <termNote label="Short">Short</termNote>
    </term>
</xsl:template>

Then the end:

</xsl:stylesheet>

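You should be able to run this with any XSLT processor; there are plenty available. If nothing else comes to mind, download KernowForSaxon (from SourceForge), which is a very simple GUI front end to my Saxon processor.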
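For readers who would rather script this than run a stylesheet, the same rule can probably be sketched in Python using only the standard library's xml.etree.ElementTree parser (no regular expressions involved). This is a rough equivalent, not a replacement for the stylesheet above. The function name add_short_notes, the <terms> wrapper element, and the file name terms.xml are made up for illustration, since the question does not show the file's root element. It also assumes <termModifiedBy> is the last child of each <term>, as in the sample, so that appending the note places it directly after it.

```python
import xml.etree.ElementTree as ET


def add_short_notes(root, limit=5):
    """Append <termNote label="Short">Short</termNote> to every <term>
    whose <termName> text has fewer than `limit` characters.
    Returns the number of terms that were changed."""
    changed = 0
    for term in root.iter("term"):
        # A missing <termName> counts as length 0, like string-length() in XSLT.
        name = term.findtext("termName", default="")
        if len(name) < limit:
            # SubElement appends as the last child of <term>.
            note = ET.SubElement(term, "termNote", {"label": "Short"})
            note.text = "Short"
            changed += 1
    return changed


# Small made-up sample; <terms> is a hypothetical root element.
sample = """<terms>
  <term><termName>care</termName><termModifiedBy>admin</termModifiedBy></term>
  <term><termName>caretaker</termName><termModifiedBy>admin</termModifiedBy></term>
</terms>"""

root = ET.fromstring(sample)
changed = add_short_notes(root)
print(changed, "term(s) updated")
print(ET.tostring(root, encoding="unicode"))

# For a real file (hypothetical name), something along these lines:
#   tree = ET.parse("terms.xml")
#   add_short_notes(tree.getroot())
#   tree.write("terms_out.xml", encoding="utf-8", xml_declaration=True)
```

Only "care" (4 characters) gets a note here; "caretaker" is left alone. For a truly huge file the XSLT route, or a streaming parser, may be the better fit, since this sketch loads the whole document into memory.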

Source

2012-09-11 11:30:19

Wow, this is great, it's exactly right! I can't tell you how grateful I am, thank you so much. – lobe
