2013-04-23

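Pyparsing token source range

How can I programmatically extract the source range (start and end positions) matched by a grammar rule in Pyparsing? I cannot use setParseAction on this (sub)rule, because I am inspecting the parse tree contents inside another callback that is itself registered as a parse action. I am also missing a way to print what parseString() returns in a human-readable form, similar to pprint. I know about toList(), but I am not sure whether that method strips useful information such as context.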

Answers


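Here is some sample code showing how to capture the location of a parse expression, and how to use dump() to list the parsed tokens together with any named results: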

from pyparsing import * 

# use an Empty to define a null token that just records its 
# own location in the input string 
locnMarker = Empty().leaveWhitespace().setParseAction(lambda s,l,t: l) 

# define an example expression and save the start and end locations
markedInteger = locnMarker + Word(nums)('value') + locnMarker 

# assign named results for the start and end values, 
# and pop them out of the actual list of tokens 
def markStartAndEnd(s,l,t): 
    t['start'],t['end'] = t.pop(0),t.pop(-1) 
markedInteger.setParseAction(markStartAndEnd) 

# find all integers in the source string, and print 
# their value, start, and end locations; use dump() 
# to show the parsed tokens and any named results 
source = "ljsdlj2342 sadlsfj132 sldfj12321 sldkjfsldj 1232" 
for integer in markedInteger.searchString(source): 
    print(integer.dump())

Prints:

['2342'] 
- end: 11 
- start: 6 
- value: 2342 
['132'] 
- end: 22 
- start: 18 
- value: 132 
['12321'] 
- end: 33 
- start: 27 
- value: 12321 
['1232'] 
- end: 48 
- start: 44 
- value: 1232 

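This feature has since been added to pyparsing as the helper function 'locatedExpr'. In this example, 'markedInteger' would be replaced with 'locatedExpr(Word(nums))', and no parse actions need to be added. – PaulMcG 2014-11-20 22:50:06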
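
As a minimal sketch of that replacement, assuming an installed pyparsing release that still exports `locatedExpr` (pyparsing 3 also offers a `Located` class for the same purpose): the helper wraps the expression in a group holding the named results `locn_start`, `value`, and `locn_end`. The variable names `located_integer`, `matches`, and `located` are only illustrative.

```python
from pyparsing import Word, nums, locatedExpr

source = "ljsdlj2342 sadlsfj132 sldfj12321 sldkjfsldj 1232"

# locatedExpr wraps the expression in a group that carries three
# named results: locn_start, value, and locn_end
located_integer = locatedExpr(Word(nums))

matches = located_integer.searchString(source)
for match in matches:
    located = match[0]  # the group produced by locatedExpr
    # dump() lists the tokens followed by the named results
    print(located.dump())
    # the recorded locations can be used to slice the original string
    print(source[located.locn_start:located.locn_end])
```

Because the locations are plain integers into the source string, they can be read from inside any enclosing parse action as well, which is the situation described in the question.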


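Also take extra care if the input text contains tab characters. By default, pyparsing calls 'string.expandtabs()' on the input string before it starts parsing, which changes the expected locations within the input string. To suppress this default behavior, call 'expr.parseWithTabs()' before calling expr.parseString(inputStringContainingTabs). – PaulMcG 2014-11-20 22:54:10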
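
The effect of tab expansion on reported locations can be illustrated with a small sketch. The helper `make_expr` and the result names `word`, `num_start`, and `num` are hypothetical, introduced only for this example; the location marker is the same `Empty()` technique used in the answer.

```python
from pyparsing import Empty, Word, alphas, nums

def make_expr():
    # zero-width marker whose parse action returns its own location
    here = Empty().setParseAction(lambda s, l, t: l)
    return Word(alphas)("word") + here("num_start") + Word(nums)("num")

source = "abc\t123"

# default behavior: the tab is expanded to spaces before parsing, so the
# location refers to the expanded string, not to the original one
default_result = make_expr().parseString(source)

# with parseWithTabs() the input is parsed as-is, so the location is an
# index into the original string
tab_expr = make_expr()
tab_expr.parseWithTabs()
tabs_result = tab_expr.parseString(source)

print(default_result.num_start, source.expandtabs().index("123"))
print(tabs_result.num_start, source.index("123"))
```

If locations are later used to slice or highlight the original text, the second form is the one that lines up with it.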