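Try Treetop. It is a Ruby-like DSL for describing grammars. Parsing the string you've given should be quite easy, and because you are using a real parser, you can easily extend your grammar later.
An example grammar for the type of string you want to parse (save as sentences.treetop):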
grammar Sentences
  rule sentence
    # A sentence is a combination of zero or more expressions.
    expression* <Sentence>
  end

  rule expression
    # An expression is either a parenthesised expression or a literal.
    parenthesised / literal
  end

  rule parenthesised
    # A parenthesised expression contains one or more sentences.
    "(" (multiple / sentence) ")" <Parenthesised>
  end

  rule multiple
    # Multiple sentences are delimited by a pipe.
    sentence "|" (multiple / sentence) <Multiple>
  end

  rule literal
    # A literal string consists of letters (a-z, A-Z) and/or spaces.
    # Expand the character class to allow other characters too.
    [a-zA-Z ]+ <Literal>
  end
end
The grammar above needs an accompanying file that defines the classes which give us access to the node values (save as sentence_nodes.rb).
class Sentence < Treetop::Runtime::SyntaxNode
  # Append every string in b to every string in a.
  def combine(a, b)
    return b if a.empty?
    a.inject([]) do |values, val_a|
      values + b.collect { |val_b| val_a + val_b }
    end
  end

  # All strings this sentence can expand to.
  def values
    elements.inject([]) do |values, element|
      combine(values, element.values)
    end
  end
end

class Parenthesised < Treetop::Runtime::SyntaxNode
  # elements[1] is the node between "(" and ")".
  def values
    elements[1].values
  end
end

class Multiple < Treetop::Runtime::SyntaxNode
  # elements[0] and elements[2] are the alternatives on either side of "|".
  def values
    elements[0].values + elements[2].values
  end
end

class Literal < Treetop::Runtime::SyntaxNode
  def values
    [text_value]
  end
end
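To see what combine does on its own, here is a minimal sketch that lifts it out of the Sentence class and feeds it plain arrays. The variable names (prefixes, choices, suffixes) are made up for illustration and are not part of the answer's code.

```ruby
# Standalone copy of Sentence#combine, for illustration only.
def combine(a, b)
  return b if a.empty?
  a.inject([]) do |values, val_a|
    values + b.collect { |val_b| val_a + val_b }
  end
end

# Hypothetical inputs, mimicking the values of three consecutive expressions.
prefixes = ["maybe "]
choices  = ["this is", "that was"]
suffixes = [" some"]

# Sentence#values folds combine over its elements in exactly this way,
# starting from an empty array.
result = [prefixes, choices, suffixes].inject([]) do |values, element_values|
  combine(values, element_values)
end

puts result
```

For non-empty arrays this is the same as a.product(b).map(&:join), which may be an easier way to think about it: each step multiplies the number of sentences by the number of alternatives.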
The following example program shows that it is quite simple to parse the example sentence you have given.
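require "rubygems"
require "treetop"
# Ruby 1.9+ no longer searches the current directory, so load the node classes by relative path.
require_relative "sentence_nodes"

str = 'maybe (this is|that was) some' +
      ' ((nice|ugly) (day|night)|(strange (weather|time)))'

# Compiles sentences.treetop and defines SentencesParser.
Treetop.load "sentences"

if sentence = SentencesParser.new.parse(str)
  puts sentence.values
else
  puts "Parse error"
end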
The output of this program is:
maybe this is some nice day
maybe this is some nice night
maybe this is some ugly day
maybe this is some ugly night
maybe this is some strange weather
maybe this is some strange time
maybe that was some nice day
maybe that was some nice night
maybe that was some ugly day
maybe that was some ugly night
maybe that was some strange weather
maybe that was some strange time
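As a cross-check on the listing above that does not need the Treetop gem, here is a rough hand-written recursive-descent sketch of the same expansion. It is a different technique from the Treetop grammar, not a replacement for it, and the names expand, parse_sentence and parse_expression are invented for this sketch.

```ruby
require "strscan"

# sentence := expression*   (stops at "|", ")" or end of input)
def parse_sentence(scanner)
  values = []
  until scanner.eos? || scanner.check(/[|)]/)
    expansions = parse_expression(scanner)
    # Append every new expansion to every string collected so far.
    values = values.empty? ? expansions : values.product(expansions).map(&:join)
  end
  values
end

# expression := "(" sentence ("|" sentence)* ")" | literal
def parse_expression(scanner)
  if scanner.scan(/\(/)
    alternatives = parse_sentence(scanner)
    while scanner.scan(/\|/)
      alternatives += parse_sentence(scanner)
    end
    scanner.scan(/\)/) or raise ArgumentError, "expected ')' at position #{scanner.pos}"
    alternatives
  else
    literal = scanner.scan(/[a-zA-Z ]+/) or
      raise ArgumentError, "unexpected character at position #{scanner.pos}"
    [literal]
  end
end

# Hypothetical entry point: returns every sentence the pattern can expand to.
def expand(str)
  scanner = StringScanner.new(str)
  values = parse_sentence(scanner)
  raise ArgumentError, "unexpected '#{scanner.peek(1)}' at position #{scanner.pos}" unless scanner.eos?
  values
end

str = 'maybe (this is|that was) some' +
      ' ((nice|ugly) (day|night)|(strange (weather|time)))'
puts expand(str)
```

Extending this by hand gets awkward quickly, which is the argument for describing the syntax in a grammar file instead.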
You can also access the syntax tree:
p sentence
The output is here.
There you have it: an extensible parsing solution that should do what you want in about 50 lines of code. Does that help?
Thanks, I have already read the examples online, but I don't understand how to read nested parentheses... – astropanic 2010-03-04 14:56:11
Thank you! You are my hero :) – astropanic 2010-03-04 19:55:34
http://www.bestechvideos.com/2008/07/18/rubyconf-2007-treetop-syntactic-analysis-with-ruby, nice video – astropanic 2010-03-05 06:37:23