我需要解析的规则手册中包含“demo.rb”文件:如何调试ANTLR4语法无关的/不匹配的输入错误
rulebook Titanic-Normalization {
version 1
meta {
description "Test"
source "my-rules.xslx"
user "joltie"
}
rule remove-first-line {
description "Removes first line when offset is zero"
when(present(offset) && offset == 0) then {
filter-row-if-true true;
}
}
}
我写的ANTLR4语法文件Rulebook.g4像下面。目前,它可以很好地解析* .rb文件,但遇到“表达式”/“语句”规则时会引发意外错误。
grammar Rulebook;
rulebookStatement
: KWRulebook
(GeneralIdentifier | Identifier)
'{'
KWVersion
VersionConstant
metaStatement
(ruleStatement)+
'}'
;
metaStatement
: KWMeta
'{'
KWDescription
StringLiteral
KWSource
StringLiteral
KWUser
StringLiteral
'}'
;
ruleStatement
: KWRule
(GeneralIdentifier | Identifier)
'{'
KWDescription
StringLiteral
whenThenStatement
'}'
;
whenThenStatement
: KWWhen '(' expression ')'
KWThen '{' statement '}'
;
primaryExpression
: GeneralIdentifier
| Identifier
| StringLiteral+
| '(' expression ')'
;
postfixExpression
: primaryExpression
| postfixExpression '[' expression ']'
| postfixExpression '(' argumentExpressionList? ')'
| postfixExpression '.' Identifier
| postfixExpression '->' Identifier
| postfixExpression '++'
| postfixExpression '--'
;
argumentExpressionList
: assignmentExpression
| argumentExpressionList ',' assignmentExpression
;
unaryExpression
: postfixExpression
| '++' unaryExpression
| '--' unaryExpression
| unaryOperator castExpression
;
unaryOperator
: '&' | '*' | '+' | '-' | '~' | '!'
;
castExpression
: unaryExpression
| DigitSequence // for
;
multiplicativeExpression
: castExpression
| multiplicativeExpression '*' castExpression
| multiplicativeExpression '/' castExpression
| multiplicativeExpression '%' castExpression
;
additiveExpression
: multiplicativeExpression
| additiveExpression '+' multiplicativeExpression
| additiveExpression '-' multiplicativeExpression
;
shiftExpression
: additiveExpression
| shiftExpression '<<' additiveExpression
| shiftExpression '>>' additiveExpression
;
relationalExpression
: shiftExpression
| relationalExpression '<' shiftExpression
| relationalExpression '>' shiftExpression
| relationalExpression '<=' shiftExpression
| relationalExpression '>=' shiftExpression
;
equalityExpression
: relationalExpression
| equalityExpression '==' relationalExpression
| equalityExpression '!=' relationalExpression
;
andExpression
: equalityExpression
| andExpression '&' equalityExpression
;
exclusiveOrExpression
: andExpression
| exclusiveOrExpression '^' andExpression
;
inclusiveOrExpression
: exclusiveOrExpression
| inclusiveOrExpression '|' exclusiveOrExpression
;
logicalAndExpression
: inclusiveOrExpression
| logicalAndExpression '&&' inclusiveOrExpression
;
logicalOrExpression
: logicalAndExpression
| logicalOrExpression '||' logicalAndExpression
;
conditionalExpression
: logicalOrExpression ('?' expression ':' conditionalExpression)?
;
assignmentExpression
: conditionalExpression
| unaryExpression assignmentOperator assignmentExpression
| DigitSequence // for
;
assignmentOperator
: '=' | '*=' | '/=' | '%=' | '+=' | '-=' | '<<=' | '>>=' | '&=' | '^=' | '|='
;
expression
: assignmentExpression
| expression ',' assignmentExpression
;
statement
: expressionStatement
;
expressionStatement
: expression+ ';'
;
KWRulebook: 'rulebook';
KWVersion: 'version';
KWMeta: 'meta';
KWDescription: 'description';
KWSource: 'source';
KWUser: 'user';
KWRule: 'rule';
KWWhen: 'when';
KWThen: 'then';
KWTrue: 'true';
KWFalse: 'false';
fragment
LeftParen : '(';
fragment
RightParen : ')';
fragment
LeftBracket : '[';
fragment
RightBracket : ']';
fragment
LeftBrace : '{';
fragment
RightBrace : '}';
Identifier
: IdentifierNondigit
( IdentifierNondigit
| Digit
)*
;
GeneralIdentifier
: Identifier
('-' Identifier)+
;
fragment
IdentifierNondigit
: Nondigit
//| // other implementation-defined characters...
;
VersionConstant
: DigitSequence ('.' DigitSequence)*
;
DigitSequence
: Digit+
;
fragment
Nondigit
: [a-zA-Z_]
;
fragment
Digit
: [0-9]
;
StringLiteral
: '"' SCharSequence? '"'
| '\'' SCharSequence? '\''
;
fragment
SCharSequence
: SChar+
;
fragment
SChar
: ~["\\\r\n]
| '\\\n' // Added line
| '\\\r\n' // Added line
;
Whitespace
: [ \t]+
-> skip
;
Newline
: ( '\r' '\n'?
| '\n'
)
-> skip
;
BlockComment
: '/*' .*? '*/'
-> skip
;
LineComment
: '//' ~[\r\n]*
-> skip
;
我测试的规则手册解析器与单元测试象下面这样:
public void testScanRulebookFile() throws IOException {
String fileName = "C:\\rulebooks\\demo.rb";
FileInputStream fis = new FileInputStream(fileName);
// create a CharStream that reads from standard input
CharStream input = CharStreams.fromStream(fis);
// create a lexer that feeds off of input CharStream
RulebookLexer lexer = new RulebookLexer(input);
// create a buffer of tokens pulled from the lexer
CommonTokenStream tokens = new CommonTokenStream(lexer);
// create a parser that feeds off the tokens buffer
RulebookParser parser = new RulebookParser(tokens);
RulebookStatementContext context = parser.rulebookStatement();
// WhenThenStatementContext context = parser.whenThenStatement();
System.out.println(context.toStringTree(parser));
// ParseTree tree = parser.getContext(); // begin parsing at init rule
// System.out.println(tree.toStringTree(parser)); // print LISP-style tree
}
对于“demo.rb”如上述,解析器得到了错误如下。我还以toStringTree的形式打印RulebookStatementContext。
line 12:25 mismatched input '&&' expecting ')'
(rulebookStatement rulebook Titanic-Normalization { version 1 (metaStatement meta { description "Test" source "my-rules.xslx" user "joltie" }) (ruleStatement rule remove-first-line { description "Removes first line when offset is zero" (whenThenStatement when ((expression (assignmentExpression (conditionalExpression (logicalOrExpression (logicalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (castExpression (unaryExpression (postfixExpression (postfixExpression (primaryExpression present)) ((argumentExpressionList (assignmentExpression (conditionalExpression (logicalOrExpression (logicalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (castExpression (unaryExpression (postfixExpression (primaryExpression offset)))))))))))))))))))))))))))))))))) && offset == 0) then { filter-row-if-true true ;) }) })
我也写单元测试,以测试等"when (offset == 0) then {\n" + "filter-row-if-true true;\n" + "}\n"
短输入上下文调试问题。但它仍然有像错误:
line 1:16 mismatched input '0' expecting {'(', '++', '--', '&&', '&', '*', '+', '-', '~', '!', Identifier, GeneralIdentifier, DigitSequence, StringLiteral}
line 2:19 extraneous input 'true' expecting {'(', '++', '--', '&&', '&', '*', '+', '-', '~', '!', ';', Identifier, GeneralIdentifier, DigitSequence, StringLiteral}
有两个一天的尝试,我没有得到任何进展。问题是只要以上,请有人给我一些建议如何调试ANTLR4语法无关/不匹配的输入错误。
感谢您的详细帮助。我已经尝试了1. 2.5分,但我认为我应该尝试更多地分解输入和语法。如果你有空,你能帮我解释一下这个具体的例子吗? – Zhenglinj
你有没有看过解析树? – Raven
是的。我在第5点中指出了解析树。我也在stackoverflow中读取了许多相同的问题,但结果很奇怪。我试图用简单的语法和简单的输入来编写“whenThenStatement”规则单元测试。 – Zhenglinj