From: Robert Varga
Date: Sat, 16 Oct 2021 14:37:07 +0000 (+0200)
Subject: Update ANTLR grammar documentation
X-Git-Tag: v8.0.0~187
X-Git-Url: https://git.opendaylight.org/gerrit/gitweb?a=commitdiff_plain;h=14a6cffdf041e14ea2940306502fd762f7decc6c;p=yangtools.git

Update ANTLR grammar documentation

We have a mis-reference and its parser section needs point out
that the end result is not a complete YANG structure as per ABNF,
but rather the equivalent of lexer stream, which is interpreted
in Java code.

Change-Id: I36619c864feba8b1583c83bc2ab2cab465b6bba5
Signed-off-by: Robert Varga
---

diff --git a/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementLexer.g4 b/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementLexer.g4
index 94ebabc6ed..fbb18d0ca8 100644
--- a/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementLexer.g4
+++ b/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementLexer.g4
@@ -41,7 +41,7 @@ PLUS : '+' -> type(PLUS);
 // Note that inside a quoted string (Section 6.1.3), these character
 // pairs are never interpreted as the start or end of a comment.
 //
-// What constitutes 'end of the line' is not specified in RFC7950, hence
+// What constitutes 'end of the line' is not specified in RFC6020, hence
 // we are using RFC7950-clarified definition. Note we also need to handle
 // the case of EOF, as the user may not have included a newline.
 LINE_COMMENT : '//' .*? '\r'? ('\n' | EOF) -> skip;
diff --git a/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementParser.g4 b/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementParser.g4
index 69148f618c..1b826460f5 100644
--- a/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementParser.g4
+++ b/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementParser.g4
@@ -7,6 +7,11 @@
 //
 parser grammar YangStatementParser;
 
+// This ANTLR4 grammar serves as the lexer for the actual YANG parser, which
+// is built on top of it in Java code. The reason for this split is that we
+// need to perform string interpretation in a way which takes into account
+// yang-version, since RFC7950 (YANG 1.1) is stricter (and saner) when it comes
+// to escaping and similar.
 options {
     tokenVocab = YangStatementLexer;
 }
@@ -33,7 +38,7 @@ argument :
     // Quoted string and concatenations thereof. We are sacrificing brewity
     // here to eliminate the need for another parser construct. Quoted strings
     // account for about 50% of all arguments encountered -- hence the added
-    // parse tree indirection is very visible.
+    // parse tree indirection is very visible in terms of memory usage.
     (DQUOT_STRING? DQUOT_END | SQUOT_STRING? SQUOT_END)
     (SEP* PLUS SEP* (DQUOT_STRING? DQUOT_END | SQUOT_STRING? SQUOT_END))*
     |
@@ -54,7 +59,7 @@ unquotedString :
     // having one level for each such concatenation. For a test case imagine
     // how "a*b/c*d*e**f" would get parsed with a recursive grammar.
     //
-    // Now we cannot do much aboud tokenization, but we can statically express
+    // Now we cannot do much about tokenization, but we can statically express
     // the shape we are looking for:
     // so an unquoted string may optionally start with a single SLASH or any