From: Robert Varga <robert.varga@pantheon.tech>
Date: Sat, 16 Oct 2021 14:37:07 +0000 (+0200)
Subject: Update ANTLR grammar documentation
X-Git-Tag: v8.0.0~187
X-Git-Url: https://git.opendaylight.org/gerrit/gitweb?a=commitdiff_plain;h=14a6cffdf041e14ea2940306502fd762f7decc6c;p=yangtools.git

Update ANTLR grammar documentation

We have a mis-reference, and the parser grammar needs to point out that
the end result is not a complete YANG structure as per ABNF, but rather
the equivalent of a lexer stream, which is interpreted in Java code.

Change-Id: I36619c864feba8b1583c83bc2ab2cab465b6bba5
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
---
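The version-dependent string interpretation mentioned in the parser comment
can be sketched roughly as follows. This is a minimal illustration, not the
yangtools implementation: the class and method names are hypothetical, and
the lenient YANG 1 behavior (keeping an unrecognized escape verbatim) is an
assumption. Only the RFC7950 rule that a backslash in a double-quoted string
must be followed by n, t, a double quote or another backslash is taken from
the RFC.

```java
// Hypothetical sketch: names are illustrative and not part of the yangtools API.
final class YangStringSketch {
    private YangStringSketch() {
        // Utility class
    }

    // Interprets the body of a double-quoted YANG string (quotes already removed).
    // With yang11 == true the RFC7950 rule applies: a backslash may only be
    // followed by n, t, " or \. With yang11 == false this sketch ASSUMES a
    // lenient reading of RFC6020 and keeps an unrecognized escape verbatim.
    static String unescape(final String body, final boolean yang11) {
        final StringBuilder sb = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); i++) {
            final char ch = body.charAt(i);
            if (ch != '\\') {
                sb.append(ch);
                continue;
            }
            if (i + 1 == body.length()) {
                throw new IllegalArgumentException("Dangling backslash at offset " + i);
            }

            // Consume the character following the backslash
            final char next = body.charAt(++i);
            switch (next) {
                case 'n':
                    sb.append('\n');
                    break;
                case 't':
                    sb.append('\t');
                    break;
                case '"':
                    sb.append('"');
                    break;
                case '\\':
                    sb.append('\\');
                    break;
                default:
                    if (yang11) {
                        throw new IllegalArgumentException(
                            "Illegal escape \\" + next + " at offset " + (i - 1));
                    }
                    // Assumed YANG 1 behavior: pass the sequence through unchanged
                    sb.append('\\').append(next);
            }
        }
        return sb.toString();
    }

    public static void main(final String[] args) {
        // Same input, interpreted under the assumed YANG 1 rules
        System.out.println(unescape("a\\qb", false));
    }
}
```

Because the grammar only hands over tokens, a check like this has to live in
the Java layer, where the module's yang-version is known.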

diff --git a/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementLexer.g4 b/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementLexer.g4
index 94ebabc6ed..fbb18d0ca8 100644
--- a/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementLexer.g4
+++ b/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementLexer.g4
@@ -41,7 +41,7 @@ PLUS : '+' -> type(PLUS);
 //   Note that inside a quoted string (Section 6.1.3), these character
 //   pairs are never interpreted as the start or end of a comment.
 //
-// What constitutes 'end of the line' is not specified in RFC7950, hence
+// What constitutes 'end of the line' is not specified in RFC6020, hence
 // we are using RFC7950-clarified definition. Note we also need to handle
 // the case of EOF, as the user may not have included a newline.
 LINE_COMMENT : '//' .*? '\r'? ('\n' | EOF) -> skip;
diff --git a/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementParser.g4 b/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementParser.g4
index 69148f618c..1b826460f5 100644
--- a/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementParser.g4
+++ b/parser/yang-parser-antlr/src/main/antlr4/org/opendaylight/yangtools/yang/parser/antlr/YangStatementParser.g4
@@ -7,6 +7,11 @@
 //
 parser grammar YangStatementParser;
 
+// This ANTLR4 grammar serves as the lexer for the actual YANG parser, which
+// is built on top of it in Java code. The reason for this split is that we
+// need to perform string interpretation in a way which takes into account
+// yang-version, since RFC7950 (YANG 1.1) is stricter (and saner) when it comes
+// to escaping and similar.
 options {
     tokenVocab = YangStatementLexer;
 }
@@ -33,7 +38,7 @@ argument :
     // Quoted string and concatenations thereof. We are sacrificing brewity
     // here to eliminate the need for another parser construct. Quoted strings
     // account for about 50% of all arguments encountered -- hence the added
-    // parse tree indirection is very visible.
+    // parse tree indirection is very visible in terms of memory usage.
     (DQUOT_STRING? DQUOT_END | SQUOT_STRING? SQUOT_END)
     (SEP* PLUS SEP* (DQUOT_STRING? DQUOT_END | SQUOT_STRING? SQUOT_END))*
     |
@@ -54,7 +59,7 @@ unquotedString :
     // having one level for each such concatenation. For a test case imagine
     // how "a*b/c*d*e**f" would get parsed with a recursive grammar.
     //
-    // Now we cannot do much aboud tokenization, but we can statically express
+    // Now we cannot do much about tokenization, but we can statically express
     // the shape we are looking for:
 
     //   so an unquoted string may optionally start with a single SLASH or any