[ProgSoc] Context-affinitive parsing w/ context-free grammar
John Elliot
jj5 at jj5.net
Thu Sep 24 00:28:09 EST 2009
I have a tricky problem that someone might be able to offer some advice
on. I'm working on a parser using Coco/R [1]. I've read the user manual
[2]. Say I have an example program in my programming language like this
[a], where I list major items "foo" and "bar" and their sub items
"a", "b", "reserved_word", etc.
Using an ATG like this [b], an executable like this [c], and compiling
like this [d], how do I go about supporting the sub item named
"reserved_word"?
What I've done in the meantime is rewrite the ATG to support things
the C#-esque way, like this [e], but I think it would be cool if I could
support the use of the reserved word as the name of a sub item. The only
way I can think of to do that would be to take over the scanner and
parse manually for sub items when in the context of a "reserved_word".
That would be tricky, though, especially since I'd also have to support
pragmas, comments, etc., and it might be harder than necessary. Is
there some way to use Coco/R and the ATG to support context-affinitive
parsing, wherein being inside a "reserved_word" block would exclude the
possibility of encountering another "reserved_word", thus enabling the
use of the symbol "reserved_word" as a sub item name?
Love,
Confused in Sydney.
[a:example.syn]
reserved_word foo {
  sub_item a,
  sub_item b,
  sub_item reserved_word,
}
reserved_word bar {
  sub_item x,
  sub_item y,
  sub_item z
}
[b:context.atg]
COMPILER Example
CHARACTERS
  letter = "abcdefghijklmnopqrstuvwxyz".
  cr = '\r'.
  lf = '\n'.
  tab = '\t'.
TOKENS
  ident = letter {letter}.
  // reserved words:
  lbrace = '{'.
  rbrace = '}'.
  comma = ','.
  reserved_word = "reserved_word".
  sub_item = "sub_item".
IGNORE cr + lf + tab
PRODUCTIONS
////////////////////////////////////////////////////////////////////
Example
=
  // Repeat so that both the "foo" and "bar" blocks in [a] are parsed.
  { ReservedWord }
.
////////////////////////////////////////////////////////////////////
ReservedWord
=                 (. String name = null; .)
  reserved_word   (. Console.WriteLine( "Matched reserved word." ); .)
  ident           (. name = t.val; Console.WriteLine( "r:" + t.val ); .)
  lbrace
  {
    SubItem [ comma ]
  }
  rbrace
.
////////////////////////////////////////////////////////////////////
SubItem
=                 (. String name = null; .)
  sub_item        (. Console.WriteLine( "Matched sub item." ); .)
  ident           (. name = t.val; Console.WriteLine( "s:" + t.val ); .)
.
END Example.
[c:context.cs]
using System;
namespace example {
  internal static class EntryPoint {
    internal static void Main ( String[] args ) {
      Scanner scanner = new Scanner( args[0] );
      Parser parser = new Parser( scanner );
      parser.Parse();
    }
  }
}
[d:build.sh]
#!/bin/bash
mono coco/coco.exe context.atg -namespace example
gmcs -debug+ -warn:2 context.cs Scanner.cs Parser.cs
mono context.exe example.syn
[e:example.fix]
reserved_word foo {
  sub_item a,
  sub_item b,
  sub_item @reserved_word,
}
reserved_word bar {
  sub_item x,
  sub_item y,
  sub_item z
}
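One possible way to get this without taking over the scanner, offered as
an untested sketch rather than a confirmed answer: since "reserved_word"
is already declared as a token in [b], the SubItem production could
accept either an ident or the reserved_word token in the name position.
The label [f:subitem.atg] is hypothetical, and the sketch assumes that
Coco/R reports no LL(1) conflict here, on the reasoning that a sub item
always starts with the sub_item token and so the name position is never
confused with the start of a new "reserved_word" block.
[f:subitem.atg]
// Hypothetical replacement for the SubItem production in [b]; untested.
SubItem
=                 (. String name = null; .)
  sub_item        (. Console.WriteLine( "Matched sub item." ); .)
  // Assumption: in name position the keyword token is treated as a name.
  ( ident
  | reserved_word
  )               (. name = t.val; Console.WriteLine( "s:" + t.val ); .)
.
With this in place, the example in [a] would be expected to parse without
the "@" escape used in [e]. The same alternative could list sub_item as
well if that word should also be usable as a name.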
[1] http://ssw.jku.at/coco/
[2] http://ssw.jku.at/Coco/Doc/UserManual.pdf