Thread: "ocaml_beginners"::[] Ocamlyacc isn't liking me...
| "ocaml_beginners"::[]
Ocamlyacc isn't liking me... |
  United States |
2008-03-11 13:19:59 |
|
It zips through all of the tokens and throws a Parsing error once it
hits the end.
Where can I find some guidelines to help in constructing my grammar?
I've found some great documentation on ocamlyacc in general, but I'm
not running across much in the way of grammar guidelines...
I know I'm being vague and I apologize for that. I might have the
source to a point soon where I can share it, so maybe I'll re-post in
a few days with some sample source.
Also... how close is Bison to ocamlyacc? Wouldn't the concepts be
close enough to help in building the grammar? I haven't looked at
their documentation, so I don't know if it has what I'm looking for
anyway.
Should I try to get my hands on this 'Dragon' book I've seen mentioned
everywhere?
-Rich
P.s. - If I construct an LR(1) grammar, that's good enough for the
LALR(1) of ocamlyacc - right?
|
| Re: "ocaml_beginners"::[]
Ocamlyacc isn't liking me... |
  United States |
2008-03-11 13:34:24 |
|
On 11-Mar-08, at 2:19 PM, Richard Lyman wrote:
> Should I try to get my hands on this 'Dragon' book I've seen mentioned
> everywhere?
>
If you're talking about the Aho/Ullman Dragon book, it's very...
dense. I read it (and even got my copy signed), but I've been told by
friends who make compilers for a living that it is now both a smidge
antiquated _and_ hard to read on top of that.
My personal advice would be to read it for posterity's sake at some
point, but it's probably not your best bet if you are a novice
compiler constructor.
Apparently, there is a new edition available as of 2006. I can't
comment on that one, it may be much better now.
I haven't played much with ocamlyacc as of yet, but I've seen talk
about it on this list before, so I'm sure someone smarter and more
experienced than me will come along shortly and give you some hints.
-Mike
>
>
> -Rich
> P.s. - If I construct an LR(1) grammar, that's good enough for the
> LALR(1) of ocamlyacc - right?
>
>
|
| Re: "ocaml_beginners"::[]
Ocamlyacc isn't liking me... |
  Germany |
2008-03-11 14:32:04 |
|
Hello Richard,
Quoting Richard Lyman <richard.lyman@gmail.com>:
> It zips through all of the tokens and throws a Parsing error once it
> hits the end.
[...]
Well, perhaps you missed something that is needed to make the grammar correct?
Without an example (including sources as well as some input examples),
it is not that easy to help you.
>
> Where can I find some guidelines to help in constructing my grammar?
[...]
I would recommend three things:
1) Yacc-tutorial (the original Unix-stuff from some decades ago,
AT&T Bell labs)
2) Ocamlyacc tutorial by SooHyoung Oh
3) The book "lex & yacc" from O'Reilly, if you are not a
C-allergic person
You can find 1) (together with other tutorials) here:
http://dinosaur.compilertools.net/yacc/index.html
You can find 2) here:
http://plus.kaist.ac.kr/~shoh/ocaml/ocamllex-ocamlyacc/ocamlyacc-tutorial/
You can find 3) here:
http://www.oreilly.com/catalog/lex/index.html
In the book there are examples of how to create grammars
and improve them, as well as hints for solving
typical problems (e.g. if-then vs if-then-else parsing).
IMHO this book is a good starting point.
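To make the if-then vs if-then-else problem concrete, here is a minimal sketch in plain OCaml rather than yacc. It is a hand-written recursive-descent parser for a toy statement language. The types `token` and `stmt` and the function `parse_stmt` are made up for illustration and do not come from the book. It takes an ELSE as soon as it sees one, which is the same outcome yacc's default "prefer shift" resolution gives for the dangling else: the ELSE attaches to the nearest IF.

```ocaml
(* Toy tokens and syntax tree; all names here are illustrative assumptions. *)
type token = IF | THEN | ELSE | ID of string

type stmt =
  | Simple of string
  | If of string * stmt * stmt option

(* stmt ::= ID | IF ID THEN stmt [ELSE stmt]
   The optional ELSE is consumed greedily, so it binds to the innermost IF. *)
let rec parse_stmt (toks : token list) : stmt * token list =
  match toks with
  | ID name :: rest -> (Simple name, rest)
  | IF :: ID cond :: THEN :: rest ->
    let then_branch, rest = parse_stmt rest in
    (match rest with
     | ELSE :: rest ->
       let else_branch, rest = parse_stmt rest in
       (If (cond, then_branch, Some else_branch), rest)
     | _ -> (If (cond, then_branch, None), rest))
  | _ -> failwith "syntax error"

let () =
  (* if a then if b then x else y *)
  let toks = [IF; ID "a"; THEN; IF; ID "b"; THEN; ID "x"; ELSE; ID "y"] in
  let ast, rest = parse_stmt toks in
  assert (rest = []);
  (* The else branch belongs to the inner "if b", not to the outer "if a". *)
  assert (ast = If ("a", If ("b", Simple "x", Some (Simple "y")), None))
```

In a yacc-style grammar the same input produces a shift/reduce conflict report, and the generated parser still behaves like the sketch unless precedence declarations say otherwise.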
The Dragon book, IMHO, goes very deep into the fundamentals.
If you want to write special stuff, this might be of interest,
otherwise it's disproportionate for practical programming.
I think it goes far beyond the possibilities of lex and yacc.
[...].
>
> Also... how close is Bison to ocamlyacc?
As far as I know, bison is an enhanced yacc, but I have never used it,
so I'm vague here 
> Wouldn't the concepts be
> close enough to help in building the grammar?
Look in 1) and especially (3) for the grammar stuff.
Look in 2) for ocamlyacc-specific things.
Ciao,
Oliver
|
| "ocaml_beginners"::[] Re:
Ocamlyacc isn't liking me... |
  United States |
2008-03-11 15:21:06 |
|
Hi Richard,
> I've found some great documentation on ocamlyacc in general, but I'm
> not running across much in the way of grammar guidelines...
Any tutorial on Yacc/Bison applies also to Ocamlyacc. There's
quite a few of those on the net, plus the O'Reilly book that
has been mentioned already.
In any case, I recommend you try using Menhir instead of Ocamlyacc.
It is mostly backwards compatible, and includes a number of options
that make debugging a lot easier. Moreover, it is already available
on GODI. Here's the homepage:
http://cristal.inria.fr/~fpottier/menhir/
> P.s. - If I construct an LR(1) grammar, that's good enough for the
> LALR(1) of ocamlyacc - right?
Menhir is a LR(1) parser generator. So you won't have to worry
about that issue...
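On the LR(1) vs LALR(1) question itself: an LR(1) grammar is not always LALR(1). Below is a minimal sketch of the classic textbook counterexample, written as ocamlyacc input. The file name and the token and rule names are made up for illustration. An LR(1) generator keeps the state reached after `A C` apart from the state reached after `B C`, whereas an LALR(1) generator merges them and can no longer tell from the lookahead whether `C` should reduce to `x` or to `y`.

```ocaml
/* conflict.mly (hypothetical file name): LR(1) but not LALR(1). */
%token A B C D E EOF
%start main
%type <unit> main
%%
main:
  | s EOF { () }
s:
  | A x D { () }
  | B y D { () }
  | A y E { () }
  | B x E { () }
x:
  | C { () }
y:
  | C { () }
```

Running an LALR(1) generator such as ocamlyacc on a grammar of this shape should report a reduce/reduce conflict, while an LR(1) generator should accept it. Grammars that hit this in practice are fairly rare.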
Cheers,
Dario
|
| Re: "ocaml_beginners"::[] Re:
Ocamlyacc isn't liking me... |
  United States |
2008-03-11 15:37:59 |
|
I'd be interested in Menhir, but their site mentions:
"Warning: the current release is BETA quality"
Everybody has their own definition of Beta quality so it's possible
that theirs allows for more stable code...
Have you run into any problems with that?
-Rich
On Tue, Mar 11, 2008 at 2:21 PM, darioteixeira <darioteixeira@yahoo.com> wrote:
<snip>
>
> In any case, I recommend you try using Menhir instead of Ocamlyacc.
> It is mostly backwards compatible, and includes a number of options
> that make debugging a lot easier. Moreover, it is already available
> on GODI. Here's the homepage:
> http://cristal.inria.fr/~fpottier/menhir/
>
<snip>
>
> Cheers,
> Dario
|
| Re: "ocaml_beginners"::[]
Ocamlyacc isn't liking me... |
  United States |
2008-03-12 22:45:46 |
|
So, what I'm wanting is to transform some XML.
I know there are libraries to handle this task - I'm only using XML
since I thought it would be something simple to lex and parse. I'm
hoping that I will just get a string of numbers 0-9 that correspond
with the non-terminals that were visited. Here's the code that isn't
doing that...
(... and in the end I'm hoping that I can just have regular OCaml code
that writes out to a file or a sqlite DB for each production, instead
of concatenating sequences of number-strings for each production... )
Main.ml:
let linebuf = Lexing.from_string
  "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<root>
<child/>
<child/>
</root>" in
while true do
  try
    Printf.printf "%s\n%!" (Parser.main Lexer.token linebuf)
  with
  | Lexer.Error msg -> print_endline "Lexer error"
  | Parser.Error -> print_endline "Parser error"
done
Lexer.mll:
{
  open Str
  open Parser
  open Printf
  exception Error of string
  let line = ref 0
  let file = ref ""
  let incLine lexbuf =
    let pos = lexbuf.Lexing.lex_curr_p in
    lexbuf.Lexing.lex_curr_p <- {
      pos with
      Lexing.pos_lnum = pos.Lexing.pos_lnum + 1;
      Lexing.pos_bol = pos.Lexing.pos_cnum;
    }
  let usefulError point lexbuf =
    let lpos = (Lexing.lexeme_start_p lexbuf).Lexing.pos_bol in
    let pos = (Lexing.lexeme_start_p lexbuf).Lexing.pos_cnum - lpos in
    let charIndicator =
      if( pos <= 0 ) then
        "^"
      else
        ((String.make pos ' ') ^ "^") in
    let line = (Lexing.lexeme_start_p lexbuf).Lexing.pos_lnum in
    let ic = (open_in !file) in
    let ignore = seek_in ic lpos in
    let context = input_line ic in
    failwith( sprintf "File '%s'\nLine %d\nCharacter %d ('%c')\n%s\n%s"
      !file line pos point context charIndicator )
}
let openBracket = '<'
let eol = ['\n' '\r' '\013']+
let ops = [' ' '\t']*
let reqs = [' ' '\t']+
let alpha = ['A'-'Z' 'a'-'z']
let alphanum = (alpha | ['0'-'9'])
let extraChars = ['.' '-' ':' '\\' '/' ',' ' ' '[' ']' '_' '(' ')' '&'
  ';' '='] (* Dbl Quote is the only thing that should stay out *)
let attributeValue = (alphanum | extraChars )
let attribute = alpha+ ops '=' ops '"' attributeValue* '"'
let attributeList = (reqs attribute)*
rule token = parse
  | "<?" { print_endline "OPEN_PD"; OPEN_PD; token lexbuf }
  | "?>" { print_endline "CLOSE_PD"; CLOSE_PD; token lexbuf }
  | "<" { print_endline "LT"; LT; token lexbuf }
  | (alpha+ as name) { print_endline ("IDENT "^name); IDENT name; token lexbuf }
  | '"' (attributeValue* as value) '"'
      { print_endline ("QUOTE STRING QUOTE "^value); QUOTE; STRING value; QUOTE; token lexbuf }
  | '=' { print_endline "EQUAL"; EQUAL; token lexbuf }
  | "</" { print_endline "CLOSE_NODE"; CLOSE_NODE; token lexbuf }
  | "/>" { print_endline "CLOSE_NODE"; CLOSE_NODE; token lexbuf }
  | ">" { print_endline "GT"; GT; token lexbuf }
  | reqs { print_endline "WHITESPACE"; WHITESPACE; token lexbuf }
  | eol { incLine lexbuf; token lexbuf }
  | eof { print_endline "Reached eof"; EOF; exit 0 }
  | _ as point { usefulError point lexbuf }
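One thing that may be worth checking in the rule above, assuming each action is meant to hand its token to the parser: the value of an ocamllex action is the value of its last expression. In `e1; e2; e3` the values of `e1` and `e2` are thrown away. The sketch below is plain OCaml, not ocamllex output. It uses a hand-written stand-in for the rule with made-up names (`token_discarding`, `token_returning`, `collect`) to show the difference between "compute the token, then recurse" and "return the token".

```ocaml
type token = LT | GT | EOF

(* Mirrors the shape `{ ...; LT; token lexbuf }`: the token value is built,
   discarded, and the lexer recurses, so the caller only ever sees EOF. *)
let rec token_discarding (s : string) (pos : int ref) : token =
  if !pos >= String.length s then EOF
  else begin
    let c = s.[!pos] in
    incr pos;
    match c with
    | '<' -> ignore LT; token_discarding s pos
    | '>' -> ignore GT; token_discarding s pos
    | _ -> token_discarding s pos
  end

(* Returns each token as the final value of the branch; only input that
   yields no token (here: any other character) causes a recursive call. *)
let rec token_returning (s : string) (pos : int ref) : token =
  if !pos >= String.length s then EOF
  else begin
    let c = s.[!pos] in
    incr pos;
    match c with
    | '<' -> LT
    | '>' -> GT
    | _ -> token_returning s pos
  end

(* Plays the role of the parser: keeps asking for tokens until EOF. *)
let collect next (s : string) : token list =
  let pos = ref 0 in
  let rec loop acc =
    match next s pos with
    | EOF -> List.rev (EOF :: acc)
    | t -> loop (t :: acc)
  in
  loop []

let () =
  assert (collect token_discarding "<a>" = [EOF]);
  assert (collect token_returning "<a>" = [LT; GT; EOF])
```

With the first style the debug output would still show every token being recognised, even though the consumer receives none of them.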
Parser.mly:
%token <string> IDENT
%token <string> STRING
%token WHITESPACE OPEN_PD CLOSE_PD GT LT CLOSE_NODE QUOTE EQUAL EOF
%start main
%type <string> main
%type <string> pd
%type <string> attribute
%type <string> attlist
%type <string> node_start
%type <string> node_end
%type <string> node_list
%type <string> root
%%
pd:
| OPEN_PD IDENT WHITESPACE attlist CLOSE_PD { "1" }
attribute:
| IDENT EQUAL QUOTE STRING QUOTE { "2" }
attlist:
| { "3" }
| attribute { "4" }
| attlist WHITESPACE attribute { "5" }
node_start:
| LT IDENT WHITESPACE attlist GT { "6" }
node_end:
| CLOSE_NODE IDENT GT { "7" }
node_list:
| { "8" }
| root { $1^"9" }
root:
| node_start node_list node_end { $1^$2^$3^& | |