I have a lexer and parser that I want to modify to make the last token
optional. Currently, the mly file looks like the following:
(***** begin mly file *****)
main:
stmtlist EOF
;
stmtlist:
| stmt stmtlist
;
stmt:
stmtend
| exp stmtend
(* many more rules *)
;
stmtend:
ENDTAG
;
(* many more rules *)
(***** end mly file *****)
For my setting, ENDTAG is not strictly necessary at the end of the
file. The grammar is pretty big, and refactoring it would be a lot of
work. I would really like to change the rules as follows:
(***** begin changes *****)
main:
stmtlist EOF1 EOF2
| stmtlist EOF2
;
stmtend:
ENDTAG
| EOF1
;
(***** end changes *****)
Making this change successfully would require that the lexer returns
two tokens on eof. Currently, the mll file has a line such as:
(***** begin mll example *****)
| eof
(***** end mll example *****)
I'm trying to replace it with something like:
(***** begin mll changes *****)
let hit_eof = ref false
(* skip lots of stuff *)
| eof {
    if !hit_eof
    then (EOF2)
    else (hit_eof := true; EOF1) }
(***** end mll changes *****)
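To make the intended behaviour concrete, here is a minimal, self-contained sketch of the token stream I am after, written without ocamllex. The token type, make_eof_action and make_lexer are hypothetical stand-ins for illustration only; in the real project the tokens come from the ocamlyacc-generated parser module and the lexer takes a Lexing.lexbuf.

```ocaml
(* Hypothetical token type; the real one is generated from the mly file. *)
type token = EXP of string | ENDTAG | EOF1 | EOF2

(* Same state machine as the eof action above: EOF1 on the first call,
   EOF2 on every later call. *)
let make_eof_action () =
  let hit_eof = ref false in
  fun () ->
    if !hit_eof then EOF2
    else begin
      hit_eof := true;
      EOF1
    end

(* Stand-in for the ocamllex entry point: hands out tokens from a list
   and falls back to the eof action once the list is exhausted. *)
let make_lexer tokens =
  let remaining = ref tokens in
  let on_eof = make_eof_action () in
  fun () ->
    match !remaining with
    | tok :: rest -> remaining := rest; tok
    | [] -> on_eof ()

let () =
  (* Input whose last statement is missing its ENDTAG. *)
  let next = make_lexer [EXP "x"; ENDTAG; EXP "y"] in
  let rec drain acc =
    match next () with
    | EOF2 -> List.rev (EOF2 :: acc)
    | tok -> drain (tok :: acc)
  in
  (* EOF1 closes the last statement, EOF2 ends the file. *)
  assert (drain [] = [EXP "x"; ENDTAG; EXP "y"; EOF1; EOF2])
```

With this stream, the changed grammar would see EOF1 as the stmtend of the last statement and EOF2 as the real end of input.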
However, when I compile the system with the mly and mll changes above
and run it on a file (my_file) that omits the final "ENDTAG", I get the
following error:
Fatal error: exception Failure("Parser error (lexing: empty token):
file "./my_file" line 144")
Can anyone tell me how to make the lexer (compiled with ocamllex)
return two tokens on eof? I read in another post that eof is just a
flag that can be read multiple times, but that doesn't seem to be the
case (or perhaps I did not understand it correctly). Thanks in advance.
--Gary