
Thread: "ocaml_beginners"::[] Reporting line number on parser errors




"ocaml_beginners"::[] Reporting line number on parser errors
2007-02-26 09:02:14

Folks,

How do you report an error from a parser so that it includes the line
number?

I have code like this in the "header" of my lexer.mll but I'm
wondering how I can make use of line number information from my parser.

let start = ref 1 and line = ref 1

let newline lexbuf =
  start := lexbuf.Lexing.lex_curr_p.Lexing.pos_cnum;
  incr line

let pos lexbuf = Lexing.lexeme_start lexbuf - !start

let report_error c lexbuf =
  failwith ("Invalid character: '" ^ String.make 1 c ^ "' " ^
            "at line " ^ string_of_int !line ^
            ", position " ^ string_of_int (pos lexbuf))

Thanks, Joel

--
http://wagerlabs.com/

Re: "ocaml_beginners"::[] Reporting line number on parser errors
2007-02-26 09:35:07

> How do you report an error from a parser so that it includes the line
> number?
>
> I have code like this in the "header" of my lexer.mll but I'm
> wondering how I can make use of line number information from my parser.
>

You define your own token type, so just add a line_number field to your tokens.
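
As a rough illustration of that suggestion, the sketch below attaches a line number to every token. The `token` type, its constructors, and `line_of_token` are hypothetical names invented for this example; in a real ocamlyacc grammar the token type is generated from the `%token` declarations.

```ocaml
(* Hypothetical token type: every constructor carries the line number
   on which the token was read. *)
type token =
  | IDENT of string * int   (* identifier text, line number *)
  | INT of int * int        (* integer value, line number *)
  | EOF of int              (* line number of the end of input *)

(* Recover the line number from any token, e.g. for an error message. *)
let line_of_token = function
  | IDENT (_, line) | INT (_, line) | EOF line -> line

let describe tok =
  Printf.sprintf "unexpected token at line %d" (line_of_token tok)
```

The lexer would then have to pass its current line number each time it builds a token.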

It seems that there are a lot of little things like this, that every
compiler writer needs to do, but which need to be done from scratch in
lex, yacc, and derivatives like ocamllex/ocamlyacc. Another big issue
is creating appropriate error messages when an expression "almost"
matches one of the known patterns or when an undefined symbol "almost"
matches a defined symbol. Have you run across any better compiler
generation tools or (30+ years after yacc was cutting edge) is it time
to create something new?

Re: "ocaml_beginners"::[] Reporting line number on parser errors
2007-02-26 13:06:25

On Mon, 26 Feb 2007, Joel Reymont wrote:

> Folks,
>
> How do you report an error from a parser so that it includes the line
> number?
>
> I have code like this in the "header" of my lexer.mll but I'm
> wondering how I can make use of line number information from my parser.
>
> let start = ref 1 and line = ref 1
>
> let newline lexbuf =
> start := lexbuf.Lexing.lex_curr_p.Lexing.pos_cnum;
> incr line
>
> let pos lexbuf = Lexing.lexeme_start lexbuf - !start
>
> let report_error c lexbuf =
> failwith ("Invalid character: '" ^ String.make 1 c ^ "'" ^
> "at line " ^ string_of_int !line ^
> ", position " ^ string_of_int (pos lexbuf))

Basically, your ocamllex lexer should keep track of the new lines by
setting the appropriate field in the lexbuf every time it encounters a new
line (by default the whole file is seen as only one big line). The other
thing is to read the location from the lexbuf and print an error message.

Also, you must set the file name in lexbuf before starting (and you may
change it during parsing if you encounter something like a #line
directive).

Look at the source-code of json-wheel
(http://martin.jambon.free.fr/json-wheel-1.0.2.tar.gz)
from which I extracted the functions that you need:

(* the record fields below come from the Lexing module *)
open Lexing

(* read token location from lexbuf *)
let loc lexbuf = (lexbuf.lex_start_p, lexbuf.lex_curr_p)

(* format location for display *)
let string_of_loc (pos1, pos2) =
  let line1 = pos1.pos_lnum
  and start1 = pos1.pos_bol in
  Printf.sprintf "File %S, line %i, characters %i-%i"
    pos1.pos_fname line1
    (pos1.pos_cnum - start1)
    (pos2.pos_cnum - start1)

(* count the new line *)
let newline lexbuf =
  let pos = lexbuf.lex_curr_p in
  lexbuf.lex_curr_p <- { pos with
                         pos_lnum = pos.pos_lnum + 1;
                         pos_bol = pos.pos_cnum }

(* set file name *)
let set_file_name lexbuf name =
  lexbuf.lex_curr_p <- { lexbuf.lex_curr_p with pos_fname = name }
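
To see how these helpers fit together, here is a small self-contained sketch. The helpers are repeated so the block compiles on its own. The `advance` function and the file name "demo.txt" are assumptions made for the demo: `advance` only imitates what an ocamllex-generated lexer does to the positions when it consumes characters.

```ocaml
open Lexing

let newline lexbuf =
  let pos = lexbuf.lex_curr_p in
  lexbuf.lex_curr_p <- { pos with
                         pos_lnum = pos.pos_lnum + 1;
                         pos_bol = pos.pos_cnum }

let set_file_name lexbuf name =
  lexbuf.lex_curr_p <- { lexbuf.lex_curr_p with pos_fname = name }

let string_of_loc (pos1, pos2) =
  Printf.sprintf "File %S, line %i, characters %i-%i"
    pos1.pos_fname pos1.pos_lnum
    (pos1.pos_cnum - pos1.pos_bol)
    (pos2.pos_cnum - pos1.pos_bol)

(* Hypothetical stand-in for the generated lexer: pretend that a token
   of [n] characters has just been matched. *)
let advance lexbuf n =
  lexbuf.lex_start_p <- lexbuf.lex_curr_p;
  lexbuf.lex_curr_p <-
    { lexbuf.lex_curr_p with pos_cnum = lexbuf.lex_curr_p.pos_cnum + n }

let demo_message =
  let lexbuf = from_string "ab\ncd?" in
  set_file_name lexbuf "demo.txt";
  advance lexbuf 2;                  (* "ab" *)
  advance lexbuf 1; newline lexbuf;  (* "\n", then count the new line *)
  advance lexbuf 2;                  (* "cd" *)
  advance lexbuf 1;                  (* "?", the offending character *)
  string_of_loc (lexbuf.lex_start_p, lexbuf.lex_curr_p)
```

Here `demo_message` is `File "demo.txt", line 2, characters 2-3`, because `newline` moved `pos_bol` to the offset just after the line break.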

Martin

--
Martin Jambon
http://martin.jambon.free.fr

Re: "ocaml_beginners"::[] Reporting line number on parser errors
2007-02-26 16:02:40


On Feb 26, 2007, at 7:06 PM, Martin Jambon wrote:

> Look at the source-code of json-wheel
> (http://martin.jambon.free.fr/json-wheel-1.0.2.tar.gz)
> from which I extracted the functions that you need:

Martin's project is great to study and reports error location both in
the lexer and the parser.

Joel

--
http://wagerlabs.com/

Re: "ocaml_beginners"::[] Reporting line number on parser errors
2007-02-26 15:50:48


On Feb 26, 2007, at 7:06 PM, Martin Jambon wrote:

> Basically, your ocamllex lexer should keep track of the new lines by
> setting the appropriate field in the lexbuf every time it
> encounters a new
> line (by default the whole file is seen as only one big line). The
> other
> thing is to read the location from the lexbuf and print an error
> message.

How do I make use of the location functions from the parser, though?

Do I create a lexer.mli and put loc and string_of_loc signatures
there together with some function that exports Lexer.lexbuf?

> (* count the new line *)
> let newline lexbuf =
> let pos = lexbuf.lex_curr_p in
> lexbuf.lex_curr_p <- { pos with
> pos_lnum = pos.pos_lnum + 1;
> pos_bol = pos.pos_cnum }

I'll have to read up more on ocamllex but I wonder if pos_lnum and
pos_bol are documented. Do these fields exist specifically for me to
make use of them?

Thanks, Joel

--
http://wagerlabs.com/

Re: "ocaml_beginners"::[] Reporting line number on parser errors
2007-02-26 16:29:40

On Mon, 26 Feb 2007, Joel Reymont wrote:

> On Feb 26, 2007, at 7:06 PM, Martin Jambon wrote:
>
> > Basically, your ocamllex lexer should keep track of the new lines by
> > setting the appropriate field in the lexbuf every time it
> > encounters a new
> > line (by default the whole file is seen as only one big line). The
> > other
> > thing is to read the location from the lexbuf and print an error
> > message.
>
> How do I make use of the location functions from the parser, though?

Yes, I forgot about this. It's the rhs_loc function in json_parser.mly (I
took it directly from the parser for the OCaml compiler):

let rhs_loc n = (Parsing.rhs_start_pos n, Parsing.rhs_end_pos n)

where n is the number of the token in the grammar rule (starting from 1).
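
A complementary option is to report the position from the code that calls the parser, since ocamlyacc-generated parsers raise `Parsing.Parse_error` on a syntax error. The following is only a sketch: `parse_with_location`, `failing_parser`, and the positions set by hand are made up for the demo, and `parse` stands for your real parser entry point already applied to the lexer function.

```ocaml
open Lexing

(* Hypothetical wrapper: run [parse] on [lexbuf] and turn a syntax error
   into a message that includes the start position of the last token read. *)
let parse_with_location parse lexbuf =
  try Ok (parse lexbuf)
  with Parsing.Parse_error ->
    let p = lexbuf.lex_start_p in
    Error (Printf.sprintf "File %S, line %i, character %i: syntax error"
             p.pos_fname p.pos_lnum (p.pos_cnum - p.pos_bol))

(* Stand-in parser that always fails, so the sketch runs without a grammar. *)
let failing_parser (_ : lexbuf) : int = raise Parsing.Parse_error

let demo_result =
  let lexbuf = from_string "dummy input" in
  (* Positions set by hand here; a real lexer would maintain them. *)
  lexbuf.lex_start_p <-
    { pos_fname = "demo.txt"; pos_lnum = 3; pos_bol = 10; pos_cnum = 14 };
  parse_with_location failing_parser lexbuf
```

This only gives useful line numbers if the lexer updates `pos_lnum` and `pos_bol` on each new line, as described earlier in the thread.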

Note also the "error" pseudo-terminal that is used to create rules for
common syntax errors.

Camlp4 parsers that use pa_extend are better at that since they
produce sensible error messages by themselves, telling you what was
expected. But Camlp4 has its own set of difficulties.

Maybe this is where Menhir is better than ocamlyacc, I don't
know, I haven't tried. http://cristal.inria.fr/~fpottier/menhir/

> Do I create a lexer.mli and put loc and string_of_loc signatures
> there together with some function that exports Lexer.lexbuf?
>
> > (* count the new line *)
> > let newline lexbuf =
> > let pos = lexbuf.lex_curr_p in
> > lexbuf.lex_curr_p <- { pos with
> > pos_lnum = pos.pos_lnum + 1;
> > pos_bol = pos.pos_cnum }
>
> I'll have to read up more on ocamllex but I wonder if pos_lnum and
> pos_bol are documented. Do these fields exist specifically for me to
> make use of them?

You must read the documentation for the Lexing module of the standard
library.
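
For orientation, a `Lexing.position` record has four fields: `pos_fname` (file name), `pos_lnum` (line number), `pos_bol` (offset of the beginning of the current line) and `pos_cnum` (offset of the current character). Later versions of the standard library also provide `Lexing.new_line`, which performs the same update as the `newline` function quoted above. A minimal sketch, where the input string and the hand-set offset are assumptions for the demo:

```ocaml
open Lexing

let demo_pos =
  let lexbuf = from_string "a\nb" in
  (* Pretend the lexer has just consumed "a\n" (2 characters). *)
  lexbuf.lex_curr_p <- { lexbuf.lex_curr_p with pos_cnum = 2 };
  (* Standard-library equivalent of the hand-written newline function. *)
  new_line lexbuf;
  lexbuf.lex_curr_p
```

After the call, `demo_pos.pos_lnum` is 2 and `demo_pos.pos_bol` equals `demo_pos.pos_cnum`, so column numbers on the next line start from 0.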

Martin

--
Martin Jambon
http://martin.jambon.free.fr
