Thread: "ocaml_beginners"::[] Byte-code fine; native-code segfaults
|
|
| "ocaml_beginners"::[]
Byte-code fine; native-code segfaults |
  United States |
2007-08-16 17:44:04 |
|
Hey,
When a programme compiled to native code with OCaml 3.09.3
segfaults at runtime, is there any way at all to make a
diagnostic of what caused the problem?
I have a small programme that uses the PG'OCaml library.
This programme is just stress-testing the library: it runs
an infinite loop, performing the same SELECT statement
over and over. Now, the byte-code compiled version is
the archetypical Duracell Bunny: it runs indefinitely
without any problems. The native-code version of the
same programme, however, crashes with a segmentation fault
after some 20 minutes.
I know that OCaml 3.10 supports stack traces of native
code, but I cannot use 3.10 because PG'OCaml relies on
the old Camlp4 syntax. I am thus stuck with 3.09.3.
The solutions I can think of are as follows:
a) Find a way to compile the old Camlp4 syntax with 3.10;
b) Find a way to debug native-code with 3.09;
c) Convince Rich to port PG'OCaml to 3.10. 
Any ideas?
Thanks for your attention,
C.S.
|
| Re: "ocaml_beginners"::[]
Byte-code fine; native-code segfaults |
  Germany |
2007-08-16 18:18:39 |
|
Quoting cultural_sublimation <cultural_sublimation@yahoo.com>:
> Hey,
>
> When a programme compiled to native code with OCaml 3.09.3
> segfaults at runtime, is there any way at all to make a
> diagnostic of what caused the problem?
Well, since it is native code, you could let the system write
core dumps (ulimit -c <max_blocks>) and use gdb to look at them.
You should also be able to run the program inside gdb
(started from gdb).
But I'm not sure how far gdb can really be used here.
At least gprof can be used for time profiling of OCaml
native code, so maybe there is some kind of support for gdb, too?
You could also use a system-call tracer, like strace on
Linux, to get some information about it.
If you are running on Solaris, dtrace could possibly help.
If external libraries like efence could somehow be used
(maybe by adding self-written external C code
that inserts efence hooks into the OCaml code),
that might be a way, too?
(But efence is normally used together with gdb...)
>
> I have a small programme that uses the PG'OCaml library.
> This programme is just stress-testing the library: it runs
> an infinite loop, performing the same SELECT statement
> over and over. Now, the byte-code compiled version is
> the archetypical Duracell Bunny: it runs indefinitely
"Indefinitely" sounds rather undefined.
I think you meant "infinitely"?!
> without any problems. The native-code version of the
> same programme, however, crashes with a segmentation fault
> after some 20 minutes.
A very long time... :(
Did you use any self-written external C code?
That could cause problems if the C code is unstable.
>
> I know that OCaml 3.10 supports stack traces of native
> code, but I cannot use 3.10 because PG'OCaml relies on
> the old Camlp4 syntax. I am thus stuck with 3.09.3.
> The solutions I can think of are as follows:
>
> a) Find a way to compile the old Camlp4 syntax with 3.10;
> b) Find a way to debug native-code with 3.09;
> c) Convince Rich to port PG'OCaml to 3.10. 
[...]
Sorry, I don't want to offend the OCaml core team (the gurus),
but from what I have read in the mails on this list,
OCaml 3.10.0 also has some stability problems?!
Ciao,
Oliver
|
| Re: "ocaml_beginners"::[]
Byte-code fine; native-code segfaults |
  United States |
2007-08-17 01:54:20 |
|
On Aug 16, 2007, at 6:44 PM, cultural_sublimation wrote:
> Now, the byte-code compiled version is
> the archetypical Duracell Bunny: it runs indefinitely
> without any problems. The native-code version of the
> same programme, however, crashes with a segmentation fault
> after some 20 minutes.
Does your infinite loop use recursion? If so, are you sure it's tail
recursion? In my experience if it segfaults in native code it's
usually a stack overflow. If the bytecode runs "forever", are you
sure that it's getting as far along as the native code, i.e.,
bytecode might need to run for 40 minutes to crash with a
Stack_overflow exception as opposed to 20 minutes for native to crash
with a segfault?
Just a thought,
--Jonathan Bryant
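
To illustrate the tail-recursion question above, here is a minimal sketch that is not taken from the original programme. It contrasts a loop whose recursive call is not in tail position with one whose call is. The function names count_up_naive and count_up are hypothetical.

```ocaml
(* Not tail-recursive: the addition happens after the recursive call
   returns, so every iteration keeps a stack frame alive. *)
let rec count_up_naive n =
  if n = 0 then 0 else 1 + count_up_naive (n - 1)

(* Tail-recursive: the recursive call is the last thing the function
   does, so the current stack frame can be reused. *)
let count_up n =
  let rec go acc n = if n = 0 then acc else go (acc + 1) (n - 1) in
  go 0 n

let () =
  (* For small inputs both versions give the same answer. *)
  assert (count_up_naive 1_000 = count_up 1_000);
  (* Only the tail-recursive version is safe for very deep recursion. *)
  Printf.printf "%d\n" (count_up 10_000_000)
```

If a long-running loop is written like count_up_naive, its stack use grows with the number of iterations, which is the situation Jonathan describes.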
|
| Re: "ocaml_beginners"::[]
Byte-code fine; native-code segfaults |
  United States |
2007-08-17 02:11:05 |
|
On Friday 17 August 2007 01:54:20 am Jonathan Bryant wrote:
> Does your infinite loop use recursion? If so, are you sure it's tail
> recursion? In my experience if it segfaults in native code it's
> usually a stack overflow. If the bytecode runs "forever", are you
> sure that it's getting as far along as the native code, i.e.,
> bytecode might need to run for 40 minutes to crash with a
> Stack_overflow exception as opposed to 20 minutes for native to crash
> with a segfault?
Also, remember that many list functions, such as List.map, are not
tail-recursive, so if you are building a very large list and then mapping
over it you could get a stack overflow.
-Dave
|
| Re: "ocaml_beginners"::[]
Byte-code fine; native-code segfaults |
  United Kingdom |
2007-08-17 03:24:50 |
|
On Fri, Aug 17, 2007 at 01:18:39AM +0200, Oliver Bandel wrote:
> Well, if it is native code, you could let the system write
> coredumps (ulimit -c <max_blocks> ) and use gdb to look at them.
See also: http://et.redhat.com/~rjones/xen-stress-tests/
> You also should be able to run the program inside gdb
> (invoked by gdb).
>
> But I'm not sure how far gdb really can be used here.
gdb works quite well. Of course you won't see OCaml code itself, but
you will see function names which is enough to get a good idea of
where the program is crashing.
> Did you used any self-written external C-stuff?
> This could cause problems, if you have written unstable C-code.
PG'OCaml itself is pure OCaml. It doesn't even use external C
libraries (it avoids libpq for example).
Rich.
--
Richard Jones
Red Hat
|
| "ocaml_beginners"::[] Re:
Byte-code fine; native-code segfaults |
  United States |
2007-08-17 12:08:12 |
|
Hi,
Thank you guys for all the replies! I suspect the culprit
could indeed be the List.map overflowing the stack,
though I wonder why it doesn't happen immediately: only
after some 10-20 minutes of running. Moreover, though
the byte-code version is about 4x slower, I have left it
running for more than a day, and it has never crashed.
Anyway, here's what the code looks like: (simplified)

    class database = object (self)
      val conn = PGOCaml.connect ()

      method get_stuff () =
        let stuff = PGSQL(conn) "SELECT ..." in
        List.map record_of_tuple stuff
    end

    let db = new database in
    while true do
      let stuff = db#get_stuff () in
      Printf.printf "Got %d results\n%!" (List.length stuff)
    done

As you can see, all the main programme does is to
loop forever, invoking the get_stuff method each time.
The function record_of_tuple converts a tuple returned by
the database layer into a proper record. In the tests I am
running, the length of the list is always 500. Would that
be enough to cause a stack overflow with List.map?
Incidentally, if List.map is potentially dangerous, what
are the recommended alternatives?
Thanks again for your help,
C.S.
|
| "ocaml_beginners"::[] Re:
Byte-code fine; native-code segfaults |
  United States |
2007-08-17 12:19:49 |
|
Hi again,
Incidentally, is there any way to monitor the stack space of a running
programme? The only change I've made was to add a counter to keep
track of the number of cycles the programme went through before
crashing. All I can tell you now is that the number varies on each
execution.
Cheers,
C.S.
|
| Re: "ocaml_beginners"::[] Re:
Byte-code fine; native-code segfaults |
  United Kingdom |
2007-08-17 14:33:43 |
|
On Fri, Aug 17, 2007 at 05:08:12PM -0000, cultural_sublimation wrote:
> Hi,
>
> Thank you guys for all the replies! I suspect the culprit
> could be indeed the List.map overflowing the stack,
> though I wonder why it doesn't happen immediately: only
> after some 10-20 minutes of running. Moreover, though
> the byte-code version is about 4x slower, I have left it
> running for more than a day, and it has never crashed.
> Anyway, here's what the code looks like: (simplified)
>
>
> class database = object (self)
>
> val conn = PGOCaml.connect ()
>
> method get_stuff () =
> let stuff = PGSQL(conn) "SELECT ..." in
> List.map record_of_tuple stuff
This use of List.map should be safe.
However, are you closing the connection? If not, you'll run out of
file descriptors. They are not closed any other way.
> the database layer into a proper record. In the tests I am
> running, the length of the list is always 500. Would that
> be enough to cause a stack overflow with List.map?
No.
> Incidentally, if List.map is potentially dangerous, what
> are the recommended alternatives?
Extlib has a tail-rec List.map.
Rich.
--
Richard Jones
Red Hat
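
One way to make sure a connection (or any other file descriptor) is always closed is to wrap its use in a helper that releases it even when an exception is raised. The sketch below is generic and uses only the standard library. The name with_resource is hypothetical. With PG'OCaml, the acquire function would be PGOCaml.connect and the release function would be whatever close function the library provides, which is an assumption here and is not shown in the thread.

```ocaml
(* Run [f] on a freshly acquired resource and always release it,
   whether [f] returns normally or raises. *)
let with_resource ~acquire ~release f =
  let r = acquire () in
  match f r with
  | result -> release r; result
  | exception e -> release r; raise e

(* Example with an ordinary file: the channel is closed in every case. *)
let first_line path =
  with_resource
    ~acquire:(fun () -> open_in path)
    ~release:close_in
    input_line
```

This only matters when resources are acquired repeatedly. A programme that opens a single connection at startup and keeps it for its whole lifetime does not leak descriptors this way.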
|
| Re: "ocaml_beginners"::[] Re:
Byte-code fine; native-code segfaults |
  United Kingdom |
2007-08-17 14:34:29 |
|
On Fri, Aug 17, 2007 at 05:19:49PM -0000, cultural_sublimation wrote:
> Hi again,
>
> Incidentally, is there any way to monitor the stack space of a running
> programme? The only change I've made was to add a counter to keep
Take a look at /proc/PID/maps (where PID = Unix.getpid ())
Rich.
--
Richard Jones
Red Hat
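
The /proc/PID/maps suggestion can be sketched in OCaml as follows. This assumes Linux, where the main thread's stack appears in the maps file as a line ending in [stack]. The helper names are made up for illustration, and /proc/self/maps is read instead of building the path from Unix.getpid () so that only the standard library is needed.

```ocaml
(* Return true if [s] ends with [suffix]. *)
let ends_with ~suffix s =
  let n = String.length s and m = String.length suffix in
  n >= m && String.sub s (n - m) m = suffix

(* Size in bytes of the mapping described by one line of a maps file,
   e.g. "7ffd1000-7ffd3000 rw-p 00000000 00:00 0   [stack]". *)
let mapping_size line =
  Scanf.sscanf line "%x-%x" (fun lo hi -> hi - lo)

(* Find the [stack] line among the lines of a maps file and return the
   size of that mapping, or None if there is no such line. *)
let stack_size_of_lines lines =
  match List.find_opt (ends_with ~suffix:"[stack]") lines with
  | Some line -> Some (mapping_size line)
  | None -> None

(* Read all lines of a text file into a list, in order. *)
let read_lines path =
  let ic = open_in path in
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> close_in ic; List.rev acc
  in
  loop []

(* Usage on Linux:
   match stack_size_of_lines (read_lines "/proc/self/maps") with
   | Some n -> Printf.printf "stack mapping: %d bytes\n%!" n
   | None -> print_endline "no [stack] mapping found" *)
```

Calling this once per loop iteration and printing the result would show whether the stack mapping keeps growing before the crash. Note that it reports the size of the region currently mapped for the stack, not the stack limit.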
|
| Re: "ocaml_beginners"::[] Re:
Byte-code fine; native-code segfaults |
  Germany |
2007-08-17 15:09:30 |
|
Quoting Richard Jones <rich@annexia.org>:
> On Fri, Aug 17, 2007 at 05:08:12PM -0000, cultural_sublimation wrote:
> > Hi,
> >
> > Thank you guys for all the replies! I suspect the culprit
> > could be indeed the List.map overflowing the stack,
> > though I wonder why it doesn't happen immediately: only
> > after some 10-20 minutes of running. Moreover, though
> > the byte-code version is about 4x slower, I have left it
> > running for more than a day, and it has never crashed.
> > Anyway, here's what the code looks like: (simplified)
> >
> >
> > class database = object (self)
> >
> > val conn = PGOCaml.connect ()
> >
> > method get_stuff () =
> > let stuff = PGSQL(conn) "SELECT ..." in
> > List.map record_of_tuple stuff
>
> This use of List.map should be safe.
>
> However are you closing the connection? If not then you'll run out of
> file descriptors. They are not closed any other way.
That was also my first thought (but this time I stopped my mail early enough
and thought it over again before sending it).
But it seems the file descriptor is created only once,
when conn is defined (which should be at startup?)!
And: when the system runs out of file descriptors, this should
raise an appropriate exception, but never crash the process with a
SIGSEGV.
>
> > the database layer into a proper record. In the tests I am
> > running, the length of the list is always 500. Would that
> > be enough to cause a stack overflow with List.map?
>
> No.
>
> > Incidentally, if List.map is potentially dangerous, what
> > are the recommended alternatives?
>
> Extlib has a tail-rec List.map.
And the stdlib has List.rev_map.
Ciao,
Oliver
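
As a minimal sketch of the List.rev_map alternative mentioned above: List.rev_map is tail-recursive but returns its results in reverse order, so a following List.rev restores the original order. The name map_tailrec is hypothetical.

```ocaml
(* Tail-recursive map: List.rev_map builds the result in reverse using
   constant stack space, and List.rev puts it back in order. *)
let map_tailrec f l = List.rev (List.rev_map f l)

let () =
  (* Same result as List.map on a 500-element list, as in the thread. *)
  let small = List.init 500 (fun i -> i) in
  assert (map_tailrec succ small = List.map succ small);
  (* Also fine on a list long enough to be risky for a map that is
     not tail-recursive. *)
  let big = List.init 1_000_000 (fun i -> i) in
  Printf.printf "%d\n" (List.length (map_tailrec succ big))
```

The price is a second pass over the list and one extra intermediate list, which is usually negligible next to a database round trip.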
|
| "ocaml_beginners"::[] Re:
Byte-code fine; native-code segfaults |
  United States |
2007-08-28 15:44:46 |
|
Hi,
I am kind of stuck with this problem. Since the new OCaml 3.10 has
support for debugging native code, it has occurred to me that perhaps
I should focus on getting the code to compile with 3.10. The problem
is that PG'OCaml uses the old camlp4 syntax, and therefore will not
compile with 3.10. However, after some googling, I came across Camlp5:
http://pauillac.inria.fr/~ddr/camlp5/
The description says that "It is the continuation of the classical camlp4
with new features. It is compatible with OCaml versions from 3.08.1 to
3.11 included". Does this mean I can simply take code that relies on the
old camlp4 syntax, run it through camlp5, and it will compile under 3.10?
Thanks for your help!
C.S.
|
| Re: "ocaml_beginners"::[] Re:
Byte-code fine; native-code segfaults |
  United Kingdom |
2007-08-28 17:08:38 |
|
On Tue, Aug 28, 2007 at 08:44:46PM -0000, cultural_sublimation wrote:
> Hi,
>
> I am kind of stuck with this problem. Since the new OCaml 3.10 has
> support for debugging native code, it has occurred to me that perhaps
> I should focus on getting the code to compile with 3.10. The problem
> is that PG'OCaml uses the old camlp4 syntax, and therefore will not
> compile with 3.10. However, after some googling, I came across Camlp5:
> http://pauillac.inria.fr/~ddr/camlp5/
>
> The description says that "It is the continuation of the classical camlp4
> with new features. It is compatible with OCaml versions from 3.08.1 to
> 3.11 included". Does this mean I can simply take code that relies on the
> old camlp4 syntax, run it through camlp5, and it will compile under 3.10?
Yes, with some reservations.
For example, we have bundled camlp5 with Fedora (and so has Debian).
However, I've still found it difficult to compile some "classical"
camlp4 programs, in my case ast-analyze, which is needed by
ocaml-gettext. So AFAICS it seems that even camlp5 is a bit
different. You'll also need to modify your makefiles / ocamlfind
directives so that they refer to camlp5* instead of camlp4*.
Rich.
--
Richard Jones
Red Hat
|