Thread: "thread_get_state failed" + abort in PyLucene




"thread_get_state failed" + abort in PyLucene
2007-05-24 19:21:28
Hi.

I'm working on an application that uses Python 2.5, PyLucene 2.1.0-2,
and Twisted trunk (revision 20059), running under Mac OS X 10.4.9.

The Twisted server is mysteriously and silently dying, in a fairly
predictable way.  When I stick in debug prints to stderr, I can see that
(usually) a call to PyLucene.IndexWriter.optimize and (sometimes) a call
to PyLucene.IndexReader.open never returns.

Following a suggestion from Glyph on the twisted.web mailing list, I used
ktrace to see if any system calls were going astray. It doesn't look like
it. But the end of the ktrace (right before the process dies) reveals:

  7826 Python   CALL  write(0x2,0xbfff971f,0x18)
  7826 Python   GIO   fd 2 wrote 24 bytes
       "thread_get_state failed
       "
  7826 Python   RET   write 24/0x18
  7826 Python   CALL  sigprocmask(0x3,0xbfff9b08,0)
  7826 Python   RET   sigprocmask 0
  7826 Python   CALL  kill(0x1e92,0x6)
  7826 Python   RET   kill 0
  7826 Python   PSIG  SIGABRT SIG_DFL
  7826 Python   NAMI  "/cores/core.7826"

Nothing fails before that.  The "thread_get_state failed\n" message does
not appear in the Twisted server log. The kill is an abort sent to this
process itself (i.e., 0x1e92 = 7826). So something (but not a system
call) has gone wrong and the code has called abort, which explains why
the server abruptly dies.
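
The numbers in the trace can be double-checked with a few lines of
Python. This is only a sketch: the constants are copied by hand from the
ktrace output above, the variable names are illustrative, and the signal
comparison assumes a POSIX platform where SIGABRT is 6.

```python
import signal

# Values copied from the ktrace lines: write(0x2,...,0x18) and kill(0x1e92,0x6)
write_length = int("0x18", 16)
target_pid = int("0x1e92", 16)
signum = int("0x6", 16)

# Every ktrace line above is prefixed with the pid of the traced process.
traced_pid = 7826

# The message written to fd 2, including its trailing newline.
message = "thread_get_state failed\n"

print(len(message) == write_length)   # the write covers the whole message
print(target_pid == traced_pid)       # the process signalled itself
print(signum == signal.SIGABRT)       # the signal sent is SIGABRT
```

All three comparisons hold, which is consistent with the process calling
abort() on itself right after printing the message.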

Running gdb python /cores/core.7826 provides no useful info:

  (gdb) where
  #0  0x00000000 in _mh_dylib_header ()

From the kdump output, the process is clearly in the middle of doing
PyLucene things (there are a bunch of access calls to files that are in
my PyLucene index directories).

Grepping for 'thread_get_state failed' in Python, Twisted, and PyLucene
gets me just one hit:

  $ grep -i 'thread_get_state failed'  /usr/local/lib/*
  Binary file /usr/local/lib/libgcj.6.dylib matches

Which of course is a GCJ library file distributed with the Mac OS X
binary version of PyLucene.
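
The same search can be sketched in Python for cases where grep is not
handy. The helper name files_containing is hypothetical, and the sketch
reads each file fully into memory, which is fine for a quick check but
not for very large libraries.

```python
import os


def files_containing(directory, needle):
    """Return names of regular files in `directory` whose raw bytes
    contain `needle`, ignoring ASCII case (like grep -i)."""
    needle = needle.lower()
    hits = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and other non-regular entries
        try:
            with open(path, "rb") as handle:
                data = handle.read()
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        if needle in data.lower():
            hits.append(name)
    return hits
```

For example, files_containing("/usr/local/lib", b"thread_get_state failed")
would be the rough equivalent of the grep command above.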

All of which led me here.  The above kdump output comes from a failed
call to PyLucene.IndexWriter.optimize. I don't _think_ I'm doing anything
wrong in my Python - the code in question gets run successfully multiple
times before it eventually falls over. I'm not using threads.

I'd be happy to get high-level feedback on this. E.g., Is this a known
issue? Should I go read the Lucene or PyLucene sources to track it down?
Would it help if I built PyLucene from scratch? Can I help to fix this?
Etc.

Thanks for any help,
Terry
_______________________________________________
pylucene-dev mailing list
pylucene-devosafoundation.org
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev

Re: "thread_get_state failed" + abort in PyLucene
2007-05-24 20:03:14
Hi Terry. I am curious which reactor you are using for this. Any threads
need to be Java threads with PyLucene. Not sure if this might work, but
perhaps creating a thread pool with the right kind of threads could
help. Somewhere on this list a while back, someone put together some
thread pool code that you might find helpful for this. Twisted also has
its own thread pool, so this would need some modification. Have you
tried testing your index outside of Twisted to see whether the issue
persists?

Regards,
David


Re: "thread_get_state failed" + abort in PyLucene
2007-05-24 20:17:00
>>>>> "David" == David Pratt <fairwindseastlink.ca> writes:
David> Hi Terry. I am curious which reactor are you using for this. Any
David> threads need to be java threads with pylucene. Not sure if this
David> might work but perhaps creating a thread pool with the right kind of
David> threads could help. Somewhere on this list a while back, someone put
David> together some thread pool code that you might find helpful for
David> this. Twisted also has its own thread pool so this would need some
David> modification. Have you tried testing your index outside of twisted
David> to see whether the issue persists.

Hi David

I'm using poll, though not for any particular reason. Do you recommend I
try with something else? (I'll try epoll and select and see if they
help).

I'm not using any threads in my code. Are you suggesting I create an
appropriate thread pool and then get Twisted to use it?

Testing outside of Twisted seems fine. I can't make it fail, even when I
call the same functions eventually called by the web2 interface, with
the same XML payload it receives, etc. I should have mentioned that too.

One other thing to mention is that the crash doesn't seem to happen when
I run twistd with -n (no daemon) option. JP Calderone on the Twisted
list suggested that PyLucene might be somehow thrown off by the initial
fork of twistd. That doesn't seem to be the case though, as I rewrote
things to defer the PyLucene import (using __import__) until after
twistd has started. So I'm not sure what to make of the crash only
happening when twistd runs as a daemon.
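
Deferring the import can be sketched roughly as follows. This is not the
code from the application: lazy_module is a hypothetical helper, and it
uses importlib.import_module where the description above uses
__import__ directly.

```python
import importlib


def lazy_module(name):
    """Return a function that imports `name` the first time it is called
    and hands back the same module object on every later call."""
    state = {}

    def load():
        if "module" not in state:
            # The import only happens here, i.e. on first use.
            state["module"] = importlib.import_module(name)
        return state["module"]

    return load


# Creating the loader does not import anything yet.
get_lucene = lazy_module("PyLucene")
```

Calling get_lucene() from a request handler, rather than at module top
level, means the import runs only once the server is already up.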

Thanks for your help.

Terry

Re: "thread_get_state failed" + abort in PyLucene
2007-05-24 21:41:37
On Fri, 25 May 2007, Terry Jones wrote:

> One other thing to mention is that the crash doesn't seem to happen when I
> run twistd with -n (no daemon) option. JP Calderone on the Twisted list
> suggested that PyLucene might be somehow thrown off by the initial fork of
> twistd. That doesn't seem to be the case though, as I rewrote things to
> defer the PyLucene import (using __import__) until after twistd has
> started. So I'm not sure what to make of the crash only happening when
> twistd runs as a daemon.

I remember someone asking the same question on the java@gcc.gnu.org
list. Running a libgcj process after forking it is not really supported.
See: http://gcc.gnu.org/ml/java/2007-03/msg00136.html
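
One way to make this kind of problem show up early is to remember which
process initialised the library and to check that before each call. The
sketch below is an assumption-laden illustration: ForkGuard is a
hypothetical class, not part of PyLucene or libgcj, and it only detects
a pid change; it does not make a forked runtime usable.

```python
import os


class ForkGuard:
    """Remember the pid that created the guard and refuse use from any
    other process, e.g. from a child created by a later fork."""

    def __init__(self, getpid=os.getpid):
        # getpid is injectable so the check can be exercised without forking.
        self._getpid = getpid
        self._owner_pid = getpid()

    def check(self):
        current = self._getpid()
        if current != self._owner_pid:
            raise RuntimeError(
                "initialised in pid %d but used in pid %d (after a fork?)"
                % (self._owner_pid, current)
            )
```

Creating the guard right after the import and calling guard.check()
before each indexing call would turn a silent abort into a Python
exception that names both pids.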

Andi..

Re: "thread_get_state failed" + abort in PyLucene
2007-05-24 22:11:08
Hi Terry. You might want to google nxlucene for hints. I know they are
using a Twisted app successfully for Lucene search with PyLucene. Their
repository is open, so you can view the code. The search server is
XML-RPC and returns RSS - hth. I'd like to know how this all turns out.
Can you post back to the list if you have success? Many thanks.

Regards,
David

