Thread: "thread_get_state failed" + abort in PyLucene

"thread_get_state failed" + abort in PyLucene
2007-05-24 19:21:28

Hi.

I'm working on an application that uses Python 2.5, PyLucene 2.1.0-2 and
Twisted trunk (revision 20059), running under Mac OS X 10.4.9.

The Twisted server is mysteriously and silently dying, in a fairly
predictable way. When I stick in debug prints to stderr, I can see that
(usually) a call to PyLucene.IndexWriter.optimize and (sometimes) a call to
PyLucene.IndexReader.open never returns.

Following a suggestion from Glyph on the twisted.web mailing list, I used
ktrace to see if any system calls were going astray. It doesn't look like
it. But the end of the ktrace (right before the process dies) reveals:

7826 Python CALL write(0x2,0xbfff971f,0x18)
7826 Python GIO fd 2 wrote 24 bytes
"thread_get_state failed
"
7826 Python RET write 24/0x18
7826 Python CALL sigprocmask(0x3,0xbfff9b08,0)
7826 Python RET sigprocmask 0
7826 Python CALL kill(0x1e92,0x6)
7826 Python RET kill 0
7826 Python PSIG SIGABRT SIG_DFL
7826 Python NAMI "/cores/core.7826"

Nothing fails before that. The "thread_get_state failed\n" message does
not appear in the Twisted server log. The kill is an abort sent to this
same process (i.e., 0x1e92 = 7826, and signal 0x6 is SIGABRT). So something
(but not a system call) has gone wrong and the code has called abort, which
explains why the server abruptly dies.
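
The hex values in the trace can be checked with a few lines of Python. This is only a small sketch for reading the kdump output: the helper name decode_kill is made up for illustration, and it is written for current Python 3 rather than the Python 2.5 used in this thread.

```python
import signal


def decode_kill(pid_hex, sig_hex):
    """Convert the hex arguments of a traced kill() call to integers."""
    return int(pid_hex, 16), int(sig_hex, 16)


pid, signum = decode_kill("0x1e92", "0x6")
print("target pid:", pid)
print("signal number:", signum)
print("is SIGABRT on this platform:", signum == signal.SIGABRT)

# The write() call reported 0x18 bytes; compare with the message length.
message = "thread_get_state failed\n"
print("bytes written:", int("0x18", 16), "message length:", len(message))
```

If the decoded pid equals the pid in the first column of the trace, the process signalled itself, which is what abort() does.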

Running gdb python /cores/core.7826 provides no useful info:

(gdb) where
#0 0x00000000 in _mh_dylib_header ()

From the kdump output, the process is clearly in the middle of doing
PyLucene things (there are a bunch of access calls to files that are in my
PyLucene index directories).

Grepping for 'thread_get_state failed' in Python, Twisted, and PyLucene
gets me just one hit:

$ grep -i 'thread_get_state failed' /usr/local/lib/*
Binary file /usr/local/lib/libgcj.6.dylib matches

Which of course is a GCJ library file distributed with the Mac OS X binary
version of PyLucene.
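
For anyone without grep to hand, the same search can be sketched in Python. This is a rough equivalent of the grep -i above, not the command actually used: the function name files_containing is hypothetical, it reads each file fully into memory, and it targets current Python 3.

```python
import os


def files_containing(directory, needle):
    """Return sorted names of regular files in directory whose bytes contain needle."""
    wanted = needle.lower()
    matches = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # like grep without -r, skip subdirectories
        try:
            with open(path, "rb") as handle:
                data = handle.read()
        except OSError:
            continue  # unreadable file: skip it
        if wanted in data.lower():  # case-insensitive, like grep -i
            matches.append(name)
    return matches
```

Calling files_containing("/usr/local/lib", b"thread_get_state failed") would then list the libraries that embed the message.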

All of which led me here. The above kdump output comes from a failed call
to PyLucene.IndexWriter.optimize. I don't _think_ I'm doing anything wrong
in my Python - the code in question gets run successfully multiple times
before it eventually falls over. I'm not using threads.

I'd be happy to get high-level feedback on this. E.g., is this a known
issue? Should I go read the Lucene or PyLucene sources to track it down?
Would it help if I built PyLucene from scratch? Can I help to fix this? Etc.

Thanks for any help,
Terry
_______________________________________________
pylucene-dev mailing list
pylucene-dev osafoundation.org
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev

Re: "thread_get_state failed" + abort in PyLucene
Canada
2007-05-24 20:03:14

Hi Terry. I am curious which reactor you are using for this. Any threads
need to be Java threads with PyLucene. Not sure if this might work, but
perhaps creating a thread pool with the right kind of threads could
help. Somewhere on this list a while back, someone put together some
thread pool code that you might find helpful for this. Twisted also has
its own thread pool, so this would need some modification. Have you tried
testing your index outside of Twisted to see whether the issue persists?
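
A minimal sketch of the "thread pool with the right kind of threads" idea follows. It is not the pool code posted earlier on this list and not Twisted's pool: TinyThreadPool is a hypothetical helper whose thread class is passed in, so that a Java-aware thread class could be substituted. My understanding is that the GCJ builds of PyLucene provide PyLucene.PythonThread for this purpose, but treat that as an assumption to verify. The code is written for current Python 3 and defaults to threading.Thread so it runs without PyLucene.

```python
import queue
import threading


class TinyThreadPool:
    """Minimal worker pool whose thread class is injectable (hypothetical helper)."""

    def __init__(self, size, thread_class=threading.Thread):
        self._tasks = queue.Queue()
        self._threads = [thread_class(target=self._worker) for _ in range(size)]
        for thread in self._threads:
            thread.daemon = True
            thread.start()

    def _worker(self):
        while True:
            item = self._tasks.get()
            if item is None:  # sentinel: shut this worker down
                break
            func, args, done = item
            try:
                done["result"] = func(*args)
            except Exception as exc:  # hand the error back to the caller
                done["error"] = exc
            finally:
                done["event"].set()

    def call(self, func, *args):
        """Run func(*args) on a pool thread and block until it finishes."""
        done = {"event": threading.Event()}
        self._tasks.put((func, args, done))
        done["event"].wait()
        if "error" in done:
            raise done["error"]
        return done["result"]

    def stop(self):
        """Ask every worker to exit and wait for them."""
        for _ in self._threads:
            self._tasks.put(None)
        for thread in self._threads:
            thread.join()
```

Usage would look like pool = TinyThreadPool(2, thread_class=SomeJavaAwareThread), then pool.call(function, argument), where SomeJavaAwareThread stands in for whatever thread class the PyLucene build requires.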
Regards,
David

Re: "thread_get_state failed" + abort in PyLucene
2007-05-24 20:17:00

>>>>> "David" == David Pratt <fairwinds eastlink.ca> writes:

David> Hi Terry. I am curious which reactor are you using for this. Any
David> threads need to be java threads with pylucene. Not sure if this
David> might work but perhaps creating a thread pool with the right kind of
David> threads could help. Somewhere on this list a while back, someone put
David> together some thread pool code that you might find helpful for
David> this. Twisted also has its own thread pool so this would need some
David> modification. Have you tried testing your index outside of twisted
David> to see whether the issue persists.

Hi David,

I'm using poll, though not for any particular reason. Do you recommend I
try with something else? (I'll try epoll and select and see if they help).

I'm not using any threads in my code. Are you suggesting I create an
appropriate thread pool and then get Twisted to use it?

Testing outside of Twisted seems fine. I can't make it fail, even when I
call the same functions eventually called by the web2 interface, with the
same XML payload it receives, etc. I should have mentioned that too.

One other thing to mention is that the crash doesn't seem to happen when I
run twistd with the -n (no daemon) option. JP Calderone on the Twisted list
suggested that PyLucene might be somehow thrown off by the initial fork of
twistd. That doesn't seem to be the case though, as I rewrote things to
defer the PyLucene import (using __import__) until after twistd has
started. So I'm not sure what to make of the crash only happening when
twistd runs as a daemon.
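
The deferred-import approach described above might look roughly like the sketch below. This is not the code from the application in question: LazyModule is a hypothetical wrapper that calls __import__ on first attribute access, and it is written for current Python 3.

```python
class LazyModule:
    """Import the named module only when an attribute is first requested."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes not set in __init__, so no recursion.
        if self._module is None:
            # __import__ returns the top-level package, which is what we
            # want for a flat module name such as "PyLucene".
            self._module = __import__(self._name)
        return getattr(self._module, attr)
```

With lucene = LazyModule("PyLucene") at module level, nothing is loaded until the first lucene.IndexWriter reference, which in a twistd application would happen after the daemonizing fork.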
Thanks for your help.
Terry

Re: "thread_get_state failed" + abort in PyLucene
United States
2007-05-24 21:41:37

On Fri, 25 May 2007, Terry Jones wrote:

> One other thing to mention is that the crash doesn't seem to happen when I
> run twistd with -n (no daemon) option. JP Calderone on the Twisted list
> suggested that PyLucene might be somehow thrown off by the initial fork of
> twistd. That doesn't seem to be the case though, as I rewrote things to
> defer the PyLucene import (using __import__) until after twistd has
> started. So I'm not sure what to make of the crash only happening when
> twistd runs as a daemon.

I remember someone asking the same question on the java list at
gcc.gnu.org. Running a libgcj process after forking it is not really
supported.

See: http://gcc.gnu.org/ml/java/2007-03/msg00136.html
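
One general reason fork() and threaded runtimes mix badly is that the child process contains only the thread that called fork(). The sketch below shows that behaviour with plain Python threads on a POSIX system. It illustrates the general rule only and is not a reproduction of the libgcj failure; the function name is made up for this example and the code targets current Python 3.

```python
import os
import threading


def thread_counts_across_fork():
    """Return (thread count in parent, thread count in child) around os.fork()."""
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the thread that called fork() exists here.
        os.close(read_fd)
        os.write(write_fd, str(threading.active_count()).encode())
        os._exit(0)

    # Parent: read the child's report, then clean up.
    os.close(write_fd)
    child_count = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    worker.join()
    return parent_count, child_count
```

A runtime that started helper threads before the fork would find them missing in the child, which is one plausible way for thread bookkeeping to go wrong after daemonizing.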
Andi..

Re: "thread_get_state failed" + abort in PyLucene
Canada
2007-05-24 22:11:08

Hi Terry. You might want to google nxlucene for hints. I know they are
using a Twisted app successfully for Lucene search with PyLucene. Their
repository is open to view the code. The search server is XML-RPC and
returns RSS - HTH. I'd like to know how this all turns out. Can you post
back to the list if you have success? Many thanks.
Regards,
David