|
List Info
Thread: linux-thread-db.c not only caller of add_thread, -> gdb segv
|
|
| linux-thread-db.c not only caller of
add_thread, -> gdb segv |

|
2007-11-08 22:38:13 |
Hi. I'm trying to decide what's the best way to fix a bug
I've found.
linux-thread-db.c is not the only caller of add_thread, e.g.
infrun.c
calls it too.
linux-thread-db.c assumes thread_info.private is non-NULL,
e.g. here:
static int
clear_lwpid_callback (struct thread_info *thread, void
*dummy)
{
/* If we know that our thread implementation is 1-to-1, we
could
save
a certain amount of information; it's not clear how
much, so we
are always conservative. */
thread->private->th_valid = 0;
thread->private->ti_valid = 0;
return 0;
}
called from here:
#0 0x00000000004681a4 in clear_lwpid_callback
(thread=0xb132d0,
dummy=0x0) at ../../src/gdb/linux-thread-db.c:765
#1 0x00000000004f8480 in iterate_over_threads
(callback=0x46818d
<clear_lwpid_callback>, data=0x0) at
../../src/gdb/thread.c:204
#2 0x00000000004682a6 in thread_db_resume (ptid={pid = -1,
lwp = 0,
tid = 0}, step=0, signo=TARGET_SIGNAL_0) at
../../src/gdb/linux-thread-db.c:782
#3 0x00000000004eb83d in resume (step=0,
sig=TARGET_SIGNAL_0) at
../../src/gdb/infrun.c:614
#4 0x00000000004ebc14 in proceed
(addr=18446744073709551615,
siggnal=TARGET_SIGNAL_DEFAULT, step=0) at
../../src/gdb/infrun.c:805
This is iterating over all threads, some of which were
created via
add_thread calls outside of linux-thread-db.c, and thus
thread_info.private is NULL and boom, segv.
I can see two solutions
1) ensure all calls to add_thread properly initialize
thread_info.private
(via callback or whatever)
2) have linux-thread-db.c always check thread_info.private
before
dereferencing it.
Comments?
|
|
| Re: linux-thread-db.c not only caller of
add_thread, -> gdb segv |
  United States |
2007-11-09 08:02:25 |
On Thu, Nov 08, 2007 at 08:38:13PM -0800, Douglas Evans
wrote:
> Hi. I'm trying to decide what's the best way to fix a
bug I've found.
> linux-thread-db.c is not the only caller of add_thread,
e.g. infrun.c
> calls it too.
This is true. But if you're using the Linux native target,
and reach
the point where any of the other add_thread calls are made,
then
there's already a bug. It is supposed to detect all new
threads.
It sort of has to - that's how infrun gets the new ptid_t.
> I can see two solutions
> 1) ensure all calls to add_thread properly initialize
thread_info.private
> (via callback or whatever)
> 2) have linux-thread-db.c always check
thread_info.private before
> dereferencing it.
3) Figure out how you got to any of the other add_thread
calls.
--
Daniel Jacobowitz
CodeSourcery
|
|
| Re: linux-thread-db.c not only caller of
add_thread, -> gdb segv |

|
2007-11-09 19:26:56 |
On Nov 9, 2007 6:02 AM, Daniel Jacobowitz <drow false.org> wrote:
> This is true. But if you're using the Linux native
target, and reach
> the point where any of the other add_thread calls are
made, then
> there's already a bug. It is supposed to detect all
new threads.
> It sort of has to - that's how infrun gets the new
ptid_t.
Ah, thanks. I understand the threading support a bit better
now.
[for completeness' sake, target = x86_64-linux]
> 3) Figure out how you got to any of the other
add_thread calls.
Or that.
I think I understand what's going on. Consider a threaded
app that
first exec's itself
(e.g. because it wants to be run with a specific value for
LD_LIBRARY_PATH,
the actual reason doesn't matter though).
GDB doesn't properly handle the fact that the original
process is gone.
GDB detects the exec and removes thread_db_ops from the
target stack.
But (a) it doesn't clear out thread_list and (b) still
thinks it has control of
the running process.
By the time GDB gets to handle_inferior_event() case
TARGET_WAITKIND_EXECD
it has added the new process to thread_list with the call
to add_thread in infrun.c, and thus there's an entry in
thread_list with
"private" == NULL. thread_list also has stale
entries at this point.
If the user does a ^c after the exec and then "info
threads"
gdb will detect an internal error due to the stale entries
in thread_list
and crash.
But if instead the user does a ^c and then "sharedlib
some_library"
GDB will push thread_db_ops back on the target stack.
If the user then does "continue" thread_db_resume
gets called and boom,
segv because thread->private == NULL in
clear_lwpid_callback.
I'm working on a patch but I don't know this part of GDB
very well.
[small test case available, I'll include it with the patch]
|
|
| Re: linux-thread-db.c not only caller of
add_thread, -> gdb segv |
  United States |
2007-11-09 21:25:58 |
On Fri, Nov 09, 2007 at 05:26:56PM -0800, Douglas Evans
wrote:
> GDB detects the exec and removes thread_db_ops from the
target stack.
> But (a) it doesn't clear out thread_list and (b) still
thinks it has control of
> the running process.
(b) should be true, no?
Exec support is somewhat horrible. I posted a patch for it
recently
which might help (might not, I'm not sure).
--
Daniel Jacobowitz
CodeSourcery
|
|
| Re: linux-thread-db.c not only caller of
add_thread, -> gdb segv |

|
2007-11-09 21:40:32 |
On Nov 9, 2007 7:25 PM, Daniel Jacobowitz <drow false.org> wrote:
> On Fri, Nov 09, 2007 at 05:26:56PM -0800, Douglas Evans
wrote:
> > GDB detects the exec and removes thread_db_ops
from the target stack.
> > But (a) it doesn't clear out thread_list and (b)
still thinks it has control of
> > the running process.
>
> (b) should be true, no?
I suspect it's a matter of degrees (so to speak) or word
choice (apologies).
Until MAY_FOLLOW_EXEC is true for linux I'd expect gdb to
return control
to the user when an exec() happens.
Am I wrong in thinking gdb will lose control across the
exec()?
> Exec support is somewhat horrible. I posted a patch
for it recently
> which might help (might not, I'm not sure).
I'll try to find it and give it a try. Thanks.
|
|
| Re: linux-thread-db.c not only caller of
add_thread, -> gdb segv |

|
2007-11-09 23:20:25 |
On Nov 9, 2007 7:40 PM, Douglas Evans <dje google.com> wrote:
> I suspect it's a matter of degrees (so to speak) or
word choice (apologies).
> Until MAY_FOLLOW_EXEC is true for linux I'd expect gdb
to return control
> to the user when an exec() happens.
> Am I wrong in thinking gdb will lose control across the
exec()?
I guess what gdb does for an exec() when MAY_FOLLOW_EXEC==0
is at
least partially a matter of taste or definition. Running
some
experiments I can see that I can control (e.g. single step)
the "new"
process after the exec but the symbol info is all wrong if I
exec() a
different program.
It didn't take much to get gdb to follow the exec'ing of toy
programs.
So setting aside a discussion of what MAY_FOLLOW_EXEC==0
really means
on linux, how much work remains to support
MAY_FOLLOW_EXEC==1?
With this patch I can step across the exec from test-exec1.c
to test-exec2.c.
IWBN if gdb remembered the original program so that if I run
it again,
gdb reruns test-exec1.x and not test-exec2.x.
This is not a real patch, just an experiment. I had to
comment out
the call to target_mourn_inferior to avoid a hang in gdb
(and even a
hang in gdb of gdb). Perhaps some aspect of mourning is
required
here, but just not a full-blown one.
Any ideas what it would take to properly support
MAY_FOLLOW_EXEC==1?
|
|
|
| Re: linux-thread-db.c not only caller of
add_thread, -> gdb segv |
  United States |
2007-11-10 10:00:52 |
On Fri, Nov 09, 2007 at 09:20:25PM -0800, Douglas Evans
wrote:
> On Nov 9, 2007 7:40 PM, Douglas Evans <dje google.com> wrote:
> > I suspect it's a matter of degrees (so to speak)
or word choice (apologies).
> > Until MAY_FOLLOW_EXEC is true for linux I'd expect
gdb to return control
> > to the user when an exec() happens.
> > Am I wrong in thinking gdb will lose control
across the exec()?
>
> I guess what gdb does for an exec() when
MAY_FOLLOW_EXEC==0 is at
> least partially a matter of taste or definition.
Running some
> experiments I can see that I can control (e.g. single
step) the "new"
> process after the exec but the symbol info is all wrong
if I exec() a
> different program.
Right. Thus the patch I mentioned (sorry, should have found
the link
yesterday):
http://sourceware.org/ml/gdb-patches/2007-10/msg00455.
html
> With this patch I can step across the exec from
test-exec1.c to test-exec2.c.
> IWBN if gdb remembered the original program so that if
I run it again,
> gdb reruns test-exec1.x and not test-exec2.x.
Yeah. I haven't figured out what to do about that, or all
the various
implications... for instance, if you say "file"
should that override
the program to run the next time? So I've just fixed it as
it was and
not messed with the dubious UI. There may be more to come
on this topic...
--
Daniel Jacobowitz
CodeSourcery
|
|
[1-7]
|
|