
Thread: Re: GDB cannot access memory after Emacs abort




Re: GDB cannot access memory after Emacs abort
Germany
2007-11-11 17:01:04
On Sat, 10 Nov 2007 22:38:14 -0800 Michael Snyder
<msnyderspecifix.com> wrote:

> Hi Stephen, 
>
> See questions inline:
>
> On Sun, 2007-11-11 at 00:42 +0100, Stephen Berman wrote:
>> I recently experienced an abort in CVS Emacs and was unable to get a
>> backtrace from GDB.  The Emacs bug causing the abort was fixed, but
>> Richard Stallman responded to the lack of a backtrace with this comment:
>> 
>> > That could be a serious problem in GDB.  It would be good to talk
>> > with GDB developers about how to investigate it.  Since the bug's
>> > cause is known, they could focus on figuring out why GDB fails
>> > to give a backtrace.
>
> Out of curiosity, and because RMS seems to think it's
> relevant -- what is the bug's cause?

It had to do with the handling of named icons in GTK+ tool bars.  I
don't know the code well enough to say more, but the diff is here:
http://cvs.savannah.gnu.org/viewvc/emacs/emacs/src/gtkutil.c?r1=1.120&r2=1.121&pathrev=MAIN

>> Here is the GDB-relevant part of my bug report about the abort (Emacs
>> was built using the GTK+ toolkit):
>
> What's your host architecture?  OS?  How is gdb configured (host-target
> tuple)?

Linux escher 2.6.22.12-0.1-default i686 athlon i386 GNU/Linux (openSUSE 10.3)

GNU gdb 6.6.50.20070726-cvs
This GDB was configured as "i586-suse-linux".

> Making sure that I understand -- you ran emacs under gdb,
> you set a breakpoint at abort, you hit the breakpoint --
> and your desktop is locked up?

Yes (as Eli Zaretskii pointed out, Emacs set a breakpoint at abort).

> That seems unusual -- do you have any idea of the cause?

No!  I'm hoping someone here might have some insight.

> Is it possible that emacs is in an infinite recursion
> and has consumed all of virtual memory, or something
> of the sort?

This has happened on (rare) occasion, but it never locked up the
desktop, I could always at least kill -9 the emacs process from within
the X window system (in the case under discussion, I was able to switch
to a virtual tty and kill -9 the emacs process from there, but X was
locked up solid).

>> >   Cannot access memory at address 0x8321b6c
>
> Is that a valid address for your architecture?

How can I determine that?

Anyway, it sounds like you don't suspect a bug in GDB that prevented
getting a backtrace, or is that still a possibility?

Steve Berman



Re: GDB cannot access memory after Emacs abort
United States
2007-11-11 23:06:28
On Mon, 2007-11-12 at 00:01 +0100, Stephen Berman wrote:
> On Sat, 10 Nov 2007 22:38:14 -0800 Michael Snyder <msnyderspecifix.com> wrote:
> 
> > Hi Stephen, 
> >
> > See questions inline:
> >
> > On Sun, 2007-11-11 at 00:42 +0100, Stephen Berman wrote:
> >> I recently experienced an abort in CVS Emacs and was unable to get a
> >> backtrace from GDB.  The Emacs bug causing the abort was fixed, but
> >> Richard Stallman responded to the lack of a backtrace with this comment:
> >> 
> >> > That could be a serious problem in GDB.  It would be good to talk
> >> > with GDB developers about how to investigate it.  Since the bug's
> >> > cause is known, they could focus on figuring out why GDB fails
> >> > to give a backtrace.
> >
> > Out of curiosity, and because RMS seems to think it's
> > relevant -- what is the bug's cause?
> 
> It had to do with the handling of named icons in GTK+ tool bars.  I
> don't know the code well enough to say more, but the diff is here:
> http://cvs.savannah.gnu.org/viewvc/emacs/emacs/src/gtkutil.c?r1=1.120&r2=1.121&pathrev=MAIN

Fair enough -- I think that's out of my depth.  

> >> Here is the GDB-relevant part of my bug report about the abort (Emacs
> >> was built using the GTK+ toolkit):
> >
> > What's your host architecture?  OS?  How is gdb configured (host-target
> > tuple)?
> 
> Linux escher 2.6.22.12-0.1-default i686 athlon i386 GNU/Linux (openSUSE 10.3)
> 
> GNU gdb 6.6.50.20070726-cvs
> This GDB was configured as "i586-suse-linux".

Thanks, that's all very helpful.



> 
> > Making sure that I understand -- you ran emacs under gdb,
> > you set a breakpoint at abort, you hit the breakpoint --
> > and your desktop is locked up?
> 
> Yes (as Eli Zaretskii pointed out, Emacs set a breakpoint at abort).
> 
> > That seems unusual -- do you have any idea of the cause?
> 
> No!  I'm hoping someone here might have some insight.

OK -- just hoping for more info.  

> > Is it possible that emacs is in an infinite recursion
> > and has consumed all of virtual memory, or something
> > of the sort?
> 
> This has happened on (rare) occasion, but it never locked up the
> desktop, I could always at least kill -9 the emacs process from within
> the X window system (in the case under discussion, I was able to switch
> to a virtual tty and kill -9 the emacs process from there, but X was
> locked up solid).

Yeah, that's pretty rare in my experience -- but I'm not
a GUI or X hacker.

Seems like something that shouldn't happen, unless thru a
bug in X.  No client app should be able to take control of
the entire windowing system and prevent anything else from
getting access.

Please note, I'm not at all trying to say "this isn't 
our (gdb's) problem".  

> >> >   Cannot access memory at address 0x8321b6c
> >
> > Is that a valid address for your architecture?
> 
> How can I determine that?

Just thought you might know.  Forget about it.

> Anyway, it sounds like you don't suspect a bug in GDB that prevented
> getting a backtrace, or is that still a possibility?

Oh, I didn't mean to imply that at all.
Just asking for more information.

So just to define the problem ...

1) emacs calls abort.  This is apparently due to a bug
   in emacs, for which you already have a patch.

2) While emacs is held at the abort by gdb, your X system
   is frozen and you can't get any other X window client to
   work.  We don't know the cause of this, but it probably
   isn't gdb, so let's forget about it in this context.

3) Within gdb, when you're at the breakpoint at abort, 
   backtrace doesn't work.  This part is within our domain.

Now 3 could well be a bug in gdb, but there are other
possibilities.  Something could have corrupted the stack
so badly that gdb can't unwind it.

Personally I don't see how to decide that question, based
on the information we have so far.  Maybe Daniel and/or Eli
might have an idea?

To pursue it further, we can go one of two ways:

A) maybe you can provide us with enough information and
context to reproduce the problem ourselves?  This seems 
unlikely, but maybe, for instance, you know that with 
a certain released version of emacs and a certain released
version of linux, you can give a fixed sequence of commands
and reliably reproduce the crash?

or 

B) we can keep asking you for more information, question
and answer style.

For instance, I'd like to know the output that you get
from the following gdb commands when you're at the
breakpoint:

i) info registers
ii) info target
iii) x /64x $esp
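
As an aside on the earlier question of whether 0x8321b6c is a valid
address: on Linux, one way to check is to compare the address against
the process's memory map, which gdb prints with "info proc mappings"
and the kernel exposes as /proc/PID/maps.  The following is only a
rough sketch of that check.  The function names are made up for
illustration, and SAMPLE_MAPS is an invented map, not the actual
layout of the emacs process in this report.

```python
def parse_maps(maps_text):
    """Parse text in /proc/PID/maps format into (start, end, perms, path) tuples."""
    regions = []
    for line in maps_text.splitlines():
        # Fields: address-range, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 2:
            continue
        start_s, end_s = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        regions.append((int(start_s, 16), int(end_s, 16), fields[1], path))
    return regions


def find_mapping(regions, addr):
    """Return the region containing addr, or None if addr is unmapped."""
    for start, end, perms, path in regions:
        # The end address of a maps entry is exclusive.
        if start <= addr < end:
            return (start, end, perms, path)
    return None


# Invented sample, for illustration only (NOT taken from the bug report).
SAMPLE_MAPS = """\
08048000-08200000 r-xp 00000000 08:01 1234 /usr/bin/example
08200000-08400000 rw-p 001b8000 08:01 1234 /usr/bin/example
bf800000-bf821000 rw-p 00000000 00:00 0 [stack]
"""

if __name__ == "__main__":
    regions = parse_maps(SAMPLE_MAPS)
    hit = find_mapping(regions, 0x8321b6c)
    print("mapped in" if hit else "not mapped:", hit)
```

To try this against a live process, one would read /proc/PID/maps
(with PID being the inferior's process id) and pass its contents to
parse_maps.  An address that falls in no region would be consistent
with gdb's "Cannot access memory" message, though it would not by
itself say why the address was being read.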



