Thread: Associating C++ new with constructor
|
|
| Associating C++ new with constructor |

|
2007-02-15 01:35:16 |
This is NOT about the problems setting a breakpoint on a C++
constructor!
I've been looking for a way to count creations and destructions of C++
objects, with the counts kept on a per class basis. I've looked at
various tools to track memory use, but none of them appear to provide
this directly (even when attention is limited to the free store).
Some capture the call stack when the call to new is made, but this
doesn't definitively identify the class in question. (It can provide
the line of the call, but that's an imperfect indicator. If there are
multiple calls on one line, or the call is split over several lines,
things are ugly. And if we have
    template<typename T> class Foo {
        .....
        T* p = new T(); // maybe needs to be new typename T()?
        .....
    };
the source is insufficient to identify the type.)
The problem is that the call to new precedes the call to the
constructor; new doesn't know the type, and the type is not on the
call stack.
Might it be possible, programmatically, to trace the stack up, locate
the call that will be made next, and get the constructor that way?
Here's a toy example on linux-i386:
Source line is
A *pA = new A();
disassembly
0x08048c9c <main+86>: movl $0x2c,0xc(%esp)
0x08048ca4 <main+94>: movl $0x804910d,0x8(%esp)
0x08048cac <main+102>: movl $0x8049166,0x4(%esp)
0x08048cb4 <main+110>: movl $0x4,(%esp)
0x08048cbb <main+117>: call 0x8048eac <_ZnwjPKcS0_m>
0x08048cc0 <main+122>: mov %eax,%ebx
0x08048cc2 <main+124>: mov %ebx,(%esp)
0x08048cc5 <main+127>: call 0x8048e9e <A>
0x08048cca <main+132>: mov %ebx,0xfffffff4(%ebp)
The call stack has main+122 on it. Adding 5 gets me to the next call,
and
(gdb) info symbol 0x8048e9e
A::A() in section .text
(which I suppose is what the <A> annotation in the disassembly was
about).
So my recipe is
a) get the address of the caller
b) add 5
c) extract the location being called there
d) get the symbol being called, and use it to identify the class
(this last step involves several substeps, including notably doing the
disassembly, getting the symbol, unmangling the symbol, and extracting
the class name).
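The last part of step d) can be sketched in isolation. This is a minimal, non-authoritative sketch that assumes a GCC- or Clang-style toolchain (Itanium C++ ABI) and uses abi::__cxa_demangle from <cxxabi.h>. The helper names demangle and class_of_constructor are made up for illustration, and _ZN1AC1Ev is the usual mangled name for A::A() under that ABI. It covers only the unmangling and the class-name extraction, not the disassembly or the symbol lookup.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>   // GCC/Clang-specific: abi::__cxa_demangle
#include <string>

// Demangle an Itanium-ABI symbol name; returns "" on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && out) ? out : "";
    std::free(out);  // the demangler allocates with malloc
    return result;
}

// Given a demangled constructor such as "ns::A::A()", return "ns::A".
// Returns "" if the name does not look like a plain constructor.
// (Hypothetical helper: simple string matching, no template support.)
std::string class_of_constructor(const std::string& demangled) {
    const std::string::size_type npos = std::string::npos;
    std::string::size_type paren = demangled.find('(');
    if (paren == npos) return "";
    std::string qualified = demangled.substr(0, paren);   // "ns::A::A"
    std::string::size_type sep = qualified.rfind("::");
    if (sep == npos) return "";
    std::string scope = qualified.substr(0, sep);          // "ns::A"
    std::string last = qualified.substr(sep + 2);          // "A"
    // A constructor's own name repeats the last component of its scope.
    std::string::size_type scope_sep = scope.rfind("::");
    std::string scope_last =
        (scope_sep == npos) ? scope : scope.substr(scope_sep + 2);
    return (scope_last == last) ? scope : "";
}
```

For instance, class_of_constructor(demangle("_ZN1AC1Ev")) gives "A", while a name that is not a constructor, such as the operator new symbol in the disassembly, gives an empty string. Template classes (Foo<int>::Foo()) would need real parsing rather than this string matching.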
Does that have any chance of working with any generality?
I'm not necessarily looking to do the instrumentation from within gdb,
though I'd certainly like to avoid having to redo the logic of getting
debug info, etc.
The most obvious concern is that calls to the c'tor might get
optimized away (for example, class A above is empty, but I compiled
with g++ -O0).
It would also be nice to get the address of new automatically. Can
anyone explain why this symbolic lookup failed? (The program wasn't
running.)
(gdb) info symb 0x8048eac # works
operator new(unsigned int, char const*, char const*, unsigned long) in section .text
(gdb) info add 'operator new(unsigned int, char const*, char const*, unsigned long)' # fails
No symbol "'operator new(unsigned int, char const*, char const*, unsigned long)'" in current context.
I tried several variants of the name for new; none worked.
Incidentally, it would be nice to be able to get all c'tor calls, not
just those associated with the heap, but that obviously does run into
the problems putting breakpoints on them. It would also require
identifying all the c'tors.
Thanks.
Ross Boylan
|
|
| Re: Associating C++ new with constructor |
  United States |
2007-02-15 05:50:14 |
On Wed, Feb 14, 2007 at 11:35:16PM -0800, Ross Boylan wrote:
> Does that have any chance of working with any generality?
No, none. The compiler can do an arbitrary amount of work between the
two calls. They can end up with branches in between, or a call to new
could be used to conditionally create one of two objects that happen
to have the same size.
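To make the second case concrete, here is a hand-written sketch of the shape of code an optimizer is permitted to produce for something like `flag ? new X : new Y` when X and Y have the same size: one allocation call, followed by a branch that picks the constructor. The types X and Y and the functions make_one and destroy are hypothetical, and this is an illustration of the problem, not output from any particular compiler.

```cpp
#include <cassert>
#include <new>
#include <string>

struct Base {
    virtual ~Base() {}
    virtual std::string name() const = 0;
};
struct X : Base {
    int v;
    std::string name() const override { return "X"; }
};
struct Y : Base {
    int v;
    std::string name() const override { return "Y"; }
};

static_assert(sizeof(X) == sizeof(Y), "the example assumes equal sizes");

// One call to operator new; which constructor runs is decided afterwards.
// The return address of the allocation is the same for both types, so
// "the next call after new" does not identify a single class.
Base* make_one(bool want_x) {
    void* raw = ::operator new(sizeof(X));
    if (want_x) return new (raw) X();  // placement new: construct only
    return new (raw) Y();
}

// Matching teardown for objects built by make_one.
void destroy(Base* p) {
    void* raw = dynamic_cast<void*>(p);  // address of the complete object
    p->~Base();
    ::operator delete(raw);
}
```

Both make_one(true) and make_one(false) reach the allocator from the same call site, which is why a stack trace captured inside new cannot tell them apart.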
--
Daniel Jacobowitz
CodeSourcery
|
|
| Re: Associating C++ new with constructor |
  Israel |
2007-02-15 07:35:14 |
Ross Boylan wrote:
> This is NOT about the problems setting a breakpoint on a C++
> constructor!
>
> I've been looking for a way to count creations and destructions of C++
> objects, with the counts kept on a per class basis.
Many (9?) years ago I wrote a dumb perl script to do this task. It is
still being used from time to time, despite valgrind's massif.
I can't give you the script, without considerable paperwork, since
it is (C) IBM (and I won't try because it is too embarrassing to
release such a trivial script).
The idea is simple, and took me less than a day to implement:
Define a template class that can monitor object creation and
destruction.
Insert a member of this type into all non-POD classes:
    class MyFoo {
        ....
        ....
        ...
        Count<MyFoo> m_count_for_MyFoo;
    };
This member will be constructed every time your MyFoo is
constructed. Inside the Count class you can do whatever
you want, every Count<T> class may update a global
registry, from which you can print statistics for all Count
classes.
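The original script and its Count class are not available, so the following is only a minimal sketch of what such a counter could look like. Everything except the names Count, MyFoo and m_count_for_MyFoo is an assumption: the Tally struct, the registry() function, the use of typeid(T).name() as the key (which is a mangled name on GCC), and print_statistics.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>
#include <typeinfo>

// Per-class totals kept in the global registry (hypothetical layout).
struct Tally {
    long created = 0;
    long destroyed = 0;
};

// Function-local static avoids initialization-order problems when
// counted objects are constructed during static initialization.
inline std::map<std::string, Tally>& registry() {
    static std::map<std::string, Tally> table;
    return table;
}

// Member-style counter: one instance lives inside every counted object.
template <typename T>
class Count {
public:
    Count() { ++registry()[typeid(T).name()].created; }
    // A copy of the enclosing object is a new object, so count it too.
    Count(const Count&) { ++registry()[typeid(T).name()].created; }
    Count& operator=(const Count&) { return *this; }  // no new object
    ~Count() { ++registry()[typeid(T).name()].destroyed; }
};

class MyFoo {
public:
    MyFoo() : value(0) {}
    explicit MyFoo(int v) : value(v) {}
    int value;
private:
    Count<MyFoo> m_count_for_MyFoo;  // the inserted member
};

// Print one line per counted class.
void print_statistics() {
    for (const auto& entry : registry()) {
        std::printf("%s: %ld created, %ld destroyed\n",
                    entry.first.c_str(),
                    entry.second.created, entry.second.destroyed);
    }
}
```

Because the member is constructed by every constructor of MyFoo, including the implicit copy constructor, no constructor needs to be edited by hand. The sketch is not thread-safe, and objects destroyed after the registry itself has been torn down at program exit would need extra care.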
Note that the perl script simply detects "class ....." and
inserts the members into it. The nice thing about this
template thing is that it should work well even
if your MyFoo is also templated, and it works
well for all possible constructors.
It is imperfect since it can't reliably detect all
non-POD classes, and it does not work for std::*
classes. To make it 100% reliable, I had to write
a complete C++ parser!
--
Michael Veksler
http://tx.technion.ac.il/~mveksler
|
|
| Re: Associating C++ new with constructor |

|
2007-02-15 12:26:25 |
Daniel, thanks for your response. I've been trying to avoid messing
with the source code, but it sounds as if I'm stuck with it.
I suppose one other option would be to go to the source line of the call
to new and try to pull the class out of it. It might work well enough
to do the job.
I was thinking of tools to rewrite the source automatically, and
Michael provides one suggestion below. I have a few questions about
what he wrote.
Before I get to that, though, can anyone explain why gdb was able to
give me the name of the new() function given the address, but couldn't
give me the address when given the name of the function (details in
the original post)?
On Thu, Feb 15, 2007 at 03:35:14PM +0200, Michael Veksler wrote:
> Ross Boylan wrote:
> >This is NOT about the problems setting a breakpoint on a C++
> >constructor!
> >
> >I've been looking for a way to count creations and destructions of C++
> >objects, with the counts kept on a per class basis.
> Many (9?) years ago I wrote a dumb perl script to do this task. It is
> still being used from time to time, despite valgrind's massif.
> I can't give you the script, without considerable paperwork, since
> it is (C) IBM (and I won't try because it is too embarrassing to
> release such a trivial script).
No problem.
>
> The idea is simple, and took me less than a day to implement:
> Define a template class that can monitor object creation and
> destruction.
> Insert a member of this type into all non-POD classes:
>     class MyFoo {
>         ....
>         ....
>         ...
>         Count<MyFoo> m_count_for_MyFoo;
>     };
>
> This member will be constructed every time your MyFoo is
> constructed. Inside the Count class you can do whatever
> you want, every Count<T> class may update a global
> registry, from which you can print statistics for all Count
> classes.
One nice thing about this vs. the memory corruption detection tools is
that it catches all instance creation, not just stuff on the heap.
>
> Note that the perl script simply detects "class ....." and
> inserts the members into it. The nice thing about this
> template thing is that it should work well even
> if your MyFoo is also templated, and it works
> well for all possible constructors.
>
> It is imperfect since it can't reliably detect all
> non-POD classes, and it does not work for std::*
> classes. To make it 100% reliable, I had to write
> a complete C++ parser!
Why isn't this reliable for all classes in your source code (at least
if you add a search for "struct")? Are you referring to cases that
use pre-processor magic, or is there something else?
Another approach is in libcwd, which records the class along with the
memory allocation if you insert a call to AllocTag() after the call to
new. AllocTag uses templates under the hood to get the type of the
pointer automatically, and libcwd provides a macro NEW that will take
care of it automatically, e.g., A *pA = NEW(A()).
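The general technique, deducing the type from the pointer with a function template and hiding the call behind a macro, can be sketched as follows. This is not libcwd's implementation or API: tag_alloc, TAGGED_NEW and alloc_types are hypothetical names, and the "record" is just a map from address to typeid name.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeinfo>

// Address -> type name of the object allocated there (hypothetical store).
inline std::map<const void*, std::string>& alloc_types() {
    static std::map<const void*, std::string> table;
    return table;
}

// Template argument deduction recovers T from the pointer that the
// new-expression returns, so the caller never has to spell the type twice.
template <typename T>
T* tag_alloc(T* p) {
    alloc_types()[static_cast<const void*>(p)] = typeid(T).name();
    return p;
}

// Variadic so that commas in constructor arguments or template
// argument lists do not split the macro argument.
#define TAGGED_NEW(...) tag_alloc(new __VA_ARGS__)

struct B {};
struct A {
    B* b;
    explicit A(B* b_) : b(b_) {}
    ~A() { delete b; }
};
```

Because the macro expands to a single expression, nested uses such as TAGGED_NEW(A(TAGGED_NEW(B))) work. The sketch never removes entries on delete and is not thread-safe; a real tool would hook deallocation as well.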
This problem was driving me crazy enough that I was looking into
putting hooks in gcc, which has the advantage of providing a C++
parser. That approach has a few disadvantages. Aside from being a
big project (at least for someone like myself who knows little about
compilers), it would be impractical to redistribute.
There seems to be some possibility that the upcoming revision of the
C++ standard will incorporate improved reflection capabilities; that
might help with this problem.
Ross
|
|
| Re: Associating C++ new with constructor |
  Israel |
2007-02-15 14:29:25 |
Ross Boylan wrote:
...
> On Thu, Feb 15, 2007 at 03:35:14PM +0200, Michael Veksler wrote:
>
>> The idea is simple, and took me less than a day to implement:
>> Define a template class that can monitor object creation and
>> destruction.
>> Insert a member of this type into all non-POD classes:
>>     class MyFoo {
>>         ....
>>         ....
>>         ...
>>         Count<MyFoo> m_count_for_MyFoo;
>>     };
>>
The difficulty with this is that it will work reliably only for
non-POD classes. For POD (Plain Old Data) classes or
structs, you are not allowed to automatically modify them,
because they might participate in code such as
    struct A {
        int a, b;
    };
    void Serialize(ostream & out, const A& a) {
        out.write(reinterpret_cast<const char*>(&a), sizeof(a));
    }
This is well defined only if A is a POD. If you add a Count
instance, then the program:
1. becomes undefined
2. outputs wrong stuff to the stream.
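The effect of the extra member can be checked at compile time. The sketch below uses C++11 type traits, which postdate this thread, and a stand-in Counter class with user-provided special members (an assumption about what a Count<T> would need). It shows that the instrumented struct is larger and is no longer trivially copyable, so the language no longer guarantees that copying its bytes out and back in reproduces the object.

```cpp
#include <cassert>
#include <type_traits>

// The original POD from the example above.
struct A {
    int a, b;
};

// Stand-in for Count<T>: user-provided constructor, copy constructor
// and destructor, as any counting member would need.
struct Counter {
    Counter() {}
    Counter(const Counter&) {}
    ~Counter() {}
};

// What A looks like after the member has been inserted automatically.
struct CountedA {
    int a, b;
    Counter m_count;
};

static_assert(std::is_trivially_copyable<A>::value,
              "A can be written and read back as raw bytes");
static_assert(!std::is_trivially_copyable<CountedA>::value,
              "CountedA has a non-trivial copy constructor and destructor");
static_assert(sizeof(CountedA) > sizeof(A),
              "the byte image written by Serialize changes size");
```

So even when nothing crashes, a Serialize that writes sizeof(a) bytes would now emit the counter's storage as well, which changes the stream format.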
> One nice thing about this vs. the memory corruption detection tools is
> that it catches all instance creation, not just stuff on the heap.
>
And that's one of the reasons it is still used despite the other tools.
>> It is imperfect since it can't reliably detect all
>> non-POD classes, and it does not work for std::*
>> classes. To make it 100% reliable, I had to write
>> a complete C++ parser!
>>
>
> Why isn't this reliable for all classes in your source code (at least
> if you add a search for "struct")? Are you referring to cases that
> use pre-processor magic, or is there something else?
>
Read my comment above about the importance of non-POD checks.
There is no preprocessor magic, only a trivial perl script that inserts
instances into non-POD classes. The inaccuracy comes from
1. It is difficult to find all definitions of all classes. You should
   ignore occurrences of the "class" word in strings and in /* */ comments.
   You should find "class" words expanded by some user macros.
   This is virtually impossible to get right without at least running
   a preprocessor.
2. It is difficult to be certain that some class is a non-POD class,
   since its POD-ness depends also on its parents and members.
   Also, user's macros may hide stuff, like virtual methods.
   Templates make it even more difficult.
I simply made some dumb assumptions in my script that all "class"
definitions are non-POD (or at least never used as POD), and
all "struct" definitions are suspected to be POD. Fortunately,
this assumption is correct most of the time due to common
coding conventions.
> Another approach is in libcwd, which records the class along with the
> memory allocation if you insert a call to AllocTag() after the call to
> new. AllocTag uses templates under the hood to get the type of the
> pointer automatically, and libcwd provides a macro NEW that will take
> care of it automatically, e.g., A *pA = NEW(A()).
>
So you have to write your code this way, or write a preprocessor script
to modify
    A *pA = new A(new B, new C[5]);
to
    A *pA = NEW(A(NEW(B), NEW_ARRAY(C, 5)));
(I'm just guessing how new for an array is done; I don't really use libcwd).
> This problem was driving me crazy enough that I was looking into
> putting hooks in gcc, which has the advantage of providing a C++
> parser. That approach has a few disadvantages. Aside from being a
> big project (at least for someone like myself who knows little about
> compilers), it would be impractical to redistribute.
>
I don't think there should be any problem to redistribute
such a tool.
I know of at least one such project.
--
Michael Veksler
http://tx.technion.ac.il/~mveksler
|
|
| Re: Associating C++ new with constructor |
  United States |
2007-02-15 16:10:02 |
On Thu, 2007-02-15 at 22:29 +0200, Michael Veksler wrote:
> Ross Boylan wrote:
> ...
> > On Thu, Feb 15, 2007 at 03:35:14PM +0200, Michael Veksler wrote:
> >
> >> The idea is simple, and took me less than a day to implement:
> >> Define a template class that can monitor object creation and
> >> destruction.
> >> Insert a member of this type into all non-POD classes:
> >>     class MyFoo {
> >>         ....
> >>         ....
> >>         ...
> >>         Count<MyFoo> m_count_for_MyFoo;
> >>     };
> >>
> The difficulty with this is that it will work reliably only for
> non-POD classes. For POD (Plain Old Data) classes or
> structs, you are not allowed to automatically modify them,
> because they might participate in code such as
>     struct A {
>         int a, b;
>     };
>     void Serialize(ostream & out, const A& a) {
>         out.write(reinterpret_cast<const char*>(&a), sizeof(a));
>     }
>
> This is well defined only if A is a POD. If you add a Count
> instance, then the program:
> 1. becomes undefined
> 2. outputs wrong stuff to the stream.
I'm not following. Do you mean that the C++ standard doesn't like the
code, or just that streaming out (and back in) an arbitrary object (the
counter instance) is unlikely to work, along with any other operations
that treat A as a bucket of bits (e.g., memmove)?
....
> >> It is imperfect since it can't reliably detect all
> >> non-POD classes, and it does not work for std::*
> >> classes. To make it 100% reliable, I had to write
> >> a complete C++ parser!
> >>
> >
> > Why isn't this reliable for all classes in your source code (at least
> > if you add a search for "struct")? Are you referring to cases that
> > use pre-processor magic, or is there something else?
> >
> Read my comment above about the importance of non-POD checks.
> There is no preprocessor magic, only a trivial perl script that inserts
> instances into non-POD classes. The inaccuracy comes from
>
> 1. It is difficult to find all definitions of all classes. You should
>    ignore occurrences of the "class" word in strings and in /* */ comments.
>    You should find "class" words expanded by some user macros.
That's what I meant by my reference to "preprocessor magic" above.
>    This is virtually impossible to get right without at least running
>    a preprocessor.
> 2. It is difficult to be certain that some class is a non-POD class,
>    since its POD-ness depends also on its parents and members.
>    Also, user's macros may hide stuff, like virtual methods.
>    Templates make it even more difficult.
>
> I simply made some dumb assumptions in my script that all "class"
> definitions are non-POD (or at least never used as POD), and
> all "struct" definitions are suspected to be POD. Fortunately,
> this assumption is correct most of the time due to common
> coding conventions.
> > Another approach is in libcwd, which records the class along with the
> > memory allocation if you insert a call to AllocTag() after the call to
> > new. AllocTag uses templates under the hood to get the type of the
> > pointer automatically, and libcwd provides a macro NEW that will take
> > care of it automatically, e.g., A *pA = NEW(A()).
> >
> So you have to write your code this way, or write a preprocessor script
> to modify
>
>     A *pA = new A(new B, new C[5]);
>
> to
>
>     A *pA = NEW(A(NEW(B), NEW_ARRAY(C, 5)));
Yes (I'm not sure about the array syntax either). Actually, nesting the
NEWs might blow up, since the macro expands to a couple of statements (I
think--I guess it could use a "," operator).
>
> (I'm just guessing how new for an array is done; I don't really use libcwd).
> > This problem was driving me crazy enough that I was looking into
> > putting hooks in gcc, which has the advantage of providing a C++
> > parser. That approach has a few disadvantages. Aside from being a
> > big project (at least for someone like myself who knows little about
> > compilers), it would be impractical to redistribute.
> >
> I don't think there should be any problem to redistribute such a tool.
> I know of at least one such project.
If my instructions for building the test suite start with "build a
custom compiler with this patch set" that is likely to be off-putting.
In fact, it's off-putting for me.
There's a project to make it easier to create addons to gcc (GEM), but
since it's not in the main distribution one has to build a custom
compiler to use an addon anyway.
--
Ross Boylan                                  wk:  (415) 514-8146
185 Berry St #5700                           ross biostat.ucsf.edu
Dept of Epidemiology and Biostatistics       fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739                 hm:  (415) 550-1062