|
List Info
Thread: Orderly global destruction
|
|
| Orderly global destruction |
  United States |
2008-02-20 20:43:02 |
Greets,
This post will be of interest only to a few individuals, but
I'm
sending it to the list so that it ends up in the public
record, and
because it is relevant to the RT issue #32689
"KinoSearch::Simple and
DESTROY". It arises out of a phone conversation
between Peter Karman
and I last night.
When you call Devel::Peek: ump($obj
) on a "blessed reference"
returned by the Perl function bless(), you see two SV
structs: an
outer RV, and the inner object SV:
$ perl -MDevel::Peek -e 'my $blank; Dump bless $blank,
"Foo"'
SV = RV(0x81a418) at 0x800d80
REFCNT = 1
FLAGS = (TEMP,ROK)
RV = 0x800e34
SV = PVMG(0x80e440) at 0x800e34
REFCNT = 2
FLAGS = (PADBUSY,PADMY,OBJECT)
IV = 0
NV = 0
PV = 0
STASH = 0x800f0c "Foo"
KinoSearch stores only the inner SV, discarding the outer.
It does
this by first incrementing the refcount of the inner object,
then
decrementing the refcount of the RV wrapper, triggering the
RV's
reclamation without killing off the inner object. It's kind
of
pointless to create and then immediately destroy the outer
RV, but we
have to do things that way because there are no public Perl
C API
functions that allow for the direct creation of an inner
object.
The relevant KS functions are prototyped in
trunk/c_src/KinoSearch/
Obj.h and implemented in trunk/perl/xs/KinoSearch/Obj.c, but
the
following function illustrates what the code would look like
if it
were collected into one place and unrolled for clarity.
We'll use
InvIndex as an example class.
typedef struct InvIndex {
VirtualTable *_;
SV *inner_perl_object;
Schema *schema;
Folder *folder;
} InvIndex;
InvIndex*
InvIndex_new(Schema *schema, Folder *folder)
{
kino_InvIndex *self =
(kino_InvIndex*)malloc(sizeof(kino_InvIndex));
SV *perl_obj_rv = newSV(0); /* soon to be an RV */
/* Turn perl_obj_rv into an RV and create an inner
Perl object,
* which right now only the RV can
"see".
*/
sv_setref_pv(perl_obj_rv,
"KinoSearch::InvIndex", self);
/* Copy a pointer to the inner Perl object into the
KS obj. */
self->inner_perl_obj = SvRV(perl_obj_ref);
/* Increment the refcount of the inner object, so
that it won't
* go away when we decrement the RV.
*/
SvREFCNT_inc(self->inner_perl_obj);
/* Decrement the outer RV. Its refcount falls to
0, and Perl
* frees it. Now [self] holds the sole refcount
keeping
* the inner Perl object alive.
*/
SvREFCNT_dec(perl_obj_rv);
/* Assign the vtable pointer for the InvIndex
class. */
self->_ = (VirtualTable*)&INVINDEX;
/* Increment the refcounts of schema and folder and
copy
pointers
* into the InvIndex object.
*/
self->schema = KINO_REFCOUNT_INC(schema);
self->folder = KINO_REFCOUNT_INC(folder);
return self;
}
There is an esoteric but important reason for doing things
this way.
During Perl global destruction, all objects pointed to by
RVs are
reclaimed in a first sweep. Then later, Perl goes and
sweeps over all
existing SVs, repeatedly decrementing their refcounts until
every last
one gets DESTROYed.
It is essential that all composite KS objects get DESTROYed
during the
*first* sweep.
Consider this snippet, which does nothing but create an
InvIndex object:
use MySchema;
my $invindex = MySchema->open('/path/to/invindex');
Under the current implementation, at the end of that script
there are
three Perl objects in existence, BUT ONLY ONE RV: $invindex,
visible
from Perl space. The first sweep will catch this one RV,
decrementing
its refcount, and trigger an orderly cascade of detructor
calls.
However, if KS were to keep RVs around instead of only the
inner
object, at the end of that script there would be 4 RVs in
existence:
one in Perl space, and three in C space. Because the second
phase of
global destruction is unordered, it's entirely possible that
either
the Schema or the Folder object will get DESTROYed before
the
InvIndex. If that happens, we'll have a nasty double-free
problem:
void
InvIndex_destroy(InvIndex *self)
{
REFCOUNT_DEC(self->schema); /* BAD if
self->schema already
freed */
REFCOUNT_DEC(self->folder);
FREE_OBJ(self);
}
Fortunately, that can't happen the way things are now.
Congratulations if you've fought your way through this
entire
message. If the reasoning seems complicated, well, it sort
of is --
but that's XS for you. Now you know why there is absolutely
no XS any
more in trunk/c_src, where most of KS is implemented.
(Footnote: On the off chance that someone who reads this
message wants
to browse the relevant portions of the Perl source code,
check out
Perl_destruct in perl.c, and sv_clean_objs/sv_clean_all in
sv.c.)
Marvin Humphrey
Rectangular Research
http://www.rectangular.co
m/
Excerpt from a comment in sv.c:
------------------------------------------------------------
--------------
The function visit() scans the SV arenas list, and calls a
specified
function for each SV it finds which is still live - ie which
has an
SvTYPE
other than all 1's, and a non-zero SvREFCNT. visit() is used
by the
following functions (specified as [function that calls
visit()] /
[function
called by visit() for each SV]):
sv_report_used() / do_report_used()
dump all remaining SVs (debugging
aid)
sv_clean_objs() /
do_clean_objs(),do_clean_named_objs()
Attempt to free all objects pointed
to by RVs,
and, unless
DISABLE_DESTRUCTOR_KLUDGE is
defined,
try to do the same for all objects
indirectly
referenced by typeglobs too.
Called once from
perl_destruct(), prior to calling
sv_clean_all()
below.
sv_clean_all() / do_clean_all()
SvREFCNT_dec(sv) each remaining SV,
possibly
triggering an sv_free(). It also
sets the
SVf_BREAK flag on the SV to
indicate that the
refcnt has been artificially
lowered, and thus
stopping sv_free() from giving
spurious
warnings
about SVs which unexpectedly have a
refcnt
of zero. called repeatedly from
perl_destruct()
until there are no SVs left.
_______________________________________________
KinoSearch mailing list
KinoSearch rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch
|
|
| Re: Orderly global destruction |
  United States |
2008-02-20 20:55:05 |
Marvin Humphrey wrote on 2/20/08 8:43 PM:
> Greets,
>
> This post will be of interest only to a few
individuals, but I'm sending
> it to the list so that it ends up in the public record,
and because it
> is relevant to the RT issue #32689
"KinoSearch::Simple and DESTROY". It
> arises out of a phone conversation between Peter Karman
and I last night.
>
Thanks, Marvin. A thorough and well-commented post on the
subject. And glad that
you clarified why the make-a-perl-RV-then-free-it step is
necessary in the
initial object creation. I did not know the details of
Perl's SV cleanup, and
can now see quite clearly why storing only the inner SV in C
space is necessary.
--
Peter Karman . http://peknet.com/ .
peter peknet.com
_______________________________________________
KinoSearch mailing list
KinoSearch rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch
|
|
| Re: Orderly global destruction |

|
2008-02-20 22:04:18 |
On 2/20/08, Marvin Humphrey <marvin rectangular.com> wrote:
> It's kind of
> pointless to create and then immediately destroy the
outer RV, but we
> have to do things that way because there are no public
Perl C API
> functions that allow for the direct creation of an
inner object.
It's been a while since I've done this, but isn't the inner
object
just a SV with the IV field pointing to the C object? If
so, could
you create it directly? I think the exact type of SV
shouldn't matter
much, as everything (I think ) can store an IV.
A little quick web searching seems to show that current
idiom is
something like sv_set_iv(sv, PTR2IV(ptr)). Is the problem
that PTR2IV
isn't public? If so, I'd guess that would have to be just
an
oversight, as traditionally it was just a (IV) cast.
The solution you described seems like it would work fine,
but maybe
the direct creation approach would sidestep the need for
it.
Nathan Kurz
nate verse.com (not very good at lurking)
_______________________________________________
KinoSearch mailing list
KinoSearch rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch
|
|
| Re: Orderly global destruction |
  United States |
2008-02-21 14:18:48 |
On Feb 20, 2008, at 8:04 PM, Nathan Kurz wrote:
> On 2/20/08, Marvin Humphrey <marvin rectangular.com> wrote:
>> It's kind of
>> pointless to create and then immediately destroy
the outer RV, but we
>> have to do things that way because there are no
public Perl C API
>> functions that allow for the direct creation of an
inner object.
>
> It's been a while since I've done this, but isn't the
inner object
> just a SV with the IV field pointing to the C object?
If so, could
> you create it directly? I think the exact type of SV
shouldn't matter
> much, as everything (I think ) can store an IV.
>
> A little quick web searching seems to show that current
idiom is
> something like sv_set_iv(sv, PTR2IV(ptr)). Is the
problem that PTR2IV
> isn't public? If so, I'd guess that would have to be
just an
> oversight, as traditionally it was just a (IV) cast.
PTR2IV is documented in perlguts. It's not in perlapi, but
it should
be. Using it is safe.
However, it's not that simple. These are the changes to
trunk/perl/xs/
KinoSearch/Obj.c that it would really take to switch to
direct creation:
+ /* Code cribbed from Perl_sv_bless, in sv.c. */
+ stash = gv_stashpvn(self->_->class_name,
self->_->class_name_len,
TRUE);
+ inner_obj = newSV(0);
+ SvOBJECT_on(inner_obj);
+ PL_sv_objcount++;
+ SvUPGRADE(inner_obj, SVt_PVMG);
+ SvSTASH_set(inner_obj, (HV*)SvREFCNT_inc(stash));
+ sv_setiv(inner_obj, PTR2IV(self));
+ self->ref.native = inner_obj;
As the comment indicates, the code was derived from
Perl_sv_bless in
sv.c, which gets called via pp_bless in pp.c. It was a PITA
to figure
out what was needed and what wasn't, which is partly why I
hadn't
gotten to this before. Perl_sv_bless has barely changed
from 5.8.3 to
5.10.0, so I think that at least that code would work
smoothly across
all Perl versions supported by KS.
However, these are non-public API invocations:
* SvOBJECT_on is undocumented, but it just sets a flag.
* I have no idea what PL_sv_objcount is for, so I'm
cargo-culting
there.
We also have to modify kino_Doc_to_native, in
trunk/perl/xs/KinoSearch/
Doc.c:
+ HV *stash
+ = gv_stashpvn(self->_->class_name,
self->_->class_name_len,
true);
+ Gv_AMupdate(stash);
+
What that does is turn on overloading. I think it only
needs to be
called once per class, but we have to keep invoking it in
case of
subclassing.
Gv_AMupdate is undocumented. I extracted it from the Gv_AMG
macro,
also undocumented, used by sv_bless:
if (Gv_AMG(stash))
SvAMAGIC_on(sv);
else
(void)SvAMAGIC_off(sv);
Gv_AMG is defined in sv.h:
#define Gv_AMG(stash) (PL_amagic_generation &&
Gv_AMupdate(stash))
From my testing, it looks like as soon as there's any class
that
turns on overloading, PL_amagic_generation turns positive
and stays
positive, causing Gv_AMupdate(stash) to be invoked with each
call to
sv_bless().
I should also mention that KS currently uses the macro
SvAMAGIC_on in
Doc.c, even though it's undocumented. That was what it took
to get
overloading working for Doc under Perl 5.8.x.
> The solution you described seems like it would work
fine, but maybe
> the direct creation approach would sidestep the need
for it.
The complexity resides in the concept. We're only talking
about a few
lines of code, but writing them takes a fairly deep
understanding of
the Perl code base.
The disappointing thing is that after all the work that it
took to
disentangle sv_bless and suss out the direct creation
approach, I see
absolutely no difference in the benchmarks. Creating and
destroying
that extra RV just doesn't seem to matter.
So, we could commit the attached patch implementing direct
creation,
and the conceptual basis of kino_Obj_create would marginally
less
wacky, but to make that happen we have to use a bunch of
undocumented
Perl internal functions. I appreciate the nudge that it
took to get
me to fully explore this path, but it looks like a dead
end.
Marvin Humphrey
Rectangular Research
http://www.rectangular.co
m/
_______________________________________________
KinoSearch mailing list
KinoSearch rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch
|
|
|
| Re: Orderly global destruction |
  United States |
2008-02-21 15:26:47 |
On Feb 21, 2008, at 12:18 PM, Marvin Humphrey wrote:
> The disappointing thing is that after all the work that
it took to
> disentangle sv_bless and suss out the direct creation
approach, I
> see absolutely no difference in the benchmarks.
Creating and
> destroying that extra RV just doesn't seem to matter.
I was able to expose a difference by switching Token to use
KinoSearch::Obj as a base class rather than
KinoSearch::Util::FastObj. With that change, the indexing
benchmark
improved by around 5% using direct creation: 0.86 secs vs.
0.90 secs
for 1000 docs. So at least we know it makes a difference if
you're
creating huge numbers of objects.
However, with Token using FastObj, we're at 0.59 secs
already.
I actually think we can speed indexing up some more by
allocating
Tokens from a MemPool belonging to the TokenBatch. That
would cut
calls to malloc() way way down.
Marvin Humphrey
Rectangular Research
http://www.rectangular.co
m/
_______________________________________________
KinoSearch mailing list
KinoSearch rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch
|
|
| Re: Orderly global destruction |
  South Africa |
2008-02-22 02:35:43 |
> On Feb 21, 2008, at 12:18 PM, Marvin Humphrey wrote:
> With that change, the indexing benchmark
> improved by around 5% using direct creation: 0.86 secs
vs. 0.90 secs
> for 1000 docs. So at least we know it makes a
difference if you're
> creating huge numbers of objects.
>
> However, with Token using FastObj, we're at 0.59 secs
already.
I'm presuming here that "...indexing benchmark improved
by..." means
exactly that.
>From 0.90 to 0.59... that's what, a ~34% improvement?
Wow. From an end-user perspective (which is what I am), and
ignoring
inefficient wrapper code (which is what I have - sorry
Marvin, I know my
code irritates the crap out of you ;), that's a dramatic
improvement.
*Especially* when talking about indexing a TB or more.
Instead of several
weeks to index, this could cut it down to a couple of
weeks...
Good show, keep it up!
Regards
Henry
_______________________________________________
KinoSearch mailing list
KinoSearch rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch
|
|
| Re: Orderly global destruction |

|
2008-02-24 23:20:55 |
On 2/21/08, Marvin Humphrey <marvin rectangular.com> wrote:
> So, we could commit the attached patch implementing
direct creation,
> and the conceptual basis of kino_Obj_create would
marginally less
> wacky, but to make that happen we have to use a bunch
of undocumented
> Perl internal functions. I appreciate the nudge that
it took to get
> me to fully explore this path, but it looks like a
dead end.
Great work on figuring out all the details to make that
work!
I guess my instinct would be to apply the patch for the
slight benefit
in directness and performance, but I agree that the benefit
is
minimal. Still, I think it's a benefit. I guess my view
is that the
kino_Obj C code is working at just about the same level as
the Perl
internals, and thus has the right to do the job directly.
But I can see why you might prefer the better documented
approach.
Thanks for looking into it,
Nathan Kurz
nate verse.com
_______________________________________________
KinoSearch mailing list
KinoSearch rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch
|
|
| Re: Orderly global destruction |
  United States |
2008-02-27 01:50:06 |
On Feb 24, 2008, at 9:20 PM, Nathan Kurz wrote:
> I guess my instinct would be to apply the patch for the
slight benefit
> in directness and performance, but I agree that the
benefit is
> minimal. Still, I think it's a benefit.
OK, fair enough -- I'm swayed. It was tough to explain the
"destroy
the object in order to save it" code flow. This is
more coherent.
The XS isn't any harder to understand than the other XS
functions --
even the documented ones, you wind up having to look up all
the time.
GV_AMupdate etc. are now documented in some expanded
comments, and the
full rationale is preserved in these mailing list archives.
Committed as r3064.
Marvin Humphrey
Rectangular Research
http://www.rectangular.co
m/
_______________________________________________
KinoSearch mailing list
KinoSearch rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch
|
|
| Re: Orderly global destruction |
  United States |
2008-02-28 14:46:04 |
On Feb 24, 2008, at 9:20 PM, Nathan Kurz wrote:
>
>> I guess my instinct would be to apply the patch for
the slight
>> benefit
>> in directness and performance, but I agree that the
benefit is
>> minimal. Still, I think it's a benefit.
A footnote: direct creation had the side effect of wrecking
Doc's
overloading in Perl 5.10.
In 5.8, the SVf_AMAGIC flag is held by the outer ref, and in
5.10,
it's held by the inner object. However, turning it on is
accomplished
via the outer ref in both cases.
Usage:
SvAMAGIC_on(outer_ref);
Macro def in 5.8.8 (from sv.c):
#define SvAMAGIC_on(sv) (SvFLAGS(sv) |= SVf_AMAGIC)
Macro def in 5.10.0:
# define SvAMAGIC_on(sv) ({ SV * const kloink = sv;
assert(SvROK(kloink));
SvFLAGS(SvRV(kloink)) |= SVf_AMAGIC;
})
Under the previous "destroy the object in order to save
it" algo, the
act of creating the outer ref in 5.10 turned on the
SVf_AMAGIC flag on
the inner object. But with direct creating, that wasn't
happening
anymore.
The solution was to modify kino_Doc_to_native, which creates
an outer
ref, to always invoke SvAMAGIC_on(sv). Previously, it had
been set up
to only turn on for 5.8.x.
Marvin Humphrey
Rectangular Research
http://www.rectangular.co
m/
Modified: trunk/perl/xs/KinoSearch/Doc.c
============================================================
=======
--- trunk/perl/xs/KinoSearch/Doc.c 2008-02-28 14:05:42 UTC
(rev 3081)
+++ trunk/perl/xs/KinoSearch/Doc.c 2008-02-28 20:54:47 UTC
(rev 3082)
 -130,12
+130,8 
*/
stash = gv_stashpvn(self->_->class_name,
self->_->class_name_len,
true);
Gv_AMupdate(stash);
-
-#if PERL_VERSION < 9
- /* In perl 5.8, the OVERLOAD magic flag is carried in
the ref,
while in
- * 5.10, it is carried in the inner object. */
SvAMAGIC_on(perl_obj);
-#endif
+
return perl_obj;
}
_______________________________________________
KinoSearch mailing list
KinoSearch rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch
|
|
[1-9]
|
|