On Mon, Apr 09, 2007 at 12:01:57PM +0100, Richard Boulton wrote:
> Olly Betts wrote:
> >Argh! That issue has bitten us at least once before. I'll audit all
> >uses of insert to see if there are any other instances, unless you
> >already have/are intending to.
>
> No, I've not done this and hadn't planned to do so, but it's an
> excellent plan.

I'm looking now. There are a couple in the queryparser which look
suspect, otherwise it's OK so far.

> I was thinking about working on the xapian-check to make it check
> document lengths in the postlists for consistency.

If you check a whole database, it already checks the doclength in the
postlist against that in the termlist, and that's checked against the
sum of the wdfs in the termlist.

If you mean when only checking a postlist it should sum wdf across
postlists to check the doclength, then the problem is that's going to be
expensive (it's essentially uninverting the inverted file) and I suspect
it would be quicker to just check the whole database (or add a "postlist
and termlist only" mode).
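
To illustrate why a postlist-only doclength check is expensive, here is a
minimal Python sketch over a toy in-memory index. The dict layout
(term -> {docid: wdf}) and the function names are assumptions made for
illustration; this is not xapian-check's code or Xapian's API. Note that
every posting of every term has to be visited before any single document
length is known.

```python
from collections import defaultdict


def doclengths_from_postlists(postlists):
    """Recompute each document's length by summing wdf across all postlists.

    `postlists` maps term -> {docid: wdf}. This is a toy stand-in for an
    inverted file, not Xapian's on-disk format.
    """
    lengths = defaultdict(int)
    for postings in postlists.values():        # every term in the database
        for docid, wdf in postings.items():    # every posting of that term
            lengths[docid] += wdf
    return dict(lengths)


def check_doclengths(postlists, stored_doclengths):
    """Return the docids whose stored length differs from the recomputed one."""
    recomputed = doclengths_from_postlists(postlists)
    docids = set(recomputed) | set(stored_doclengths)
    return sorted(
        d for d in docids
        if recomputed.get(d, 0) != stored_doclengths.get(d, 0)
    )


# Hypothetical data: document 2 has a deliberately wrong stored length.
postlists = {"apple": {1: 2, 2: 1}, "pear": {1: 1}, "plum": {2: 3}}
stored = {1: 3, 2: 5}
bad = check_doclengths(postlists, stored)
```

In this toy model the work is proportional to the total number of
postings, which is the "uninverting" cost described above.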

We don't currently consistency check the wdfs in the postlist against
those in the termlist, but we do sum them to check that they total to
the collection frequency.
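
The collection frequency check can be sketched in the same toy style. This
is an illustration only: the dicts and the function name are assumed, and
it is not how xapian-check is actually implemented.

```python
def check_collection_freqs(postlists, collection_freqs):
    """Return terms whose postlist wdfs do not sum to the stored collection frequency.

    `postlists` maps term -> {docid: wdf}; `collection_freqs` maps
    term -> stored collection frequency. Both layouts are hypothetical.
    """
    return sorted(
        term
        for term, postings in postlists.items()
        if sum(postings.values()) != collection_freqs.get(term)
    )


# Hypothetical data: "pear" has a deliberately wrong stored frequency.
postlists = {"apple": {1: 2, 2: 1}, "pear": {1: 1}}
collection_freqs = {"apple": 3, "pear": 2}
mismatched = check_collection_freqs(postlists, collection_freqs)
```

Unlike the doclength check, this one only needs a single pass over one
term's postlist at a time.
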
Cheers,
Olly