
Thread: Theoretical question




user name
2006-01-17 04:50:11
I've been reading the docs on the internal construction of Xapian. There's
discussion of autopruning and operator decay in the Matching section.

Elsewhere, though, it says that postings lists are stored in doc_id order,
instead of wdf order, which suggests that there could be high-ranking
documents at the end of a postings list.

How can autoprune and operator decay really have much effect, then? You
would almost always have to go to the end of every list.

Example: let's say we have 1000 documents, and we need to return the top 10
for a single-word query. If relevance is unrelated to doc_id, the top 10
will be scattered roughly uniformly across a postings list sorted in doc_id
order, which means that at least one of them will commonly be found 90% or
95% of the way into the list.
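
To put a rough number on that, here is a minimal sketch in Python. It is not
Xapian code; it only models the assumption above, that the top 10 sit at
positions drawn uniformly at random (without replacement) from a list of 1000
postings. The function names are made up for illustration. The closed form
k(N+1)/(k+1) is the standard expected maximum of k distinct draws from 1..N,
and the simulation is just a sanity check on it.

```python
import random


def expected_deepest_position(num_docs: int, top_k: int) -> float:
    """Expected position of the deepest of top_k documents.

    Assumes the top_k documents occupy positions chosen uniformly at
    random, without replacement, from 1..num_docs (an assumption of this
    sketch, not something the Xapian docs state).
    """
    return top_k * (num_docs + 1) / (top_k + 1)


def simulate_deepest_position(num_docs: int, top_k: int,
                              trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    positions = range(1, num_docs + 1)
    total = 0
    for _ in range(trials):
        # Place the top_k documents at random and record the deepest one.
        total += max(rng.sample(positions, top_k))
    return total / trials


if __name__ == "__main__":
    exact = expected_deepest_position(1000, 10)
    approx = simulate_deepest_position(1000, 10)
    print(f"closed form: {exact:.1f} of 1000")
    print(f"simulated:   {approx:.1f} of 1000")
```

Under that assumption the deepest of the top 10 sits, on average, at position
910 of 1000, i.e. about 91% of the way into the list, which matches the
90% or 95% figure above.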




_______________________________________________
Xapian-discuss mailing list
Xapian-discuss@lists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss