Thread: Re:




2007-05-10 23:23:43
On Thu, May 10, 2007 at 05:53:19PM -0700, Kevin Duraj wrote:
> I want the top speed during indexing and searches, and I do not care about
> smallest database. I think most of users feel the same. If "gzip -9" makes
> the indexing slightly slower, remove it. *smile*

The thing is that smaller is often faster.  Once I/O becomes the
limiting factor, compression will speed things up.  CPU speeds have
increased faster than storage speeds over time, so this is likely to
be more true than it ever was!

It's probably not worth trying to squeeze out every single last byte,
but without benchmarking different compression levels and thresholds
for when to try compressing, we have little idea what a good setting
actually is.
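
As a rough illustration of the kind of benchmark meant here, the sketch
below times a few zlib compression levels on a sample buffer and skips
compression for inputs under a size threshold. It is not Xapian code:
the function name benchmark_levels, the min_size parameter and the
sample data are all assumptions made up for this example, and Python's
standard zlib module stands in for whatever compressor the backend
actually uses.

```python
import time
import zlib


def benchmark_levels(data, levels=(1, 6, 9), min_size=0):
    """Return {level: (stored_size, seconds)} for each zlib level.

    Hypothetical helper: inputs shorter than min_size bytes are stored
    uncompressed, which models a "threshold for when to try compressing".
    """
    results = {}
    for level in levels:
        start = time.perf_counter()
        if len(data) >= min_size:
            out = zlib.compress(data, level)
            # Keep the compressed form only if it is actually smaller.
            if len(out) >= len(data):
                out = data
        else:
            out = data
        elapsed = time.perf_counter() - start
        results[level] = (len(out), elapsed)
    return results


if __name__ == "__main__":
    # Made-up, highly repetitive sample; real index data will behave differently.
    sample = b"xapian posting list term data " * 4096
    for level, (size, secs) in benchmark_levels(sample, min_size=64).items():
        print(f"level {level}: {len(sample)} -> {size} bytes in {secs * 1e3:.3f} ms")
```

Timings from a toy buffer like this say little on their own; the point
is only that level and threshold can be varied independently and
measured against representative data.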

Cheers,
    Olly

_______________________________________________
Xapian-discuss mailing list
Xapian-discuss@lists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

2007-05-11 05:55:00
On Fri, May 11, 2007 at 05:23:43AM +0100, Olly Betts wrote:

> > I want the top speed during indexing and searches, and I do not care about
> > smallest database. I think most of users feel the same. If "gzip -9" makes
> > the indexing slightly slower, remove it. *smile*
>
> The thing is that smaller is often faster.  Once I/O becomes the
> limiting factor, compression will speed things up.  CPU speeds have
> increased faster than storage speeds over time, so this is likely to
> be more true than it ever was!

This is hugely important, and is something that a lot of people
miss. It doesn't make a huge amount of difference when you're dealing
with small data sets (say, less than half the size of core), but then
the delta cost should be fairly minimal. Once you get into moderately
large data sets (say two to four times core), you're going to start
hurting very badly if you're wasting time transferring data
suboptimally (*). Even if you can stack enough disks to get maximum
fibre speed, you're still only managing a few gig per second; given
your core will be a minimum of 8G these days, cutting down your
storage size becomes really important. (And that's assuming that only
one machine has access to the fabric, when it's more likely to be
shared...)
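
A back-of-envelope model of that trade-off might look like the sketch
below. Everything numeric in it is an assumption chosen for
illustration rather than a measurement: a 32 GB data set (four times an
8G core), 0.5 GB/s of usable bandwidth on a shared fabric, a 2:1
compression ratio and a 2 GB/s decompression rate. The function name
read_seconds is likewise made up, and the model pessimistically assumes
transfer and decompression do not overlap.

```python
def read_seconds(data_gb, io_gb_per_s, ratio=1.0, decompress_gb_per_s=None):
    """Estimate seconds needed to get data_gb of usable data off storage.

    ratio is compressed size / original size (1.0 means uncompressed).
    decompress_gb_per_s is the rate at which decompressed output is
    produced; None means the data is stored uncompressed.
    """
    # Time spent moving the (possibly compressed) bytes over the wire.
    transfer = data_gb * ratio / io_gb_per_s
    if decompress_gb_per_s is None:
        return transfer
    # Pessimistic assumption: decompression happens after transfer,
    # with no overlap, so the two costs simply add.
    return transfer + data_gb / decompress_gb_per_s


if __name__ == "__main__":
    # All figures here are illustrative assumptions, not measurements.
    plain = read_seconds(32, 0.5)
    packed = read_seconds(32, 0.5, ratio=0.5, decompress_gb_per_s=2.0)
    print(f"uncompressed: {plain:.1f} s, compressed: {packed:.1f} s")
```

Under this model compression only pays off when decompression is
faster than io_gb_per_s / (1 - ratio), which is why a slow or shared
fabric tips the balance towards smaller storage.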

David Braben has an interesting graph that backs this up (admittedly
from the point of view of consoles). It's *more* important to get
decent compression on your data than it was in the days of Elite and
Exile!

(*) I have a tiresome anecdote about inefficient data transfer over
NFSv3 versus NFSv4 bringing our data centre to a standstill.

J

-- 
/--------------------------------------------------------------------------
  James Aylett                                                  xapian.org
  james@tartarus.org                               uncertaintydivision.org

