Hi Manish,
Thanks for the info.
Manish Singh wrote:
> On Thu, Sep 13, 2007 at 03:27:20PM -0400, Otis Gospodnetic wrote:
>> Hola,
>>
>> I'm one of the Lucene developers (n.b. Flock is using CLucene, the
>> C++ port of Lucene). Out of curiosity - have you considered using
>> Lucene (the original Java version) in Flock? I'm asking because
>> CLucene and other ports are always pretty far behind the Java version,
>> and over at Lucene Java we've made some major performance improvements
>> recently (plus a good number of new features). I imagine Flock would
>> benefit from using the most advanced version, but perhaps there are
>> technical reasons why the Java version cannot be used in Flock?
>
> Yeah, I see 3 problems:
>
> 1) No guaranteed pre-installed JRE on Windows or Linux. This either
> means the user must install a JRE beforehand, which is an additional
> barrier for new users, or we bundle a JRE, which means the download
> is much bigger, which is also a barrier for new users.
I've always wondered why this is still an argument. A 5MB or 20MB
download... does that still represent a problem, especially for the
type of people Flock is aimed at?
> 2) The Java XPCOM binding has never seen wide testing, since it's never
> been part of any default builds, and thus is still immature.
I see. That is what I thought... though I remember something from
Stefano Mazzocchi on that topic... google... aha:
http://www.betaversion.org/~stefano/linotype/news/89/
> 3) Java has poor memory profile characteristics. Beyond Java's general
> reputation of sucking up whatever memory it can grab, there's the
> specific problem that because it's GC'd, the GC might not get run fast
> enough to keep up with indexing operations (consider pages which
> refresh themselves often), which, to the user, looks like memory
> spiraling out of control, even though at some later point it'll
> actually get cleaned up. This isn't specific to Java; we pushed some
> code down from JavaScript to C++ for the same reason.
I'm not sure I'd agree 100%, but I'm old enough not to get into that
discussion. When you talk about indexing and frequently refreshing
pages... are you saying Flock periodically refetches and reindexes them?
I thought it indexed pages only as people visit them, no?
> Lucene was also slower performance-wise than CLucene, but I haven't
> actually revisited this in over a year, so I'm sure you guys have made
> some improvements since then. Do you have any recent benchmarks between
> the two?
I don't think there are any recent benchmarks. I didn't realize the
main concern is indexing performance (as opposed to search or features),
but I know some recent improvements made indexing 25% faster (background
threading and such).
Otis
--
Simpy -- http://www.simpy.com/ -- Tag. Search. Share.
_______________________________________________
Flockstars mailing list
Flockstars flock.com
https://lists.flock.com/mailman/listinfo/flockstars