Thread: Question on design of a long-running BSD-based reader/writer class




2007-01-05 14:14:08
Hi Andi

Thanks for the answer.

| While there is a certain overhead with transactions and opening and closing
| an index for every addition, I did notice that there was a fair amount of
| thrashing around in the Lucene directory I/O and got things to be
| considerably faster by batching all updates and doing them in a
| RAMDirectory before adding the RAMDirectory contents to the DBDirectory via
| the addIndexes API.

I've thought about this (after reading the suggestion in Lucene in Action).
I considered having an open RAMDirectory that is always being written to
and which is merged into a FSDirectory whenever a search takes place. That
would be ok for some cases, but not in general. Also, buffering approaches
using RAMDirectory seem not to support transactions - at least not at the
level of single additions to the RAMDirectory. That's something of a
problem for me, but adding some sort of transaction mechanism might work.

Terry
_______________________________________________
pylucene-dev mailing list
pylucene-dev@osafoundation.org
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
2007-01-05 18:35:09
On Fri, 5 Jan 2007, Terry Jones wrote:

> | While there is a certain overhead with transactions and opening and closing
> | an index for every addition, I did notice that there was a fair amount of
> | thrashing around in the Lucene directory I/O and got things to be
> | considerably faster by batching all updates and doing them in a
> | RAMDirectory before adding the RAMDirectory contents to the DBDirectory via
> | the addIndexes API.
>
> I've thought about this (after reading the suggestion in Lucene in Action).
> I considered having an open RAMDirectory that is always being written to
> and which is merged into a FSDirectory whenever a search takes place. That
> would be ok for some cases, but not in general. Also, buffering approaches
> using RAMDirectory seem not to support transactions - at least not at the
> level of single additions to the RAMDirectory. That's something of a
> problem for me, but adding some sort of transaction mechanism might work.

The transaction is used with DBDirectory. Use a RAMDirectory to batch all
changes for a given thread and once done, merge the RAMDirectory into a
DBDirectory within a transaction.

Andi..
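
The batch-then-merge pattern described above can be illustrated with a
minimal sketch. It does not use PyLucene: as a stand-in, a plain Python list
plays the role of the per-thread RAMDirectory and an sqlite3 database plays
the role of the transactional DBDirectory. The class name BatchedWriter, the
table name docs, and the methods add and flush are all hypothetical and are
chosen only for illustration. The point is that individual additions touch
memory only, while the merge happens inside a single transaction that either
commits the whole batch or rolls it back.

```python
import sqlite3


class BatchedWriter:
    """Buffer additions in memory, then merge them in one transaction.

    Hypothetical stand-in: the list mimics a per-thread RAMDirectory and
    the sqlite3 connection mimics a transactional store like DBDirectory.
    """

    def __init__(self, conn):
        self.conn = conn
        self.pending = []  # in-memory batch, not yet durable
        with self.conn:
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS docs "
                "(id TEXT PRIMARY KEY, body TEXT)"
            )

    def add(self, doc_id, body):
        # Cheap: no transaction and no store I/O per addition.
        self.pending.append((doc_id, body))

    def flush(self):
        # The connection context manager commits if the block succeeds
        # and rolls back if any insert raises, so the batch is merged
        # all-or-nothing.
        with self.conn:
            self.conn.executemany(
                "INSERT INTO docs (id, body) VALUES (?, ?)", self.pending
            )
        merged = len(self.pending)
        self.pending.clear()  # only reached after a successful commit
        return merged


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    writer = BatchedWriter(conn)
    writer.add("a", "first document")
    writer.add("b", "second document")
    print(writer.flush(), "documents merged")
```

Note that this sketch has the same limitation Terry raises: a single add is
not transactional on its own, and anything still in the pending list is lost
if the process dies before flush is called.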