>>Could someone give me a clue as to why the test case
TestRemoteCachingWrapperFilter fails with the patch applied?
Regardless of the reasons for this particular test failure,
this code is not safe in other ways which the test cases
don't test for.
To restate the issue: Matcher is not designed to be
threadsafe and CachingWrapperFilter (or any other example of
existing caching strategies) cannot therefore simply be
changed to cache Matchers in place of the existing scheme of
caching bitsets (which are currently used in a thread-safe
manner by all Lucene code). Bitsets don't offer the notion
of a cursor (required for "next" methods), while Matcher
does, which spoils its potential for reuse/shared
use. The remoting test code you refer to uses your modified
CachingWrapperFilter which has swapped Matchers for BitSets
and so I would anticipate thread safety issues if the tests
actually tried to share/reuse the same Matcher.
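To make the cursor problem concrete, here is a minimal sketch. CursorMatcher is a hypothetical stand-in I made up for illustration, not the Matcher class from the patch; the only assumption is that the cursor lives inside the cached object. The interleaving of two callers is simulated in a single thread so the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for a Matcher-style object: the cursor is internal state.
final class CursorMatcher {
    private final int[] docs; // sorted matching doc ids
    private int pos = -1;     // mutable cursor, shared by every caller of this instance

    CursorMatcher(int[] docs) { this.docs = docs; }

    boolean next() { return ++pos < docs.length; }
    int doc() { return docs[pos]; }
}

final class SharedCursorDemo {
    // Two callers alternately pull from ONE cached instance, as two searches
    // sharing a cache entry might. Each caller only sees part of the matches.
    static List<List<Integer>> shareOneMatcher(int[] docs) {
        CursorMatcher shared = new CursorMatcher(docs);
        List<Integer> callerA = new ArrayList<Integer>();
        List<Integer> callerB = new ArrayList<Integer>();
        while (true) {
            if (!shared.next()) break;
            callerA.add(shared.doc());
            if (!shared.next()) break;
            callerB.add(shared.doc());
        }
        return Arrays.asList(callerA, callerB);
    }

    // With a BitSet the cursor is a local variable of the caller, so a cached
    // BitSet that is only read can be handed to any number of callers.
    static List<Integer> readBitSet(BitSet bits) {
        List<Integer> out = new ArrayList<Integer>();
        for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
            out.add(i);
        }
        return out;
    }
}
```

With real threads the failure would be non-deterministic rather than a neat split, but the cause is the same: the position belongs to the cached object instead of to the caller.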
>>Finally, are DocIdSet and DocIdSetIterator currently
part of Lucene? I don't know how to go about these.
These are two of the names I gave to a notional set of 3
services that I outlined here:
https://issues.apache.org/jira/browse/LUCENE-584#action_12518642
I introduced this terminology to the discussion because:
1) It describes 2 services that are currently combined in
Matcher that I feel need to be separated
2) It uses a more generic description of the services
offered that can be useful when considering other
applications of the services (e.g. category count and
filtering logic both can use cached sets of doc IDs.
DocIdSet seemed to describe the service more generically
than "Matcher")
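A rough sketch of the separation I have in mind for these two services follows. The method names (iterator, next, doc) and the BitSetDocIdSet class are assumptions made for illustration only; nothing here is existing Lucene API or part of the patch.

```java
import java.util.BitSet;

// Notional service 1: an immutable set of doc ids, safe to cache and share.
interface DocIdSet {
    DocIdSetIterator iterator(); // hands out a fresh cursor on every call
}

// Notional service 2: a single-use cursor, owned by exactly one caller.
interface DocIdSetIterator {
    boolean next(); // advance; false once the set is exhausted
    int doc();      // current doc id, valid only after next() returned true
}

// Hypothetical BitSet-backed implementation of the cacheable half.
final class BitSetDocIdSet implements DocIdSet {
    private final BitSet bits;

    BitSetDocIdSet(BitSet bits) {
        this.bits = (BitSet) bits.clone(); // defensive copy, never mutated afterwards
    }

    public DocIdSetIterator iterator() {
        return new DocIdSetIterator() {
            private int doc = -1;
            private boolean done = false;

            public boolean next() {
                if (done) return false;
                doc = bits.nextSetBit(doc + 1);
                done = doc < 0;
                return !done;
            }

            public int doc() { return doc; }
        };
    }
}
```

A caching filter would then hold on to the DocIdSet and each search would ask it for its own DocIdSetIterator, so no cursor state is ever shared.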
I'm happy to drop use of these terms from this discussion if
you feel they are not useful.
Cheers
Mark
----- Original Message ----
From: Paul Elschot <paul.elschot xs4all.nl>
To: java-dev lucene.apache.org
Sent: Friday, 10 August, 2007 8:45:09 AM
Subject: Fwd: Decouple Filter from BitSet: API change and
xml query parser
Taking this to java-dev only.
As I said at the jira issue, I'd like to have all test cases pass again,
and I'm not happy with the current version of the patch to the xml query
parser either.
Some test cases currently fail, maybe because they use RMI and the
new version of Filter does not serialize well because the result of
getMatcher() is not serializable.
It should be possible to fix this by moving Filter to BitSetFilter in these
cases, see also below.
The problem is that I don't know how to do this because I have never
used java RMI myself.
Could someone give me a clue as to why the test case
TestRemoteCachingWrapperFilter fails with the patch applied?
As for the API change, how to move from the current:
  public abstract class Filter {
    public abstract BitSet bits(IndexReader reader);
  }
to:
  public abstract class Filter {
    public abstract Matcher getMatcher(IndexReader reader);
  }
The patch proposes to do this by moving all current use of Filter to
BitSetFilter:
  public abstract class BitSetFilter extends Filter {
    public abstract BitSet bits(IndexReader reader);
  }
Would it be good to have an intermediate version of Filter like this
one:
  public abstract class Filter {
    /** @deprecated use class BitSetFilter instead */
    public BitSet bits(IndexReader reader) { return null; }
    public abstract Matcher getMatcher(IndexReader reader);
  }
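One way the intermediate step could fit together is sketched below, only to make the question concrete. IndexReader and Matcher are empty placeholder types so that the sketch compiles on its own; the next()/doc() shape of Matcher and the default getMatcher() in BitSetFilter are assumptions, not what the patch does.

```java
import java.util.BitSet;

// Placeholder types for illustration; not the real Lucene or patch classes.
interface IndexReader {}
interface Matcher {
    boolean next();
    int doc();
}

abstract class Filter {
    /** @deprecated use class BitSetFilter instead */
    @Deprecated
    public BitSet bits(IndexReader reader) { return null; }

    public abstract Matcher getMatcher(IndexReader reader);
}

// Existing BitSet-based filters would extend this class and keep implementing bits().
abstract class BitSetFilter extends Filter {
    @Override
    public abstract BitSet bits(IndexReader reader);

    // Assumed bridge: wrap the BitSet in a fresh Matcher for each call.
    @Override
    public Matcher getMatcher(IndexReader reader) {
        final BitSet bits = bits(reader);
        return new Matcher() {
            private int doc = -1;
            private boolean done = false;

            public boolean next() {
                if (done) return false;
                doc = bits.nextSetBit(doc + 1);
                done = doc < 0;
                return !done;
            }

            public int doc() { return doc; }
        };
    }
}
```

In this arrangement old callers of bits() keep working on BitSetFilter subclasses, while new code only needs getMatcher().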
Finally, are DocIdSet and DocIdSetIterator currently part of Lucene?
I don't know how to go about these.
Regards,
Paul Elschot
---------- Forwarded Message ----------
Subject: [jira] Commented: (LUCENE-584) Decouple Filter from
BitSet
Date: Friday 10 August 2007 01:15
From: "Mark Harwood (JIRA)" <jira apache.org>
To: java-dev lucene.apache.org
[ https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518868 ]
Mark Harwood commented on LUCENE-584:
-------------------------------------
OK, I appreciate caching may not be a top priority in this proposal but I
have live systems in production which use XMLQueryParser and the existing
core facilities for caching. As it stands this proposal breaks this
functionality (see "FIXME" in contrib's CachedFilterBuilder and my concerns
over use of the non-thread-safe Matcher in the core class
CachingWrapperFilter).
I am obviously concerned by this and keen to help shape a solution which
preserves the existing capabilities while adding your new functionality. I'm
not sure I share your view that support for caching can be treated as a
separate issue to be dealt with at a later date. There are a large number of
changes proposed in this patch and if the design does not at least consider
future caching issues now, I suspect much will have to be reworked later. The
change I can envisage most clearly is expressed in my concern that the
DocIdSet and DocIdSetIterator services I outlined are being combined in
Matcher as it stands now and these functions will have to be separated.
Cheers
Mark
> Decouple Filter from BitSet
> ---------------------------
>
> Key: LUCENE-584
> URL: https://issues.apache.org/jira/browse/LUCENE-584
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Affects Versions: 2.0.1
> Reporter: Peter Schäfer
> Priority: Minor
> Attachments: bench-diff.txt, bench-diff.txt,
>   Matcher1-ground-20070730.patch,
>   Matcher2-default-20070730.patch,
>   Matcher3-core-20070730.patch,
>   Matcher4-contrib-misc-20070730.patch,
>   Matcher5-contrib-queries-20070730.patch,
>   Matcher6-contrib-xml-20070730.patch,
>   Some Matchers.zip
>
>
>
> package org.apache.lucene.search;
> public abstract class Filter implements java.io.Serializable
> {
>   public abstract AbstractBitSet bits(IndexReader reader) throws IOException;
> }
> public interface AbstractBitSet
> {
> public boolean get(int index);
> }
>
> It would be useful if the method =Filter.bits()= returned an abstract
> interface, instead of =java.util.BitSet=.
> Use case: there is a very large index, and, depending on the user's
> privileges, only a small portion of the index is actually visible.
> Sparsely populated =java.util.BitSet=s are not efficient and waste lots of
> memory. It would be desirable to have an alternative BitSet implementation
> with smaller memory footprint.
> Though it _is_ possible to derive classes from =java.util.BitSet=, it was
> obviously not designed for that purpose.
> That's why I propose to use an interface instead. The default implementation
> could still delegate to =java.util.BitSet=.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe lucene.apache.org
For additional commands, e-mail: java-dev-help lucene.apache.org
-------------------------------------------------------