List Info

Thread: Re: Document clustering module?




Re: Document clustering module?
country flaguser name
United Kingdom
2007-09-18 14:46:30
On Tue, Sep 18, 2007 at 12:46:30PM +0800, ??? ????????? ???
(Yung-chung Lin) wrote:
> I have made a new class for calculating document
similarity. Please
> review it.

Generally looks good - a few comments below.

> Maybe the class should be an internal one, since this
will
> only be called by Xapian::Cluster in my plan.

I think a similarity measure probably has other uses
though.

Also, users might want to implement their own measure (like
they can do
with Xapian::Weight, etc) and let Xapian cluster using it.

> +class Xapian:ocSim::I
nternal : public Xapian::Internal::RefCntBase {
> +  public:
> +    Xapian:atabase
db;
> +};

If we want to allow user subclassing, we can't use
RefCntBase and
RefCntPtr.  Xapian::Weight is probably a good model, except
this
probably doesn't need to be serialisable.

> +Xapian:ocSim:ocSim(Xa
pian:ocSim::I
nternal * internal_)
> +{
> +    internal = internal_;
> +}
> +
> +Xapian:ocSim:ocSim()
> +        : internal(new Xapian:ocSim::I
nternal())
> +{
> +}
> +
> +Xapian:ocSim:ocSim(co
nst Xapian:ocSim
& other)
> +        : internal(other.internal)
> +{
> +}

Generally we should probably put trivial constructors like
these in the
headers so that then can be inlined.  Sometimes that means
other headers
need to be pulled in, which is one reason not to do this.

Similarly, trivial non-virtual methods (especially setters
and getters)
should probably go in the header.

> +libxapian_la_SOURCES += 
> +	docsim/docsim.cc docsim/docsim_cosine.cc

Use a vertical list here, so the "docsim/" bits
all line up.

> --- ../xapian-core-1.0.2/tests/docsimtest.cc	1970-01-01
08:00:00.000000000 +0800
> +++ tests/docsimtest.cc	2007-09-18 12:20:22.000000000
+0800

I think these tests should probably go in apitest, so we can
run them on
all backends (in particular, testing with the remote backend
would be
useful).

> +    /// Make a new empty DocSim
> +    DocSimCosine() { };

There's a bogus trailing ";" here.

> +    /// Destructor
> +    ~DocSimCosine() { };

And here.

Also, virtual functions generally shouldn't be declared
inline in the
header as they can't generally be inlined.

Cheers,
    Olly

_______________________________________________
Xapian-devel mailing list
Xapian-devellists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-devel

[1]

about | contact  Other archives ( Real Estate discussion Medical topics )