Thread: KinoSearch 0.20_01




KinoSearch 0.20_01
United States
2007-02-26 03:59:31
Greets,

I've uploaded 0.20_01 to both CPAN and
<http://www.rectangular.com/kinosearch/>, and I'd appreciate it if
people could give it a try.

This is an initial developer's release, and is not recommended for
production.

Change log below.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


0.20_01 2007-02-26

   KinoSearch 0.20 is a major rewrite, adding many new features.  It also
   breaks backwards compatibility in a number of ways.

   Two key features, UTF-8 support and custom sorting, were not possible to
   implement while preserving backwards compatibility.  Once the decision was
   made to proceed with them, breaking all existing installations, it made
   little sense to proceed by half measures, so the API has been given a
   significant overhaul.

   KinoSearch has always carried an "alpha code" warning; it is being invoked
   for this release.  While it will continue to carry the "alpha" warning for
   a short while longer, the point of jamming so many changes into one release
   is to cause disruption only once; once the code in 0.20 proves itself,
   hopefully no more backwards incompatible changes will be needed any time
   soon.

   New behaviors:

     * KinoSearch now uses UTF-8 for all input and output, throughout the
       entire library.  This affects many classes, but particularly those under
       Analysis, Highlight, and QueryParser.
     * The default scoring algorithm has changed subtly -- aggressive
       per-field boosting is no longer important or even desirable.  The old
       behavior is available from KinoSearch::Contrib::LongFieldSim.

   New public classes:

     * KinoSearch::Schema
     * KinoSearch::Schema::Field
     * KinoSearch::InvIndex
     * KinoSearch::Analysis::Token
     * KinoSearch::Search::RangeFilter
     * KinoSearch::Search::SortSpec
     * KinoSearch::Search::Similarity
     * KinoSearch::Contrib::LongFieldSim

   New documentation:

     * KinoSearch::Docs::NFS


   Removed classes:

     * KinoSearch::Document::Doc
     * KinoSearch::Document::Field
     * KinoSearch::Search::Hit

   Renamed classes:

     * KinoSearch::Store::InvIndex    => KinoSearch::Store::Folder
     * KinoSearch::Store::FSInvIndex  => KinoSearch::Store::FSFolder
     * KinoSearch::Store::RAMInvIndex => KinoSearch::Store::RAMFolder

   Updated documentation:

     * KinoSearch
     * KinoSearch::Docs::DevGuide
     * KinoSearch::Docs::FileFormat
     * KinoSearch::Docs::Tutorial

   Classes with API changes:

     * KinoSearch::InvIndexer
       o new() - Args changed.
         * create - Removed.
         * analyzer - Removed.
         * lock_id - Added.
       o spec_field() - Removed.
       o new_doc() - Removed.
       o add_doc() - Args changed.
         * Takes a hashref rather than a Doc object.
         * Accepts optional labeled param 'boost'.
       o delete_docs_by_term() - Removed.
       o delete_by_term() - Added.  (Behavior differs subtly from
         delete_docs_by_term()).

     * KinoSearch::Searcher
       o new() - Args changed.
         * analyzer - Removed.
       o search() - Now calls Hits->seek before returning Hits object.  Args
         changed.
         * offset - Added.
         * num_wanted - Added.
         * sort_spec - Added.

     * KinoSearch::Search::Hits
       o Now comes pre-seeked, courtesy of changes to Searcher.
       o seek() - No longer triggers new number crunching if requested values
         can be accommodated using results of prior search.
       o fetch_hit() - Removed.
       o create_excerpts() - Now puts multiple excerpts under $hit->
         rather than one under $hit->.

     * KinoSearch::Search::MultiSearcher
       o new() - Args changed.
         * schema - Added.
         * analyzer - Removed.

     * KinoSearch::Highlight::Highlighter
       o new() - Args changed.
         * fields - Added.
         * excerpt_length - Now specified in characters rather than bytes.
         * excerpt_field - Removed.
         * pre_tag - Removed.
         * post_tag - Removed.

     * KinoSearch::QueryParser::QueryParser
       o new() - Args changed.
         * schema - Added.
         * default_field - Removed.
         * analyzer - No longer required -- now used to override schema.

     * KinoSearch::Analysis::TokenBatch
       o new() - Args changed.
         * text - Added.
       o next() - Returns a Token instead of a boolean.
       o reset() - Added.
       o add_many_tokens() - Added.
       o set_text(), get_text(), set_start_offset(), get_start_offset(),
         set_end_offset(), get_end_offset(), set_pos_inc(), get_pos_inc() - All
         removed.
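
   To illustrate the Highlighter excerpt_length change listed above: once
   text is UTF-8, a length counted in characters and a length counted in
   bytes diverge as soon as non-ASCII text appears.  The following is a
   minimal Python sketch of that distinction only.  It does not use
   KinoSearch, and the helper names truncate_chars and truncate_bytes are
   hypothetical, chosen for this example.

```python
def truncate_chars(text: str, limit: int) -> str:
    """Keep at most `limit` characters (Unicode code points)."""
    return text[:limit]


def truncate_bytes(text: str, limit: int) -> str:
    """Keep at most `limit` UTF-8 bytes, dropping any partial trailing character."""
    raw = text.encode("utf-8")[:limit]
    # errors="ignore" discards an incomplete multi-byte sequence left by the cut.
    return raw.decode("utf-8", errors="ignore")


sample = "naïve café search"

# Character count and UTF-8 byte count differ once accented letters appear.
print(len(sample), len(sample.encode("utf-8")))

# The same numeric limit yields different excerpts under the two conventions.
print(truncate_chars(sample, 10))
print(truncate_bytes(sample, 10))
```

   A character-based limit gives excerpts of predictable visible length
   regardless of script, which is presumably the motivation for the change.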

   Internal changes:

     Large-scale refactoring has taken place.  The most significant
     changes are...

     * OO framework imposed on C code via boilerplater.pl, with
       KinoSearch::Util::Obj as the base class.
     * Charmonizer added.
     * perlapi functions and data structures replaced whenever possible.
     * Lots of classes, especially under KinoSearch::Index, reorganized around
       Schema and SegInfo.
     * Many tests added, removed, or revised to accommodate changes in the main
       library code.
     * C code moved to dedicated files.
     * Build.PL custom code moved to buildlib/KinoSearchBuild.pm

   File Format:

     * Significantly redesigned.  The most visible change is that the segments
       file is now encoded using YAML rather than an arbitrary binary format.
     * Old indexes cannot be read and must be regenerated.

   Locking:

     * write.lock files now located in the index directory rather than
       under /tmp.
     * Commit locks are no longer needed due to file format changes.
     * Stale write locks are now removed without warning.





_______________________________________________
KinoSearch mailing list
KinoSearch@rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch


Re: KinoSearch 0.20_01
United States
2007-02-26 09:13:51
Marvin Humphrey wrote:
>    Two key features, UTF-8 support and custom sorting, were not
                                         ^^^^^^^^^^^^^^

Beautiful!  I will surely try that out.

   - Dmitri.




Re: KinoSearch 0.20_01
United States
2007-02-26 10:14:27

Thank you!  I'll be installing this on all our development
hosts tomorrow!

-Miles
__________________________________
Miles Crawford, Software Developer
Catalyst Research & Development
Office of Learning & Scholarly Technologies
University of Washington
206.616.3406

http://catalyst.washington.edu
http://solstice.eplt.washington.edu


On Mon, 26 Feb 2007, Marvin Humphrey wrote:

> Greets,
>
> I've uploaded 0.20_01 to both CPAN and
> <http://www.rectangular.com/kinosearch/>, and I'd appreciate it if
> people could give it a try.



Re: KinoSearch 0.20_01
South Africa
2007-02-28 13:38:36

> Greets,
>
> I've uploaded 0.20_01 to both CPAN and
> <http://www.rectangular.com/kinosearch/>, and I'd appreciate it if
> people could give it a try.
>
> This is an initial developer's release, and is not recommended for
> production.
>
> Change log below.

Wow!  I just had a chance to scan the release email.  Well done Marvin -
this is a major release representing a lot of work.

Your efforts are appreciated.

Regards
Henry

