
Thread: only update certain document fields




2006-11-22 08:54:06
Hi Marvin,

I'm currently building a search system that creates KinoSearch indexes
fed by fields from SQL tables. The configuration lets me select certain
fields and compose them into one or more KinoSearch fields. This means
I have a layer in my application that detects which fields are updated,
and if one of those fields is configured to be part of a KinoSearch
field, I have to delete and re-add the updated document.

The problem is: my database layer only updates the needed fields in the
tables (by UPDATE SET ...), but when I want to re-index the KinoSearch
document I have to select all the SQL fields that the configuration
lists for that table. What I'd like to do is update only the changed
field in KinoSearch. Example:

I have a persons table in SQL like this:

Surname  Prename  Address  ZIP  City  Country  Phone
=======  =======  =======  ===  ====  =======  =====
Mueller  Arnold   A-Street 123  Some  Here     123456

and in KinoSearch this is indexed like this:

Field 'Name': 'Mueller Arnold'
Field 'Address': 'A-Street, 123 Some, Here'
Field 'Phone': '123456'

When I now update the 'Address' field in the table, I always have to
select all fields, rebuild the KinoSearch fields and re-add the
document.

Would it be possible to only select the SQL-Fields 'Address, ZIP, City,
Country' and update only the 'Address' Field in KinoSearch? If so, how
can I do this? Can I update the document somehow, or is it possible to
read the already indexed parts, update them with the modified data and
then delete and re-add the document (but how do I preserve vector data,
boost and things like this then?)

Best regards,

Marc


_______________________________________________
KinoSearch mailing list
KinoSearch@rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch

2006-11-24 13:39:09
On Nov 22, 2006, at 12:54 AM, Marc Elser wrote:
> Would it be possible to only select the SQL-Fields 'Address, ZIP,
> City, Country' and update only the 'Address' Field in KinoSearch?

It's not possible to update anything in KS/Lucene.  You can only
delete/add.  Aside from the deletions files, which are per-segment
bitmaps with one bit per document, no segment files are ever modified
once written.
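
To illustrate the write-once idea, here is a minimal Python sketch. It
is not KinoSearch code: the Segment class and the update() helper are
hypothetical names made up for this example, and real segments hold far
more than a tuple of documents. It only shows that the sole mutable
piece is a bitmap with one bit per document, and that an "update" is a
delete followed by an add into a new segment.

```python
class Segment:
    """Toy write-once segment: documents are frozen, only the deletion bitmap changes."""

    def __init__(self, docs):
        self._docs = tuple(docs)  # never modified after construction
        # One bit per document, rounded up to whole bytes.
        self.deleted = bytearray((len(self._docs) + 7) // 8)

    def delete(self, doc_num):
        self.deleted[doc_num >> 3] |= 1 << (doc_num & 7)

    def is_deleted(self, doc_num):
        return bool(self.deleted[doc_num >> 3] & (1 << (doc_num & 7)))

    def live_docs(self):
        return [d for i, d in enumerate(self._docs) if not self.is_deleted(i)]


def update(segments, seg_index, doc_num, new_doc):
    """'Update' = flag the old copy as deleted, then add the new copy in a fresh segment."""
    segments[seg_index].delete(doc_num)
    segments.append(Segment([new_doc]))


segments = [Segment([{"Name": "Mueller Arnold", "Address": "A-Street, 123 Some, Here"}])]
update(segments, 0, 0, {"Name": "Mueller Arnold", "Address": "B-Street, 456 Other, There"})
live = [d for s in segments for d in s.live_docs()]
```

After the call, the old copy is still physically present in the first
segment; it is merely flagged as deleted, and searches would see only
the copy in the new segment.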

The serialized, stored documents are housed in a single file, the .fdt
file.  File pointers to individual documents are housed in the .fdx
file, which is a pile of 64-bit integers.  You can't go into the .fdt
file and modify it.  It's just not designed for anything other than
recovery.
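
A rough Python sketch of that two-file layout follows. Only the "data
file plus a file of 64-bit pointers" shape comes from the description
above; the JSON serialization, the big-endian byte order and the
function names are assumptions chosen for illustration and do not
reflect the real on-disk format.

```python
import io
import json
import struct


def write_stored_docs(docs):
    """Return (fdt_bytes, fdx_bytes): serialized docs plus one 64-bit offset per doc."""
    fdt, fdx = io.BytesIO(), io.BytesIO()
    for doc in docs:
        fdx.write(struct.pack(">Q", fdt.tell()))    # where this doc starts in the data file
        fdt.write(json.dumps(doc).encode("utf-8"))  # hypothetical serialization
    return fdt.getvalue(), fdx.getvalue()


def read_stored_doc(fdt, fdx, doc_num):
    """Fetch one document by number using the fixed-width pointer file."""
    num_docs = len(fdx) // 8
    (start,) = struct.unpack_from(">Q", fdx, doc_num * 8)
    if doc_num + 1 < num_docs:
        (end,) = struct.unpack_from(">Q", fdx, (doc_num + 1) * 8)
    else:
        end = len(fdt)  # last doc runs to the end of the data file
    return json.loads(fdt[start:end].decode("utf-8"))


fdt, fdx = write_stored_docs([{"Phone": "123456"}, {"Phone": "654321"}])
second = read_stored_doc(fdt, fdx, 1)
```

Because documents are packed back to back, changing the length of one
stored document would shift every later document and invalidate every
later pointer, which is one reason in-place edits are impractical in a
layout like this.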

But let's say you could.  There's a more serious problem.  The index
files in each segment are created by tearing each document in the
collection apart, then pooling the fragments and reassembling them in
sorted order which can be searched.  Say you modified the content of
document 1274 in a 10000-document segment.  Now you have little bits
and pieces that are wrong scattered throughout these index structures,
with no good way to go in and modify them.

Think of the index at the end of a book.  If you change the contents
of a chapter, then a few of the entries on every index page now need
to be modified.  There's no good way to handle that other than to
regenerate the index.
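
The scattering can be seen in a toy inverted index. The Python sketch
below is a deliberate simplification (whitespace tokenization, document
numbers only, no positions or boosts), and build_inverted_index is a
hypothetical helper, not a KinoSearch function.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Tear documents into terms, pool them, and return term -> doc numbers, sorted by term."""
    postings = defaultdict(list)
    for doc_num, text in enumerate(docs):
        for term in set(text.lower().split()):
            postings[term].append(doc_num)
    return dict(sorted(postings.items()))


docs = [
    "A-Street 123 Some Here",
    "B-Street 456 Some There",
    "A-Street 789 Other Here",
]
index = build_inverted_index(docs)

# Every posting list that mentions document 0 would need surgery
# if that document's text changed.
touched = [term for term, doc_nums in index.items() if 0 in doc_nums]
```

Here a single four-word document already appears in four separate
posting lists, interleaved with entries for other documents.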

> If so, how can I do this? Can I update the document somehow, or is
> it possible to read the already indexed parts, update them with the
> modified data and then delete and re-add the document (but how do I
> preserve vector data, boost and things like this then?)

Deleting and re-adding is the only solution.
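
For the persons example, that workflow might look like the Python
sketch below. ToyIndex is a hypothetical stand-in for a search index
that supports only delete and add; it is not the KinoSearch API. The
id column, the in-memory SQLite database and the helper names are
likewise assumptions made for this example.

```python
import sqlite3


def compose_fields(row):
    """Build the three search fields from one full row of the persons table."""
    surname, prename, address, zip_code, city, country, phone = row
    return {
        "Name": f"{surname} {prename}",
        "Address": f"{address}, {zip_code} {city}, {country}",
        "Phone": phone,
    }


class ToyIndex:
    """Hypothetical index stand-in: documents can be deleted and added, never edited."""

    def __init__(self):
        self.docs = {}

    def delete(self, key):
        self.docs.pop(key, None)

    def add(self, key, fields):
        self.docs[key] = fields


def reindex_person(conn, index, person_id):
    """Re-select the whole row, drop the old document, add the rebuilt one."""
    row = conn.execute(
        "SELECT surname, prename, address, zip, city, country, phone "
        "FROM persons WHERE id = ?",
        (person_id,),
    ).fetchone()
    index.delete(person_id)
    if row is not None:
        index.add(person_id, compose_fields(row))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons (id INTEGER PRIMARY KEY, surname TEXT, prename TEXT, "
    "address TEXT, zip TEXT, city TEXT, country TEXT, phone TEXT)"
)
conn.execute(
    "INSERT INTO persons VALUES "
    "(1, 'Mueller', 'Arnold', 'A-Street', '123', 'Some', 'Here', '123456')"
)
index = ToyIndex()
reindex_person(conn, index, 1)

# The database layer changes a single column; the whole document is still rebuilt.
conn.execute("UPDATE persons SET address = 'B-Street' WHERE id = 1")
reindex_person(conn, index, 1)
```

The change-detection layer still decides when a row needs re-indexing;
it just cannot narrow what gets re-indexed below the whole document.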

Best,

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


