Author: Joerg Behrens
Email:
Message:
I did the same kind of work some time ago. The task was to
build a full-text index over 600k files. Most of the files come
from office programs such as Excel and WinWord, but there were
also some old Lotus 1-2-3, rar, arj, zip, tgz and other formats.
I set up a bunch of my own parsers (shell and PHP scripts) to
convert each document into clean HTML.
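To illustrate the per-format parser idea, here is a minimal sketch of picking a converter by file extension. It is not the original setup: the script names in the table are hypothetical placeholders, since the actual shell and PHP parsers are not shown in this message.

```python
from pathlib import Path

# Hypothetical converter scripts, one per file type (assumed names,
# not the parsers actually used for this job).
CONVERTERS = {
    ".xls": "xls2html.sh",
    ".doc": "doc2html.sh",
    ".zip": "unpack_and_convert.sh",
}

def pick_converter(filename):
    """Return the converter script for a file, or None if the type is unknown."""
    # Compare extensions case-insensitively, since old office files
    # often carry upper-case names such as REPORT.XLS.
    return CONVERTERS.get(Path(filename).suffix.lower())
```

Files with no matching entry return None, so they can be logged and skipped instead of stopping the whole run.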
Lacking fast hardware, I used my old SGI Origin server for
that job. I used a script to start several indexers from time
to time. 'My' problem was the MySQL server on that host. With
my slow CPUs, the database server was the bottleneck when an
indexer started flushing its content. If several indexers
flush at the same time, you'll see a drop in performance.
One of the reasons is that the indexers 'lock' the tables,
so other indexers have to wait to read or write.
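A rough sketch of starting several indexers with a pause between them might look like the following. It assumes the starts are deliberately spaced out so that flushes are less likely to coincide; the function name, the gap length and the command lists are all illustrative assumptions, not the script actually used.

```python
import subprocess
import time

def launch_staggered(commands, gap_seconds,
                     spawn=subprocess.Popen, sleep=time.sleep):
    """Start each command in turn, pausing gap_seconds between starts."""
    procs = []
    for i, cmd in enumerate(commands):
        if i > 0:
            # Wait before starting the next indexer so that their
            # flush phases are less likely to hit the database together.
            sleep(gap_seconds)
        procs.append(spawn(cmd))
    return procs
```

The `spawn` and `sleep` parameters exist only so the function can be tried out without starting real processes. Spacing out the starts does not remove the table locking, it only reduces the chance that several flushes overlap.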
In total it took less than 9h to index 600,000 files, and the
database size is around 1GB or so.
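As a back-of-the-envelope check on those figures, the snippet below works out the average throughput and index size per file. It assumes the full 9 hours and takes "around 1GB" as 10^9 bytes, so the results are rough estimates only.

```python
files = 600_000
hours = 9                   # upper bound: the run took less than this
db_bytes = 1_000_000_000    # assumption: "around 1GB" taken as 10^9 bytes

# Average indexing rate over the whole run (at least this many files/s).
files_per_second = files / (hours * 3600)

# Average database space used per indexed file.
bytes_per_file = db_bytes / files

print(round(files_per_second, 1), "files per second")
print(round(bytes_per_file), "bytes per file")
```

That comes to roughly 18.5 files per second and about 1.7 KB of database space per file.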
regards
Joerg
Reply: <http://www.mnogosearch.org/board/message.php?id=17600>
---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe mnogosearch.org
For additional commands, e-mail: general-help mnogosearch.org