Thread: Problems indexing many documents




Problems indexing many documents
user name
2008-04-14 09:14:04
Hi!

I am trying the following:
require_once 'Zend/Search/Lucene.php';
Zend_Search_Lucene_Analysis_Analyzer::setDefault(new
Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8Num());
$index = new Zend_Search_Lucene('myindex', true);

$indexSourceDir = 'dirofhtmlsource';
$dir = opendir($indexSourceDir);

while (($file = readdir($dir)) !== false) {    
    if (is_dir($indexSourceDir . '/' . $file)) {
        continue;
    }
    $doc = Zend_Search_Lucene_Document_Html::loadHTMLFile($indexSourceDir . '/' . $file, true);  // I need parameter TRUE
    $index->addDocument($doc);
    flush();
}
closedir($dir);


It works fine with 50 .htm documents.
But it does not work with 1500 documents (I need that many or more).
Why?

Thanks and regards
Lucio Torrico
RE: Problems indexing many documents
user name
2008-04-14 12:46:56

What does your server infrastructure look like? This operation is slow on Windows (due to file I/O bottlenecks in the OS) and is probably just tripping your memory or execution time limits. If you have access to php.ini, you can raise them there.

- Eric Marden
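
For anyone who cannot edit php.ini, the same two limits can usually be raised for a single script at runtime. The following is only a minimal sketch: the value '512M' is an arbitrary example, not a recommendation, and some hosts do not allow these settings to be overridden. In php.ini the corresponding directives are memory_limit and max_execution_time.

```php
<?php
// Sketch only: raise per-script limits at runtime instead of editing php.ini.
// The values used here are illustrative assumptions, not tuned recommendations.

// Remove the execution time limit for this script (0 means no limit).
set_time_limit(0);

// Raise the memory ceiling for this script. ini_set() returns false on failure,
// for example when the host forbids changing this setting.
if (ini_set('memory_limit', '512M') === false) {
    echo "Could not change memory_limit; the host may not allow it.", PHP_EOL;
}

// Print the effective values to confirm the overrides took effect.
echo 'memory_limit: ', ini_get('memory_limit'), PHP_EOL;
echo 'max_execution_time: ', ini_get('max_execution_time'), PHP_EOL;
```

Placing these calls at the top of the indexing script, before the index is opened, applies them to the whole run. If the script still stops partway, the PHP error log should show which of the two limits was hit.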


-----Original Message-----
From: Lucio Torrico [mailto:luciotorricogmail.com]
Sent: Mon 4/14/2008 10:14 AM
To: fw-formatslists.zend.com
Subject: [fw-formats] Problems indexing many documents

