
Thread: Resolved: (LUCENE-947) Some improvements to contrib/benchmark




2007-07-25 04:06:31
     [ https://issues.apache.org/jira/browse/LUCENE-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-947.
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 2.3
    Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

> Some improvements to contrib/benchmark
> --------------------------------------
>
>                 Key: LUCENE-947
>                 URL: https://issues.apache.org/jira/browse/LUCENE-947
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/benchmark
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-947.patch, LUCENE-947.take2.patch,
>                      LUCENE-947.take3.patch, LUCENE-947.take4.patch,
>                      LUCENE-947.take5.patch
>
>
> I've made some small improvements to contrib/benchmark, mostly
> merging in the ad-hoc benchmarking code I've been using in LUCENE-843:
>   - Fixed thread safety of DirDocMaker's usage of SimpleDateFormat
>   - Print the props in sorted order
>   - Added new config "autocommit=true|false" to CreateIndexTask
>   - Added new config "ram.flush.mb=int" to AddDocTask
>   - Added new configs "doc.term.vector.positions=true|false" and
>     "doc.term.vector.offsets=true|false" to BasicDocMaker
>   - Added WriteLineDocTask.java, so you can make an alg that uses this
>     to build up a single file containing one document per line.
>     EG this alg converts the reuters-out tree into a single file
>     that has ~1000 bytes per body field, saved to
>     work/reuters.1000.txt:
>       docs.dir=reuters-out
>       doc.maker=org.apache.lucene.benchmark.byTask.feeds.DirDocMaker
>       line.file.out=work/reuters.1000.txt
>       doc.maker.forever=false
>       {WriteLineDoc(1000)}: *
>     Each line has tab-separated TITLE, DATE, BODY fields.
>   - Created feeds/LineDocMaker.java that creates documents read from
>     the file created by WriteLineDocTask.java.  EG this alg indexes
>     all documents created above:
>       analyzer=org.apache.lucene.analysis.SimpleAnalyzer
>       directory=FSDirectory
>       doc.add.log.step=500
>       docs.file=work/reuters.1000.txt
>       doc.maker=org.apache.lucene.benchmark.byTask.feeds.LineDocMaker
>       doc.tokenized=true
>       doc.maker.forever=false
>       ResetSystemErase
>       CreateIndex
>       : *
>       CloseIndex
>       RepSumByPref AddDoc
> I'll attach initial patch shortly.
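
To illustrate the line-file format described above (one document per line, with tab-separated TITLE, DATE and BODY fields), here is a minimal Java sketch. It is not the code from the patch: the class name LineDocSketch, its helper methods and the date pattern are assumptions made up for this example, and it relies on Java 8's ThreadLocal.withInitial. It also shows one common way to use SimpleDateFormat safely from several threads, by giving each thread its own instance; whether the patch fixes DirDocMaker the same way is not stated in this message.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical sketch of a one-document-per-line format; not Lucene code.
final class LineDocSketch {

    // SimpleDateFormat is not thread-safe, so each thread gets its own copy.
    // The pattern is an assumption chosen for this example.
    private static final ThreadLocal<SimpleDateFormat> DATE_FORMAT =
        ThreadLocal.withInitial(
            () -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US));

    // Build one line: TITLE <tab> DATE <tab> BODY.
    public static String toLine(String title, Date date, String body) {
        return clean(title) + "\t"
            + DATE_FORMAT.get().format(date) + "\t"
            + clean(body);
    }

    // Split a line back into its three fields, rejecting malformed input.
    public static String[] fromLine(String line) {
        String[] fields = line.split("\t", -1); // -1 keeps empty trailing fields
        if (fields.length != 3) {
            throw new IllegalArgumentException(
                "expected 3 tab-separated fields, got " + fields.length);
        }
        return fields;
    }

    // Tabs and line breaks would corrupt the format, so replace them with spaces.
    private static String clean(String s) {
        return s.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ');
    }

    public static void main(String[] args) {
        String line = toLine("Some title", new Date(), "Body text\nwith a line break");
        String[] fields = fromLine(line);
        System.out.println("TITLE=" + fields[0]);
        System.out.println("DATE=" + fields[1]);
        System.out.println("BODY=" + fields[2]);
    }
}
```

Replacing tabs and newlines inside fields keeps every document on exactly one line, which is what lets a reader recover the fields with a single split per line.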

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

