|
|
| pylucene trunk and gcc-4.2 : java configuration ? |
  France |
2007-03-30 04:09:27 |
Hi.
I am trying to use Apache mod_python with PyLucene.
I read the latest messages:
http://www.mail-archive.com/pylucene-dev@osafoundation.org/msg01219.html
http://www.mail-archive.com/pylucene-dev@osafoundation.org/msg01229.html
Then I tried to compile PyLucene with GCC 4.2.
I tried to compile two versions:
1) PyLucene source release: the latest source release.
http://downloads.osafoundation.org/PyLucene/src/PyLucene-src-2.1.0-2.tar.gz
My method is here:
http://www.1et0.org/admin/db/pylucene/src-gcc42
During make test, I get a crash without any message,
in test/test_Binary.py,
at the line v = field.binaryValue()
With GCC 3.4.4 there is no error,
but it does not work with mod_python (thread error):
http://www.1et0.org/admin/db/pylucene/django-without-mod-python
I imagine this version is not compatible with GCC 4.2.
2) PyLucene trunk: the svn version of 29/03/2007.
http://svn.osafoundation.org/pylucene/trunk
My method is here:
http://www.1et0.org/admin/db/pylucene/trunk-gcc42wget
With svn trunk, I get some Java errors.
make test
==> 9 errors with:
(__main__.Test_PyLuceneWithDbStore)
JavaError: java.lang.NoClassDefFoundError:
com.sleepycat.db.internal.db_javaJNI
Same error with GCC 3.4.4.
I imagine you could help me here.
Is there a CLASSPATH or JAVA_HOME to configure in the Makefile,
or something like that?
Thx, Xavier.
xav 1et0.org
_______________________________________________
pylucene-dev mailing list
pylucene-dev osafoundation.org
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
|
|
| Re: pylucene trunk and gcc-4.2 : java configuration ? |
  United States |
2007-03-30 11:12:14 |
On Fri, 30 Mar 2007, xav4django wrote:
> 1) PyLucene source release: the latest source release.
> http://downloads.osafoundation.org/PyLucene/src/PyLucene-src-2.1.0-2.tar.gz
> My method is here:
> http://www.1et0.org/admin/db/pylucene/src-gcc42
For a reason I don't know, the source tarballs I release are not compatible
with gcj 4.x. To compile PyLucene with gcj 4.2, you must start from the svn
sources and use a sane (Sun, Apple, Blackdown) JDK 1.4 or 1.5 with ant. You
cannot use gcj to compile the .java sources; there are too many bugs with
that.
For an example (on Ubuntu 64), see:
http://lists.osafoundation.org/pipermail/pylucene-dev/2006-November/001404.html
Once you get that working, you're still not done if you want to experiment
with adding threads after the fact. You then need to create a wrapper for
JvAttachCurrentThread() and call it from your Python code.
For an example of how to wrap a simple C/C++ function, see _PyLucene.cpp's
dumpRefs() function.
Andi..
|
|
| Re: pylucene trunk and gcc-4.2 : java configuration ? |
  France |
2007-03-30 12:53:44 |
Hi
I found my stupid compilation mistake:
http://downloads.osafoundation.org/PyLucene/src/PyLucene-src-2.1.0-2.tar.gz
has the folder lucene-java-2.1.0-509013,
and the svn checkout
http://svn.osafoundation.org/pylucene/trunk/ pylucene-2.1-trunk
does not have this folder.
Without Lucene, PyLucene cannot find the classes from this jar. That is normal.
I will try again on Monday.
> You then need to create a wrapper for JvAttachCurrentThread()
> and call it from your python code.
OK, I will try that too.
Is this one not OK?
http://lists.osafoundation.org/pipermail/pylucene-dev/2004-June/000036.html
Thx.
Xavier.
|
|
| Re: pylucene trunk and gcc-4.2 : java configuration ? |
  United States |
2007-03-30 13:45:33 |
On Fri, 30 Mar 2007, xav4django wrote:
> I found my stupid compilation mistake:
>
> http://downloads.osafoundation.org/PyLucene/src/PyLucene-src-2.1.0-2.tar.gz
> has the folder lucene-java-2.1.0-509013,
> and the svn checkout
> http://svn.osafoundation.org/pylucene/trunk/ pylucene-2.1-trunk
> does not have this folder.
> Without Lucene, PyLucene cannot find the classes from this jar. That is normal.
No. You can't use the source tarball at all with gcj 4.x; you have to start
from an svn checkout and make sure you have a sane JDK to first compile the
Lucene Java sources that the PyLucene build process will export from the
Lucene svn repository.
>> You then need to create a wrapper for JvAttachCurrentThread()
>> and call it from your python code.
>
> OK, I will try that too.
>
> Is this one not OK?
> http://lists.osafoundation.org/pipermail/pylucene-dev/2004-June/000036.html
No, SWIG is not used anymore since PyLucene 2.0.
Andi..
|
|
| Re: pylucene trunk and gcc-4.2 : java configuration ? |
  United States |
2007-04-02 12:30:45 |
On Mon, 2 Apr 2007, xav4django wrote:
> My compilation is OK now, with these Makefile modifications:
>
> PREFIX=/usr/local
> PREFIX_PYTHON=/usr
> LIBDIR_NAME=lib
> GCJ_HOME=/usr/local/gcc-4.2
> GCJ_LIBDIR=$(GCJ_HOME)/$(LIBDIR_NAME)
> GCJ_STATIC=1
> LIB_INSTALL=libgcj.so.8 libstdc++.so.6 libgcc_s.so.1
> WCCFLAGS=-Wno-write-strings
> #DB=$(PYLUCENE)/db-$(DB_VER)
> PREFIX_DB=$(PREFIX)/BerkeleyDB.$(DB_LIB_VER)
> ANT=ant
> PYTHON=$(PREFIX_PYTHON)/bin/python
>
> (I commented out the DB line.)
> and the tests are OK!
I find it hard to believe with GCJ_STATIC=1. I've not been able to get
_PyLucene.so to link statically against libgcj.a, and work, with gcj 4.x.
Andi..
|
|
| Re: patch JvAttachCurrentThread |
  United States |
2007-04-02 13:28:42 |
On Mon, 2 Apr 2007, xav4django wrote:
> Patch for _PyLucene with a JvAttachCurrentThread2 method:
> http://www.1et0.org/admin/db/pylucene/trypatch/patch.txt
>
> This patch corrects a thread problem,
What thread problem does it correct ?
Can one use threads without PyLucene.PythonThread now ?
> +static PyObject *JvAttachCurrentThread1(void){
> + JvCreateJavaVM(NULL);
> + JvAttachCurrentThread(NULL, NULL);
> + return Py_None;
> +}
You shouldn't need to call JvCreateJavaVM() again. It's already called in
PyLucene's init at import time.
Andi..
|
|
| Re: patch JvAttachCurrentThread |
  United States |
2007-04-02 15:05:11 |
On Monday April 2 2007 1:40 pm, xav4django wrote:
> > What thread problem does it correct ?
> > Can one use threads without PyLucene.PythonThread now ?
>
> Yes
>
> When I use django shell, with a
> from threading import Thread
Don't. Do. That.
http://wiki.osafoundation.org/PyLucene/ThreadingInPyLucene
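A minimal sketch of the alternative, for illustration only: the thread class
is passed in as a parameter so that PyLucene.PythonThread (mentioned earlier
in this thread) can be used in place of threading.Thread. The helper name
run_in_thread is hypothetical, and the sketch assumes PythonThread accepts
the same target/args constructor arguments as threading.Thread.

```python
import threading


def run_in_thread(target, thread_class=threading.Thread, args=()):
    """Start `target` in a new thread of `thread_class` and wait for it."""
    worker = thread_class(target=target, args=args)
    worker.start()
    worker.join()
    return worker


if __name__ == "__main__":
    try:
        # Assumption: a gcj-based PyLucene build that exposes PythonThread.
        from PyLucene import PythonThread as thread_class
    except ImportError:
        # Fallback so the sketch runs without PyLucene installed.
        # Do not call into PyLucene from plain threading.Thread workers.
        thread_class = threading.Thread

    results = []
    run_in_thread(results.append, thread_class, args=("searched",))
    print(results)
```

In a Django or mod_python setting, the same idea would apply to any code
path that creates its own worker threads before calling into PyLucene.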
--
Peter Fein || 773-575-0694 || pfein pobox.com
http://www.pobox.com/~pfein/ || PGP: 0xCCF6AE6B
irc: pfein freenode.net || jabber: peter.fein gmail.com
|
|
| pylucene and recommendations for RAM |
  Canada |
2007-04-05 09:33:41 |
I am interested in general advice on running a server for indexing only.
I would appreciate recommendations for RAM requirements based on your
experiences.
I realize that the amount of RAM needed will be based on the size of the
index, how many documents and what you are storing in the index itself,
but some anecdotal information would be helpful. I am looking at an
index that could reach 20 - 50 million documents. Will a commodity
server with 2 GB be enough?
I guess it is possible to build a test index with sample data to
determine this also. Many thanks.
Regards,
David
|
|
| Re: pylucene and recommendations for RAM |
  United States |
2007-04-05 09:59:26 |
On Thursday April 5 2007 9:33 am, David Pratt wrote:
> I realize that the amount of RAM needed will be based on the size of the
> index, how many documents and what you are storing in the index itself -
> but some anecdotal information would be helpful. I am looking at an
> index that could reach 20 - 50 million documents. Will a commodity
> server with 2Gb be enough?
IIRC, it's more a function of how quickly you're adding data than total size.
Though this may be incorrect when merging segments (aka optimizing). A fast
disk helps quite a lot too.
You'll want to configure the IndexWriter for bulk loading. The relevant items
are setMergeFactor, which controls how often segments are merged on disk, and
setMaxBufferedDocs, which controls how many docs are held in RAM before being
written out. A higher value for both will be faster, though be aware that an
index built with a high merge factor is slower to query, so you'd probably
want to optimize() at the end. On our indexing server, with ~4 KB documents,
setMaxBufferedDocs(200) uses about 700 MB of RAM. See the Javadocs & Lucene
In Action for more details.
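A rough sketch of that bulk-loading setup, under stated assumptions: `writer`
is taken to be a PyLucene IndexWriter, which exposes setMergeFactor,
setMaxBufferedDocs, addDocument and optimize. The helper names, the
merge factor of 100, and the RecordingWriter stand-in are hypothetical and
exist only so the sketch runs without PyLucene; only the value 200 comes
from the message above.

```python
def configure_for_bulk_load(writer, merge_factor=100, max_buffered_docs=200):
    """Apply bulk-loading settings to an IndexWriter-like object."""
    # Higher merge factor: segments are merged on disk less often.
    writer.setMergeFactor(merge_factor)
    # Higher buffered-doc count: more docs held in RAM before a flush.
    writer.setMaxBufferedDocs(max_buffered_docs)
    return writer


def bulk_index(writer, documents):
    """Add every document, then optimize once at the end."""
    count = 0
    for doc in documents:
        writer.addDocument(doc)
        count += 1
    # A high merge factor leaves many segments; optimize for faster queries.
    writer.optimize()
    return count


class RecordingWriter(object):
    """Hypothetical stand-in that records calls instead of indexing."""

    def __init__(self):
        self.calls = []

    def setMergeFactor(self, value):
        self.calls.append(("setMergeFactor", value))

    def setMaxBufferedDocs(self, value):
        self.calls.append(("setMaxBufferedDocs", value))

    def addDocument(self, doc):
        self.calls.append(("addDocument", doc))

    def optimize(self):
        self.calls.append(("optimize", None))


if __name__ == "__main__":
    writer = configure_for_bulk_load(RecordingWriter())
    print(bulk_index(writer, ["doc-1", "doc-2", "doc-3"]))
```

With a real index, the RecordingWriter would be replaced by an IndexWriter
opened on the target directory, and the values tuned against available RAM.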
On the searching front, a dedicated commodity box w/ 2 GB can probably serve
around 2 million documents (again, depending on document size). Multiple
CPUs will let you serve more simultaneous queries.
> I guess it is possible to build a test index with sample data to
> determine this also. Many thanks.
You should probably ask the Lucene list, but please report any test results
here as well (you could put them on the wiki too).
--
Peter Fein || 773-575-0694 || pfein pobox.com
http://www.pobox.com/~pfein/ || PGP: 0xCCF6AE6B
irc: pfein freenode.net || jabber: peter.fein gmail.com
|
|
| Re: pylucene and recommendations for RAM |
  Canada |
2007-04-05 10:43:50 |
Hi Pete. Many thanks for this advice. It would seem that perhaps a
cluster would best solve this, spreading the index over some number of
lower-end servers. From what I have read on large indexing, this seems to
be the approach (but with as much RAM as possible per server). I am looking
at costs, so the lower-end 2 GB RAM servers are attractive; I would just
use more of them.
I have only used PyLucene for tests on smaller indexes. Is a cluster
arrangement possible using PyLucene? I am not a Java programmer, so I would
like to stay with what I know. Many thanks.
Regards,
David
|
|