Thread: tte.py not working with zodb - no spam/ham counts




user name
2006-06-30 01:31:13
> I'm trying to finish switching from bsddb to zodb.
[...]
> Seems to work fine (it spits out all the usual chattiness about number of
> messages trained and missed), but when I run sb_dbexpimp.py on the
> result it shows 0 ham and 0 spam and generates a csv file with just
>
>     0,0

As far as I can see contrib/tte.py just uses the standard classifier
interface ("learn", "unlearn", "spamprob"), right?  That should work
fine with ZODB - it should automatically persist all attributes,
including "nham" and "nspam".

It appears to work for me (although I'm really not familiar with tte.py).

Are you telling sb_dbexpimp.py it's a ZODB?

Marshall:~/spambayes tameyer$ env PYTHONPATH=. python contrib/tte.py \
    -g ham.mbox -s spam.mbox -R \
    -o Storage:persistent_use_database:zodb \
    -o Storage:persistent_storage_file:./hammie.fs -c .cull -v
No handlers could be found for logger "ZODB.FileStorage"
*** round 1 ***
    2miss ham: 0.500000 <1060756493-6904164quibble.com>
miss spam: 0.000000 <1060756493-6904164quibble.com>
round:  1, msgs:    2, ham misses:   1, spam misses:   1, 0.5s
[snip round 2 to round 9]
*** round 10 ***
    2miss ham: 0.500000 <1060756493-6904164quibble.com>
miss spam: 0.500000 <1060756493-6904164quibble.com>
round: 10, msgs:    2, ham misses:   1, spam misses:   1, 0.4s
writing new ham mbox...
    1 of     1
writing new spam mbox...
    1 of     1
Marshall:~/spambayes tameyer$ env PYTHONPATH=. python scripts/sb_dbexpimp.py -e \
    -f test.csv -o Storage:persistent_storage_file:./hammie.fs \
    -o Storage:persistent_use_database:zodb
No handlers could be found for logger "ZODB.FileStorage"
Exporting database /Users/tameyer/spambayes/./hammie.fs to file test.csv
Database has 10 ham, 10 spam, and 1903 words
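
The train-to-exhaustion loop behind output like the transcript above can be sketched roughly as follows. This is a simplified illustration and not the code in contrib/tte.py: ToyClassifier, train_to_exhaustion, the crude averaging score, and the cutoff values 0.2 and 0.9 are assumptions made for the example. Only the method names learn, unlearn, and spamprob and the counters nham and nspam come from the thread.

```python
from collections import Counter


class ToyClassifier:
    """Hypothetical stand-in exposing learn/unlearn/spamprob and nham/nspam."""

    def __init__(self):
        self.nham = 0
        self.nspam = 0
        self.ham_tokens = Counter()
        self.spam_tokens = Counter()

    def learn(self, tokens, is_spam):
        if is_spam:
            self.nspam += 1
            self.spam_tokens.update(set(tokens))
        else:
            self.nham += 1
            self.ham_tokens.update(set(tokens))

    def unlearn(self, tokens, is_spam):
        if is_spam:
            self.nspam -= 1
            self.spam_tokens.subtract(set(tokens))
        else:
            self.nham -= 1
            self.ham_tokens.subtract(set(tokens))

    def spamprob(self, tokens):
        # Crude score: average spam fraction per token, 0.5 for unseen tokens.
        # The real classifier combines token evidence differently.
        probs = []
        for token in set(tokens):
            spam = self.spam_tokens[token]
            ham = self.ham_tokens[token]
            probs.append(spam / (spam + ham) if spam + ham else 0.5)
        return sum(probs) / len(probs) if probs else 0.5


def train_to_exhaustion(clf, hams, spams, ham_cutoff=0.2, spam_cutoff=0.9,
                        max_rounds=10):
    """Train only on misclassified messages until a round has no misses.

    Returns the number of rounds that were run.
    """
    messages = [(t, False) for t in hams] + [(t, True) for t in spams]
    for round_no in range(1, max_rounds + 1):
        misses = 0
        for tokens, is_spam in messages:
            score = clf.spamprob(tokens)
            missed = score < spam_cutoff if is_spam else score > ham_cutoff
            if missed:
                clf.learn(tokens, is_spam)
                misses += 1
        if misses == 0:
            break
    return round_no
```

If the same message is supplied as both ham and spam, as the identical message IDs in the transcript suggest, this toy loop never converges and finishes ten rounds with nham == nspam == 10, which is in line with the counts sb_dbexpimp.py reports above.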

=Tony.Meyer
_______________________________________________
SpamBayes@python.org
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
user name
2006-06-30 02:35:01
    >> Seems to work fine (it spits out all the usual chattiness about
    >> number of messages trained and missed), but when I run sb_dbexpimp.py
    >> on the result it shows 0 ham and 0 spam and generates a csv file with
    >> just

    Tony> As far as I can see contrib/tte.py just uses the standard
    Tony> classifier interface ("learn", "unlearn", "spamprob"), right?

Yes.  I don't do anything fancy in that regard.  Do I need to do some sort
of explicit commit (or a close operation that does the commit for me)?
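
Why an explicit commit or write-back can matter is easy to show with the standard library's shelve module. This is only an analogy and not ZODB or SpamBayes code: the helper names and the "counts" key are invented for the example. ZODB likewise makes changes durable only when a transaction is committed; whether the SpamBayes store does that on store() or close() is exactly the open question here.

```python
import os
import shelve
import tempfile


def bump_in_place(path):
    """Mutate a stored dict without writing it back; the update is lost."""
    with shelve.open(path) as db:
        db["counts"] = {"nham": 0, "nspam": 0}
        # db["counts"] returns a fresh copy, so this changes a temporary only.
        db["counts"]["nham"] += 1


def bump_and_store(path):
    """Mutate a copy and assign it back so the shelf records the change."""
    with shelve.open(path) as db:
        counts = db["counts"]
        counts["nham"] += 1
        db["counts"] = counts  # explicit write-back


def read_nham(path):
    with shelve.open(path) as db:
        return db["counts"]["nham"]


def demo():
    """Return (nham after in-place bump, nham after explicit write-back)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "counts")
        bump_in_place(path)
        lost = read_nham(path)
        bump_and_store(path)
        kept = read_nham(path)
    return lost, kept
```

Calling demo() shows the first update silently disappearing while the second survives a reopen, the same symptom as training that appears to succeed but leaves 0 ham and 0 spam on disk.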

    Tony> Are you telling sb_dbexpimp.py it's a ZODB?

Yes.  I call it like so:

    sb_dbexpimp.py -o Storage:persistent_use_database:zodb \
        -o Storage:persistent_storage_file:$HOME/hammie.db \
        -e  -f ~/tmp/hammie.csv

The spamcounts script also fails to find anything in the file:

    % spamcounts -o Storage:persistent_use_database:zodb \
        -o Storage:persistent_storage_file:hammie.db -r url
    token,nspam,nham,spam prob

I'm perplexed.  Is there some way to treat a zodb file more-or-less like an
anydbm file (that is, a dict in a file)?  That way I could simply poke
around in the file to see what it *does* contain.
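
One way to poke around is to open the FileStorage read-only and walk the root mapping, which behaves much like a dict. The sketch below assumes the ZODB package is importable and that the classifier object is stored somewhere under the root; summarize_root and dump_zodb are names invented for this example, and the key SpamBayes actually uses is not assumed. Unpickling the stored objects also needs the spambayes package on the path.

```python
def summarize_root(root):
    """Return (key, type name, nham, nspam) for each top-level entry."""
    rows = []
    for key, value in root.items():
        rows.append((key,
                     type(value).__name__,
                     getattr(value, "nham", None),
                     getattr(value, "nspam", None)))
    return rows


def dump_zodb(path):
    """Open a ZODB FileStorage read-only and summarize its root mapping."""
    # Imported lazily so summarize_root works even without ZODB installed.
    from ZODB import DB
    from ZODB.FileStorage import FileStorage

    storage = FileStorage(path, read_only=True)
    db = DB(storage)
    conn = db.open()
    try:
        return summarize_root(conn.root())
    finally:
        conn.close()
        db.close()
```

Something like dump_zodb("hammie.db") would then list what the file really holds. An empty result, or counters of 0, would point to the training run never committing rather than to the export scripts misreading the file.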

Skip
user name
2006-06-30 03:03:23
    Tony> As far as I can see contrib/tte.py just uses the standard
    Tony> classifier interface ("learn", "unlearn", "spamprob"), right?

    skip> Yes.  I don't do anything fancy in that regard.  Do I need to do
    skip> some sort of explicit commit (or a close operation that does the
    skip> commit for me)?

Looking at the dir() of a store object, I saw

    ['ClassifierClass', 'DB', '__class__', '__delattr__', '__dict__',
    '__doc__', '__getattr__', '__getattribute__', '__hash__', '__init__',
    '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
    '__setattr__', '__str__', '__weakref__', 'classifier', 'close',
    'closed', 'conn', 'create_storage', 'db_filename', 'db_name', 'load',
    'mode', 'storage', 'store']

which suggested tte.py should be calling store.close() in addition to or
instead of store.store().  I tried adding a store.close() call after the
store.store() call, but it didn't make any difference.  I can manually add
hams to the store:

    >>> from spambayes import storage
    >>> dbname, usedb = storage.database_type([])
    >>> store = storage.open_storage(dbname, usedb)
    >>> store.classifier
    <spambayes.storage._PersistentClassifier object at 0x12ad230>
    >>> store.classifier.nham
    0
    >>> store.load()
    >>> store.classifier.nham
    0
    >>> store.learn(["hello"], False)
    >>> store.classifier.nham
    1
    >>> ^D
    % spamcounts -o Storage:persistent_use_database:zodb \
        -o Storage:persistent_storage_file:hammie.db -r hello
    token,nspam,nham,spam prob
    hello,0,1,0.155172413793
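
The probability 0.155172413793 reported for "hello" can be reproduced by hand. The sketch below assumes the Robinson-style per-token estimate that SpamBayes uses, with its default options unknown_word_strength = 0.45 and unknown_word_prob = 0.5. The function name token_spamprob is made up for this example, and whether those defaults were in effect for the run above is an assumption.

```python
def token_spamprob(spamcount, hamcount, nspam, nham, s=0.45, x=0.5):
    """Estimate a token's spam probability from raw counts.

    s is the assumed unknown_word_strength, x the assumed unknown_word_prob.
    """
    n = hamcount + spamcount
    if n == 0:
        return x  # never-seen token: fall back to the prior
    # Treat an empty corpus as size 1 to avoid dividing by zero.
    nspam = float(nspam or 1)
    nham = float(nham or 1)
    spamratio = spamcount / nspam
    hamratio = hamcount / nham
    prob = spamratio / (hamratio + spamratio)
    # Shrink the raw ratio toward x; more observations mean less shrinkage.
    return (s * x + n * prob) / (s + n)


# "hello" was seen in 1 ham and 0 spam, with nham = 1 and nspam = 0:
# (0.45 * 0.5 + 1 * 0.0) / (0.45 + 1) = 0.225 / 1.45
hello_prob = token_spamprob(spamcount=0, hamcount=1, nspam=0, nham=1)
```

The result agrees with the spamcounts line above to the digits shown, which suggests the manually trained token really was written to the file with the counts the session reports.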

Any suggestions for things to try?

Skip