On Mon, 23 Jul 2007 22:29:44 +0200 (CEST)
Pavel Kankovsky wrote:
> On Tue, 17 Jul 2007, Ingomar Wesp wrote:
>
> > For some reason, when manually marking spam or ham, bogofilter was
> > always called with the -N and -S options respectively, even if the
> > message was not previously registered at all.
>
> Ugh. Perhaps Bogofilter should provide some protection against this
> kind of mistake. Would it make sense to complain when a message that
> has never been registered is being unregistered? (It would be quite
> easy to implement imho: compute a hash of token list generated from
> the message, turn it into a quasitoken like .MSG_COUNT, increment its
> count during registration, check and decrement it during
> unregistration.)
>
> > I assume that this led to a condition where the individual spam
> > counts of several tokens were larger than the overall spam message
> > count.
>
> This is quite likely.
Hi Pavel,

My .MSG_COUNT values are approx. 550,000 and 140,000. Adding a "dot"
token for each message would add many, many tokens to my wordlist. As I
don't believe I need them, this seems wasteful. On the other hand, if
you (or someone else) want to implement such a capability, it could be
an option.
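
For illustration, the per-message marker Pavel describes could look
roughly like the sketch below. It is not bogofilter code: the wordlist
is modelled as a plain Python dict, the ".MSG_" key prefix, the choice
of SHA-1 over the sorted unique tokens, and the function names are all
assumptions made for the example. Only the marker bookkeeping is shown,
not the per-token count updates.

```python
import hashlib


def message_key(tokens):
    """Build a hypothetical per-message quasitoken from a token list."""
    # Sorting the unique tokens makes the key independent of token order.
    joined = "\n".join(sorted(set(tokens)))
    return ".MSG_" + hashlib.sha1(joined.encode("utf-8")).hexdigest()


def register(wordlist, tokens):
    """Record that this message has been registered once more."""
    key = message_key(tokens)
    wordlist[key] = wordlist.get(key, 0) + 1


def unregister(wordlist, tokens):
    """Undo one registration; return False if none was ever recorded."""
    key = message_key(tokens)
    if wordlist.get(key, 0) == 0:
        return False  # never registered: the caller should complain
    wordlist[key] -= 1
    if wordlist[key] == 0:
        del wordlist[key]  # drop the marker once it is no longer needed
    return True
```

Each registered message leaves one extra entry in the wordlist, which
is exactly the storage cost mentioned above, so it would make sense
only as an opt-in feature.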

When a token's ham/spam count exceeds .MSG_COUNT, it's an indication
that something is b0rked. Generating an error message might be
appropriate. The idea results in a new issue -- how to make the problem
known when bogofilter is running in the background.

As a more modest proposal, checking each token's ham and spam counts
against .MSG_COUNT wouldn't use much computing power and might be
helpful...
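
A minimal sketch of that check, under stated assumptions: the wordlist
is modelled as a dict mapping each token to a (spam, ham) pair, with
.MSG_COUNT stored in the same form. The function name and the data
layout are hypothetical and do not reflect bogofilter's internals.

```python
def find_inconsistent_tokens(wordlist, msg_count_key=".MSG_COUNT"):
    """Return tokens whose spam or ham count exceeds the message count."""
    spam_msgs, ham_msgs = wordlist[msg_count_key]
    suspicious = []
    for token, (spam, ham) in wordlist.items():
        if token == msg_count_key:
            continue  # the message counter itself is not a real token
        if spam > spam_msgs or ham > ham_msgs:
            suspicious.append(token)
    return sorted(suspicious)


# Toy wordlist: 3 spam and 2 ham messages registered in total.
example = {
    ".MSG_COUNT": (3, 2),
    "free": (5, 0),    # 5 spam hits but only 3 spam messages: inconsistent
    "hello": (1, 2),   # within bounds
}
print(find_inconsistent_tokens(example))
```

The check is a single pass over the wordlist, so the cost is linear in
the number of tokens. How to report the result when bogofilter runs in
the background is left open here.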

Regards,
David