Robert LeBlanc wrote:
> While I would still call it "experimental" at this stage, that's mostly
> because it's being developed very rapidly. The version I'm using in
> production is the one I describe in the wiki (2.1c), but there are
> already beta versions in the 2.2 series, and alphas in the 2.3 series,
> with new experimental releases becoming available at a rate of one or
> two a day. Clearly this is an area receiving a lot of attention at the
> moment, and there's a mailing list called "Devel-Spam"
> <http://lists.own-hero.net/mailman/listinfo/devel-spam> you can
> subscribe to if you want to keep up with the bleeding edge of its
> development.
>
> The 2.1 series is quite stable and works quite well for most purposes.
What does "quite" mean in this context? False negatives? Crashing
binaries? Stopped mail-delivery? Need for manual intervention?
> In terms of the extra load and resource usage, it's minor because of the
> fact that the OCR plugin only gets invoked on mail that contains inline
> images. For those particular emails, it adds 2-4 seconds of processing
> time, but since those emails represent a very small fraction of the
> total mail volume, the average increase in processing time works out to
> a few milliseconds per item, or a few (i.e. < 10) extra processor-minutes
> per day.
>
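As a rough sanity check of the figures quoted above, here is a minimal sketch in Python. Only the 2-4 seconds per image mail comes from the quoted text; the daily mail volume and the share of mail with inline images are made-up assumptions, so plug in your own numbers.

```python
# Back-of-the-envelope estimate of the extra OCR load.
# ASSUMPTIONS (not from this thread): the daily volume and the
# fraction of mail that contains inline images.
MAILS_PER_DAY = 100_000
IMAGE_MAIL_FRACTION = 0.001
# Midpoint of the 2-4 seconds per image mail quoted above.
OCR_SECONDS_PER_IMAGE_MAIL = 3.0


def ocr_overhead(mails_per_day, image_fraction, ocr_seconds):
    """Return (extra processor-minutes per day, average extra ms per mail)."""
    image_mails = mails_per_day * image_fraction
    total_seconds = image_mails * ocr_seconds
    avg_ms_per_mail = total_seconds / mails_per_day * 1000.0
    return total_seconds / 60.0, avg_ms_per_mail


minutes, avg_ms = ocr_overhead(
    MAILS_PER_DAY, IMAGE_MAIL_FRACTION, OCR_SECONDS_PER_IMAGE_MAIL
)
print(f"extra processor-minutes per day: {minutes:.1f}")
print(f"average extra time per mail: {avg_ms:.1f} ms")
```

With these assumed inputs the result is about 5 processor-minutes per day and about 3 ms per mail, which is in line with the "few milliseconds per item" and "< 10 processor-minutes" mentioned above.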
> The decision to implement OCR in a production environment at this stage
> is obviously your call, but with the 2.1 stable series I don't see the
> harm in it, unless perhaps your server is very close to its resource
> limits as it is.
OK, so that means I should have no problems with CPU/RAM ...
> You must also weigh this against the prevalence of
> image-spam, of course; if you haven't been receiving much of it yet, you
> probably won't feel much pressure to implement OCR. Once you /do/ start
> receiving it in larger volumes, however, the pressure may reach a
> tipping point, and you may be willing to accept a bit more risk and a
> bit more resource consumption in order to stem the tide of the image-spam.
Correct. That is exactly my point of view, although I wasn't able to
put it into words in my first mail ... y'know, English isn't my first
language.
> As image-spam becomes more pervasive, however, we're eventually /all/
> going to need to implement OCR or something equivalent. When the spam
> content is entirely within the images, and the text portion of the mail
> contains just non-spammy words and phrases, there's really very little
> else left for us to do but try to extract the spam content from the images.
Yup. I think I am gonna head over to your HOWTO and give it a try. From
what I have seen, this OCR functionality is switched on by that
loadplugin line in SA, so I can still decide to keep it turned off by
default until I get familiar with it.
Thanks so far, greetings to you, Robert, and Maia
Stefan
_______________________________________________
Maia-users mailing list
Maia-users renaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users