Thread: General questions.




General questions.
Japan
2007-04-26 19:03:04
We are considering running Maia as an alternate mail system here, but since
it is an unknown piece of software to us, we feel somewhat hesitant.

Does Maia Mailguard perform well with 1000s of users? 10,000s? Is anyone
currently running large installations?

How much of a bottleneck is MySQL to the system? What size does it grow to
with use?

It appears to store a copy of every email in MySQL so that the user can tag
it as spam/ham. All mail? Forever? What if users never bother to log in? Is
this what the crontab jobs do, clean out MySQL periodically?

Broad strokes: we have 2 (at the start) quad Xeon servers
(postfix/amavisd/clamav) storing mail to NFS (NetApp), with separate servers
for Dovecot and SquirrelMail. We currently use LDAP for all account
information.

Any feedback would be appreciated.

Jorgen Lundman

-- 
Jorgen Lundman       | <lundmanlundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)
_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users

TROUBLE in child_init_hook?
United States
2007-04-26 21:44:51
Running Maia 1.0.2 on SLES 10. It worked fine for a while, then I got these weird error messages (below). I restarted amavis and they went away. Nearly 3 months later, it happened again. I tried looking this error up on the Internet, but I'm not finding much. Just FYI, I'm not using the Berkeley database; I'm using MySQL for Maia. Any ideas?
 
Feb  1 10:05:50 LU1-US amavis[24453]: TROUBLE in child_init_hook: BDB no dbS: Unknown locker ID: 12fc2, No such file or directory. at (eval 52) line 25.
Feb  1 10:05:50 LU1-US amavis[24454]: TROUBLE in child_init_hook: BDB no dbS: Unknown locker ID: 12fc3, No such file or directory. at (eval 52) line 25.
Feb  1 10:05:50 LU1-US amavis[24455]: TROUBLE in child_init_hook: BDB no dbS: Unknown locker ID: 12fc4, No such file or directory. at (eval 52) line 25.
Feb  1 10:05:50 LU1-US amavis[24456]: TROUBLE in child_init_hook: BDB no dbS: Unknown locker ID: 12fc5, No such file or directory. at (eval 52) line 25.
Feb  1 10:05:50 LU1-US amavis[24457]: TROUBLE in child_init_hook: BDB no dbS: Unknown locker ID: 12fc6, No such file or directory. at (eval 52) line 25.
Feb  1 10:05:50 LU1-US amavis[24458]: TROUBLE in child_init_hook: BDB no dbS: Unknown locker ID: 12fc7, No such file or directory. at (eval 52) line 25.
Feb  1 10:05:50 LU1-US amavis[24459]: TROUBLE in child_init_hook: BDB no dbS: Unknown locker ID: 12fc8, No such file or directory. at (eval 52) line 25.
Feb  1 10:05:50 LU1-US amavis[24460]: TROUBLE in child_init_hook: BDB no dbS: Unknown locker ID: 12fc9, No such file or directory. at (eval 52) line 25.

Howard Yuan
I.T. Department
Valence Technology, Inc.
http://www.valence.com/
Re: TROUBLE in child_init_hook?
Canada
2007-04-26 22:16:39
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Howard Yuan wrote:
> Running Maia 1.0.2 on SLES 10. It worked fine for a while, then I got
> these weird error messages (below). I restarted amavis and they went
> away. Nearly 3 months later, it happened again. I tried looking this
> error up on the Internet, but I'm not finding much. Just FYI, I'm not
> using the Berkeley database; I'm using MySQL for Maia. Any ideas?
>
> Feb  1 10:05:50 LU1-US amavis[24453]: TROUBLE in child_init_hook: BDB no
> dbS: Unknown locker ID: 12fc2, No such file or directory. at (eval 52)
> line 25.

You /are/ using the Berkeley database, but not for the purposes you're
thinking about. The message cache, if enabled, makes use of the Berkeley
database to cache message-IDs of the items Maia has seen recently, so as to
avoid having to rescan multiple copies of the same item. In practice this
is only really useful for sites that host mailing lists that chunk up a
large broadcast into multiple copies with, say, 50 recipients per copy.
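
To make the idea concrete, here is a minimal Python sketch of a message-ID
cache. It is not Maia's or amavisd's actual code: the class name SeenCache,
its methods, and the ten-minute lifetime are all invented for illustration.
It only shows why a second copy of an already-scanned item can skip the
expensive checks.

```python
import time


class SeenCache:
    """Hypothetical in-memory cache of recently seen message-IDs."""

    def __init__(self, ttl_seconds=600.0):
        self.ttl = ttl_seconds      # how long a verdict stays usable (assumed)
        self._entries = {}          # message-ID -> (verdict, time stored)

    def lookup(self, message_id, now=None):
        """Return the cached verdict, or None if unknown or expired."""
        now = time.time() if now is None else now
        entry = self._entries.get(message_id)
        if entry is None:
            return None
        verdict, stored_at = entry
        if now - stored_at > self.ttl:
            del self._entries[message_id]   # stale entry, forget it
            return None
        return verdict

    def store(self, message_id, verdict, now=None):
        """Remember the verdict for this message-ID."""
        now = time.time() if now is None else now
        self._entries[message_id] = (verdict, now)


def filter_message(cache, message_id, scan):
    """Scan a message only if its ID has no fresh cached verdict."""
    verdict = cache.lookup(message_id)
    if verdict is None:
        verdict = scan(message_id)          # the slow part: virus/spam checks
        cache.store(message_id, verdict)
    return verdict
```

With a list broadcast split into many copies that share one message-ID, only
the first copy would reach scan(); the rest would reuse the stored verdict.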

You can safely delete the contents of the "db" subdirectory of your
amavis/maia user's home directory (or wherever you've set $db_home to in
your amavisd.conf file); that's where the message cache is stored. If it
has become corrupted, it's safe to delete it and let it get rebuilt the
next time amavisd-maia starts.
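
A small Python sketch of that cleanup follows. The helper name
clear_directory is made up, and the path in the usage note is only a
placeholder for whatever $db_home points to on your system. It seems prudent
to stop amavisd-maia before running it, although that is an assumption and
not something stated above.

```python
import shutil
from pathlib import Path


def clear_directory(db_dir):
    """Delete everything inside db_dir but keep the directory itself.

    Returns the names of the entries that were removed, in sorted order.
    """
    root = Path(db_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    removed = []
    for entry in sorted(root.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)        # real subdirectory: remove recursively
        else:
            entry.unlink()              # regular file or symlink
        removed.append(entry.name)
    return removed
```

Usage would be along the lines of clear_directory("/path/to/db_home"), run
as a user with write access to that directory.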

That said, the Berkeley database doesn't handle concurrency very well,
which is why sites with any significant amount of traffic benefit from
using SQL databases for things like Bayes and quarantines. The trouble is
that Berkeley databases are file-based, and so making any changes to the
database (e.g. a database write operation) requires locking the entire file
against other write operations, rather than just the particular records
being changed. This creates a lot of lock contention if multiple processes
are waiting to write to the database.

Most sites won't benefit from the message cache in any case, so disabling
it in your amavisd.conf file is probably in your best interest. Do that by
setting $enable_db = 0 and $enable_global_cache = 0 and restarting
amavisd-maia.
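
If you want to double-check that both switches really are off after editing,
a rough text scan like the Python sketch below may help. The function names
are invented, and it assumes each setting sits on its own line as a plain
Perl assignment such as "$enable_db = 0;". It is not a Perl parser and will
miss anything more elaborate.

```python
import re


def cache_settings(conf_text):
    """Return the last value assigned to each cache switch, or None if unset.

    Only simple lines of the form `$name = 0;` are recognised; commented-out
    lines are ignored because they do not start with `$`.
    """
    names = ("enable_db", "enable_global_cache")
    found = {name: None for name in names}
    pattern = re.compile(r"^\s*\$(\w+)\s*=\s*(\d+)\s*;")
    for line in conf_text.splitlines():
        match = pattern.match(line)
        if match and match.group(1) in found:
            found[match.group(1)] = int(match.group(2))
    return found


def cache_disabled(conf_text):
    """True only if both switches are explicitly set to 0."""
    return all(value == 0 for value in cache_settings(conf_text).values())
```

Feeding it the contents of amavisd.conf and getting False back would suggest
that one of the two settings is missing, commented out, or still non-zero.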

- --
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGMWsXGmqOER2NHewRAlLoAJ9CgV6EIz71TyrOxsT0Zzhs1qF5JgCdFM5H
faSQLwQg9zfYHAYTs+fHSro=
=IhPc
-----END PGP SIGNATURE-----

Re: General questions.
Canada
2007-04-26 22:37:33
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jorgen Lundman wrote:

> Does Maia Mailguard perform well with 1000s of users? 10,000s? Is anyone
> currently running large installations?

A fair number of Maia sites are operating in corporate or university
environments with tens of thousands of users; I believe the largest Maia
site I've heard of was using it for about 240,000 users.

Obviously hardware and planning determine the ultimate scalability of the
system. Maia is designed to be modular, such that you can distribute its
components across multiple hosts to increase its message-handling capacity
and performance. In particular, you can run multiple filtering hosts in
parallel, while feeding a central database, making it easy to add more
filtering hosts as your needs grow.


> How much of a bottleneck is MySQL to the system?

In the grand scheme of things, the database is not the most time-consuming
component of the system. The performance-impacting delays you'll see in any
mail filter are primarily from the virus scanners and the spam-checking
tools, particularly if some of those tools are doing network lookups and
consulting external databases like Razor, Pyzor, DCC, and SpamCop. By
contrast, the few extra milliseconds it takes to read and write to a
well-tuned local SQL database are insignificant.

To put it another way, introducing a mail filter into your mail system is
undoubtedly going to have a performance impact. Depending on the power of
your hardware and the thoroughness of the tests you choose to apply, your
mail filter will add anywhere from 2 to 15 seconds to the processing time
of every email. You can minimize that to an extent with parallelism (i.e.
setting up multiple filtering hosts in parallel), so "throwing hardware at
the problem" /does/ work, though it comes at a financial cost.
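
For a back-of-the-envelope feel for what parallelism buys, the Python sketch
below applies Little's law: sustained throughput is roughly the number of
concurrent filter processes divided by the time each message takes. Only the
2 and 15 second figures come from the text above; the function names, the
process counts, and the even-arrival assumption are illustrative, so treat
the results as optimistic upper bounds.

```python
import math


def sustained_rate(hosts, processes_per_host, seconds_per_message):
    """Messages per second the filter tier can sustain (Little's law).

    Assumes every process is always busy and each message takes the same
    time, so the result is an upper bound.
    """
    if seconds_per_message <= 0:
        raise ValueError("seconds_per_message must be positive")
    return hosts * processes_per_host / seconds_per_message


def hosts_needed(messages_per_day, processes_per_host, seconds_per_message):
    """Smallest host count whose sustained rate covers the daily volume,
    assuming mail arrives evenly over the day (real traffic is burstier)."""
    if seconds_per_message <= 0:
        raise ValueError("seconds_per_message must be positive")
    per_host_per_day = processes_per_host * 86400 / seconds_per_message
    return max(1, math.ceil(messages_per_day / per_host_per_day))
```

For instance, two hosts with ten filter processes each could sustain at most
ten messages per second at 2 seconds per message, and well under two per
second at 15 seconds per message.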


> What size does it grow to with use?
>
> It appears to store a copy of every email in MySQL so that the user can
> tag it as spam/ham. All mail? Forever? What if users never bother to log
> in? Is this what the crontab jobs do, clean out MySQL periodically?

Mail is quarantined/cached for a fixed number of days that you (as the
super-administrator) can specify. The expire-quarantine-cache.pl script
should be run nightly, and handles the expiry of items that are older than
your expiry threshold. As a result, your database will not grow without
bound: it will reach a "high water mark" and then shrink as items are
expired, growing toward that high point again as new items arrive.

If you're looking for a way to estimate your storage needs, a rough formula
might look something like this:

 S = the average size of the mail items you receive
 N = the average number of mail items you receive in a day
 D = the number of days you retain mail before expiry

Given those variables, your "high water mark" should be approximately
S*N*D. Allow yourself some margin for growth, of course, and some room to
store user settings, whitelists/blacklists, and so on.
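
The estimate is easy to script. The Python sketch below is just S*N*D with
an optional safety margin; the function names and the sample figures in the
note underneath are invented for illustration, not measurements from any
real site.

```python
def high_water_mark_bytes(avg_item_bytes, items_per_day, retention_days,
                          margin=0.0):
    """Approximate peak quarantine/cache storage in bytes: S * N * D.

    margin is an optional fraction added on top, e.g. 0.25 for 25% headroom.
    """
    if min(avg_item_bytes, items_per_day, retention_days) < 0 or margin < 0:
        raise ValueError("inputs must be non-negative")
    return avg_item_bytes * items_per_day * retention_days * (1.0 + margin)


def to_gib(n_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return n_bytes / 2**30
```

With made-up inputs of S = 20 KiB (20480 bytes), N = 500,000 items a day,
and D = 14 days, to_gib(high_water_mark_bytes(20480, 500000, 14)) comes out
at a little under 134 GiB before any margin is added.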

- --
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGMW/9GmqOER2NHewRAmrSAKCbziOtti4WX74CNVNaorcgcXtQUwCfT1Jw
MiInHzBEqOaq+9m8UseqkRQ=
=MJ9P
-----END PGP SIGNATURE-----

Re: TROUBLE in child_init_hook?
United States
2007-04-27 11:30:42
Okie dokie. So if I'm reading correctly, the only benefit of using the
message cache is if I host huge mailing lists (which I don't), so I'd be
better off disabling it. When I disable it, what really happens? Maia will
stop caching message-IDs... so what does Maia do with the message-IDs then?
Does it just keep them in memory while a message is passing through, and
then drop them as soon as the message is done?

What signs should I look out for that would tell me I do need the
message-IDs cached? Would the server slow down in processing mail (I'm
guessing it won't, since it no longer needs to write to the hard drive)?
What exactly will happen when I turn off caching of the message-IDs?

-- 

Howard Yuan
I.T. Department
Valence Technology, Inc.
http://www.valence.com/ 


Re: General questions, part 2.
Japan
2007-05-08 21:54:32
Thank you Robert and Peter, very encouraging information. I took the time
to read the "bigpicture" essay as well, to try to familiarise myself with
the whole setup.

With our clusters, we tend to separate the tasks per server. In this case:
mail-mx for incoming, mail-smtp for sending, and pop servers, which
currently run Dovecot and SquirrelMail.

I assume I won't have any problems splitting Maia in half, with amavisd on
the MX and the PHP parts on the pop server. Releasing items from the
quarantine will use mail-smtp if I set things up correctly?

As suggested in the "bigpicture" document, we would just run things past
ClamAV on the mail-smtp servers for outgoing emails, but is it actually
suggesting that emails should pass through Maia completely? (Maia picks up
that the From address is the customer, and behaves accordingly? Not tagging
SPAM headers, but blocking viruses etc.?)

As a throw-in question, is it normal/expected that users read their emails
inside Maia's PHP GUI? It seems that you could, but we run SquirrelMail for
that purpose. What is the normal/expected user "usage"? Log in once a week
(or now and then) to score emails, or every day/every time? Never?

My apologies for such odd questions, but I did not find any hints about
them in the FAQ.

Jorgen Lundman.


-- 
Jorgen Lundman       | <lundmanlundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)

Re: General questions, part 2.
Canada
2007-05-13 23:43:18
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jorgen Lundman wrote:

> I assume I won't have any problems splitting Maia in half, with amavisd
> on the MX and the PHP parts on the pop server. Releasing items from the
> quarantine will use mail-smtp if I set things up correctly?

Correct. Quarantined items are released from the database server to your
downstream mail server. The downstream mail server needs to be configured
to accept connections from the database server, naturally. Ordinarily the
downstream mail server is configured to accept mail only from the database
server, so that outside senders can't bypass your filters by sending
directly to the downstream mail server.


> As suggested in the "bigpicture" document, we would just run things past
> ClamAV on the mail-smtp servers for outgoing emails, but is it actually
> suggesting that emails should pass through Maia completely? (Maia picks
> up that the From address is the customer, and behaves accordingly? Not
> tagging SPAM headers, but blocking viruses etc.?)

In most configurations, all mail should pass through Maia, whether it's
inbound or outbound. You can, however, configure inbound and outbound mail
to be filtered differently. In your case you'd want to configure the "."
(system-default) account to only filter for viruses, and disable all other
filters for that account. You'd want full filtering on all of the
"domain"-style accounts, however, to handle the inbound mail.


> As a throw-in question, is it normal/expected that users read their
> emails inside Maia's PHP GUI? It seems that you could, but we run
> SquirrelMail for that purpose. What is the normal/expected user "usage"?
> Log in once a week (or now and then) to score emails, or every day/every
> time? Never?

No, Maia is not a webmail client; it is simply a quarantine management
system. In normal operation, users log in, ideally at least once a day, to
manage the contents of their quarantines/caches, which takes just a few
minutes once they get the hang of it. The fact that the quarantines and
caches are score-sorted makes it easy to spot any false positives and false
negatives, since they're statistically most likely to be near the top of
the list.

If users neglect their quarantines, the cron-scheduled expiry script will
eventually delete the quarantined/cached mail, so it's not the end of the
world. However, no Bayes-training or spam-reporting can be performed on
such items (unless you also enable SpamAssassin's auto-learning mechanism
for the most conservatively-scored items), so in large part these neglected
items are wasted opportunities. Encouraging your users to participate in
the spam-reporting process by managing their quarantines on a regular basis
will result in a more effective Bayes database, which in turn produces
fewer false positives and false negatives.

- --
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGR+jmGmqOER2NHewRApBkAJ4i1etXV3L55L1a65l8g5r2ZEccEACgpZRZ
Ay/ZG9RQ0MmPVbcvdqZ0jIM=
=2bYl
-----END PGP SIGNATURE-----
