|
List Info
Thread: excessive time spent in sql-update?
|
|
| excessive time spent in sql-update? |
  United States |
2007-08-17 15:48:15 |
Hi folks. I've been handholding one of my mail servers
for most
of today as it deals with a large backlog of messages that
are
seemingly the result of a large flood of spam. In the
TIMING log
entries, amavisd-new (v2.4.5) is spending what I consider to
be an
excessive amount of time in "sql-update"...upwards
of nine seconds in
some cases, when the entire run is taking close to twenty
seconds. (!)
In my installation, amavisd-new is logging and
quarantining to an
instance of MySQL running on a dedicated, reasonably quick
database
server located on the local network.
My forte' is C; I'm not much of a Perl hacker, otherwise
I'd dig
in and find out exactly what's happening in that routine.
Can
someone shed a little light on this for me? Specifically,
I'd like
to understand specifically what's going on in that routine
(i.e.,
"it's updating table ABC with information XYZ")
and I'd really like
some suggestions on how to identify the bottleneck so I can
address it.
Does anyone have any thoughts on this?
Thanks,
-Dave
--
Dave McGuire
Port Charlotte, FL
Farewell Ophelia, 9/22/1991 - 7/25/2007
------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-user lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/
|
|
| Re: excessive time spent in sql-update? |
  United States |
2007-08-17 16:42:55 |
On Aug 17, 2007, at 4:48 PM, Dave McGuire wrote:
> Hi folks. I've been handholding one of my mail
servers for most
> of today as it deals with a large backlog of messages
that are
> seemingly the result of a large flood of spam. In the
TIMING log
> entries, amavisd-new (v2.4.5) is spending what I
consider to be an
> excessive amount of time in
"sql-update"...upwards of nine seconds in
> some cases, when the entire run is taking close to
twenty seconds. (!)
>
> In my installation, amavisd-new is logging and
quarantining to an
> instance of MySQL running on a dedicated, reasonably
quick database
> server located on the local network.
>
> My forte' is C; I'm not much of a Perl hacker,
otherwise I'd dig
> in and find out exactly what's happening in that
routine. Can
> someone shed a little light on this for me?
Specifically, I'd like
> to understand specifically what's going on in that
routine (i.e.,
> "it's updating table ABC with information
XYZ") and I'd really like
> some suggestions on how to identify the bottleneck so I
can address
> it.
>
> Does anyone have any thoughts on this?
Please forgive me for replying to my own message here,
but I'd
like to share some information about some significant
progress I've
just made. I managed to catch one of amavisd-new's SQL
transactions
in MySQL's "show full processlist" output, and it
was executing an
update query on the msgs table with a search field (mail_id)
that
wasn't indexed. I added a new index on that field thusly:
CREATE INDEX msgs_mail_id ON msgs (mail_id);
...and the average (eyeballed) query times have dropped
from 8-15
seconds to less than 100ms, in some cases less than 20ms.
I'm currently digging for the documentation I used to
create those
tables to see if I missed the need for that index.
Can anyone share any other tips here? What other indices
are
people using to improve query performance on their
database-backed
amavisd-new installations?
-Dave
--
Dave McGuire
Port Charlotte, FL
Farewell Ophelia, 9/22/1991 - 7/25/2007
------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-user lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/
|
|
| Re: excessive time spent in sql-update? |
  Switzerland |
2007-08-17 18:01:08 |
-------- Original-Nachricht --------
Datum: Fri, 17 Aug 2007 17:42:55 -0400
Von: Dave McGuire <mcguire neurotica.com>
An: amavis-user lists.sourceforge.net
Betreff: Re: [AMaViS-user] excessive time spent in
sql-update?
> On Aug 17, 2007, at 4:48 PM, Dave McGuire wrote:
> > Hi folks. I've been handholding one of my mail
servers for most
> > of today as it deals with a large backlog of
messages that are
> > seemingly the result of a large flood of spam. In
the TIMING log
> > entries, amavisd-new (v2.4.5) is spending what I
consider to be an
> > excessive amount of time in
"sql-update"...upwards of nine seconds in
> > some cases, when the entire run is taking close to
twenty seconds. (!)
> >
> > In my installation, amavisd-new is logging and
quarantining to an
> > instance of MySQL running on a dedicated,
reasonably quick database
> > server located on the local network.
> >
> > My forte' is C; I'm not much of a Perl hacker,
otherwise I'd dig
> > in and find out exactly what's happening in that
routine. Can
> > someone shed a little light on this for me?
Specifically, I'd like
> > to understand specifically what's going on in that
routine (i.e.,
> > "it's updating table ABC with information
XYZ") and I'd really like
> > some suggestions on how to identify the bottleneck
so I can address
> > it.
> >
> > Does anyone have any thoughts on this?
>
> Please forgive me for replying to my own message
here, but I'd
> like to share some information about some significant
progress I've
> just made. I managed to catch one of amavisd-new's SQL
transactions
> in MySQL's "show full processlist" output,
and it was executing an
> update query on the msgs table with a search field
(mail_id) that
> wasn't indexed. I added a new index on that field
thusly:
>
> CREATE INDEX msgs_mail_id ON msgs (mail_id);
>
On my setup mail_id is the primary key and MySQL
automatically adds a unique index for the primary key.
Adding another index for mail_id will not do anything bad
except:
- You added a redundant index
- You need more space on the database (for the redundant
index)
- Inserting data to the database will take more time
(maintaining indices is time intensive)
> ...and the average (eyeballed) query times have
dropped from 8-15
> seconds to less than 100ms, in some cases less than
20ms.
>
> I'm currently digging for the documentation I used
to create those
> tables to see if I missed the need for that index.
>
> Can anyone share any other tips here? What other
indices are
> people using to improve query performance on their
database-backed
> amavisd-new installations?
>
> -Dave
>
> --
> Dave McGuire
> Port Charlotte, FL
> Farewell Ophelia, 9/22/1991 - 7/25/2007
>
>
>
>
>
------------------------------------------------------------
-------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems?
Stop.
> Now Search log events and configuration files using
AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> AMaViS-user mailing list
> AMaViS-user lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/amavis-user
> AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
> AMaViS-HowTos:http://www.amavis.org/ho
wto/
--
Psssst! Schon vom neuen GMX MultiMessenger gehört?
Der kanns mit allen: http://www.gm
x.net/de/go/multimessenger
------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-user lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/
|
|
| Re: excessive time spent in sql-update? |
  United States |
2007-08-17 19:12:19 |
On Aug 17, 2007, at 7:01 PM, steeeeeveee gmx.net
wrote:
>> Please forgive me for replying to my own message
here, but I'd
>> like to share some information about some
significant progress I've
>> just made. I managed to catch one of amavisd-new's
SQL transactions
>> in MySQL's "show full processlist"
output, and it was executing an
>> update query on the msgs table with a search field
(mail_id) that
>> wasn't indexed. I added a new index on that field
thusly:
>>
>> CREATE INDEX msgs_mail_id ON msgs (mail_id);
>>
> On my setup mail_id is the primary key and MySQL
automatically adds
> a unique index for the primary key.
>
> Adding another index for mail_id will not do anything
bad except:
> - You added a redundant index
> - You need more space on the database (for the
redundant index)
> - Inserting data to the database will take more time
(maintaining
> indices is time intensive)
Yup, that makes sense to me.
A "show index ..." didn't list an index on that
field, which
suggests that it wasn't defined as a primary key in the
original
CREATE TABLE statement. I double-checked the CREATE TABLE
statements
that I used to set up this database, and sure enough, there
was no
primary key defined.
The flood has died down and my mail server is now keeping
up.
I've also tuned the MySQL server a bit (adjusted some buffer
sizes in
my.cnf) and reduced the query latency a bit further; most
queries are
now executing in less than 10ms.
-Dave
--
Dave McGuire
Port Charlotte, FL
Farewell Ophelia, 9/22/1991 - 7/25/2007
------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-user lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/
|
|
| Re: excessive time spent in sql-update? |
  Switzerland |
2007-08-17 19:23:13 |
-------- Original-Nachricht --------
Datum: Fri, 17 Aug 2007 20:12:19 -0400
Von: Dave McGuire <mcguire neurotica.com>
An: amavis-user lists.sourceforge.net
Betreff: Re: [AMaViS-user] excessive time spent in
sql-update?
> On Aug 17, 2007, at 7:01 PM, steeeeeveee gmx.net
wrote:
> >> Please forgive me for replying to my own
message here, but I'd
> >> like to share some information about some
significant progress I've
> >> just made. I managed to catch one of
amavisd-new's SQL transactions
> >> in MySQL's "show full processlist"
output, and it was executing an
> >> update query on the msgs table with a search
field (mail_id) that
> >> wasn't indexed. I added a new index on that
field thusly:
> >>
> >> CREATE INDEX msgs_mail_id ON msgs
(mail_id);
> >>
> > On my setup mail_id is the primary key and MySQL
automatically adds
> > a unique index for the primary key.
> >
> > Adding another index for mail_id will not do
anything bad except:
> > - You added a redundant index
> > - You need more space on the database (for the
redundant index)
> > - Inserting data to the database will take more
time (maintaining
> > indices is time intensive)
>
> Yup, that makes sense to me.
>
> A "show index ..." didn't list an index on
that field, which
> suggests that it wasn't defined as a primary key in the
original
> CREATE TABLE statement. I double-checked the CREATE
TABLE statements
> that I used to set up this database, and sure enough,
there was no
> primary key defined.
>
Strange. You should have there a primary key. Anway... can
you check h
ttp://www.ijs.si/software/amavisd/README.sql-mysql and
fix/adjust your table to match the one described in the
document (assuming you run version 2.5.2. If not then search
for the file on your local system). If speed is an issue for
you, then don't create the foreign keys since they slow down
MySQL when purging old data from the AMaViS database.
> The flood has died down and my mail server is now
keeping up.
> I've also tuned the MySQL server a bit (adjusted some
buffer sizes in
> my.cnf) and reduced the query latency a bit further;
most queries are
> now executing in less than 10ms.
>
> -Dave
>
> --
> Dave McGuire
> Port Charlotte, FL
> Farewell Ophelia, 9/22/1991 - 7/25/2007
>
>
>
>
>
------------------------------------------------------------
-------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems?
Stop.
> Now Search log events and configuration files using
AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> AMaViS-user mailing list
> AMaViS-user lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/amavis-user
> AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
> AMaViS-HowTos:http://www.amavis.org/ho
wto/
--
Psssst! Schon vom neuen GMX MultiMessenger gehört?
Der kanns mit allen: http://www.gm
x.net/de/go/multimessenger
------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-user lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/
|
|
| Re: excessive time spent in sql-update? |
  Slovenia |
2007-08-17 19:49:27 |
Giampaolo,
> I guess it's better you don't use the mail_id field as
a primary key: it
> may not be unique in your db. Think, in example, to
messages sent to
> destinations defined as a list or an alias in your
system.
Don't confuse mail_id with a Message-ID header.
mail_id is guaranteed by amavisd to be unique among all
records
currently present in a database.
amavisd-new-2.3.0 release notes:
- introduce a concept of 'mail_id', which is similar to the
am_id as reported
in the log and elsewhere (e.g. 92598-11-5), but has much
stronger long-term
uniqueness property and can be used for the purpose of
uniquely identifying
a quarantined mail, or for other uses. The mail_id is a
12-character string
consisting of characters [A-Za-z0-9+-] (like base64,
except for a '/' being
substituted by a '+'), guaranteed to start and to end with
an alphanumeric
character (i.e. not with '+' or '-'). It is derived by
cryptographically
strong method (MD5), cumulatively collecting entropy
during the life of
child processes, folding-in entropy from processed mail
and other cheaply
accessible sources, collected when an opportunity arises
(e.g. file system
file-IDs, SA results etc), without placing a burden on
system sources of
randomness (see RFC 4086). Note that MD5 has been
demonstrated to have some
weaknesses, but we are not talking about cryptographic
attacks here, but
rather about spreading messages which have no inherent
intention of causing
collisions.
The mail_id carries 71.9 bits of information (subject to
the quality of
sources). For a high-end system that keeps a year's worth
of mail messages
in evidence (e.g. in quarantine) and receives 10e6
messages each day
(20..200 TB of yearly mail contents), the probability of a
mail_id
collision happening during one year (while gradually
displacing an entire
collection with a new set of IDs) is n^2/m = 0.3 %
(10e6 * 365)^2 / (62 * (64^(12-2)) * 62) = (10e6 *
365)^2 / 2^71.9 = 0.003
Eventual clash is still possible and needs to be detected,
but a re-tried
mail delivery attempt is acceptable if its probability is
low, as each mail
processing rolls a dice again. On a smallish system
receiving 10.000 mail
messages daily an 8-character mail_id would suffice, but
the savings are
not worth the trouble of providing configuration
flexibility.
Paired with a mail_id there is its companion secret_id
generated for each
message, such that mail_id can be derived from secret_id
and pairing checked,
but not the other way around. The purpose of secret_id is
not yet fully
developed, but can serve as a 'ticket', granting user a
right to release
a quarantined message addressed to him.
Mark
------------------------------------------------------------
-------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and
a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
AMaViS-user mailing list
AMaViS-user lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amav
is.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/ho
wto/
|
|
[1-6]
|
|