Thread: musings on performance




musings on performance
United States
2007-02-14 12:51:57
Maybe maia-users isn't the right place to post this, but I just have
tremendous pressure building to somehow get the thoughts I've been
having available to somebody who might care, or especially might
point out what stupid mistake I've made.

First, let me list a few thoughts about how I used to do my own
personal spam filtering.

I first thought it might be worthwhile to do any spam filtering at
all when there was mention on slashdot of Paul Graham's ideas about
using Bayesian statistics.  My saved lynx bookmark says that was
August 16, 2002.  There was another slashdot article on November 3,
2002 about John Graham-Cumming's POPFile, which used Perl to run a
localhost POP server that filtered messages from the ISP mail server.
I guess for this audience we don't need a glossary to say what POP,
Perl and ISP mean???  I didn't use POP, I used a Unix mbox, so I never
used POPFile directly myself.  According to the slashdot bookmarks
around it, I first went to sourceforge for Eric S. Raymond's
bogofilter between November 22 and 24, 2002.  I have about 10
spam-filter-related bookmarks in that two-day period.

So I started using bogofilter in late 2002, and it was still my sole
filter as of May 2006.  It got about 2 or 3 false positives per year
and 2 or 3 false negatives per day over that interval, for a mail load
of a few hundred ham messages per day.  The spam rate went from about
60 per day in late 2002 to nearly 300 per day in August through
November 2004 (a peak 50% higher than almost any other month), stayed
over 100 a day until late 2005, and tapered down to about 80 per day
into 2006.  In May 2006 I installed spamassassin on my desktop Gentoo
Linux and modified my filter script to run SA on whatever bogofilter
considered ham, and that got the false negatives down to 4 or 5 per
week.  I thought that was pretty good.

By "my filter script" I mean I actually incorporated all messages
into my MH (Mail Handler, Marshall Rose's Rand Corporation mail
client system) inbox.  Then I would run a script that would run
bogofilter on each inbox message and refile the spam to a spam
sub-sub-folder (sub-folder covers a year, sub-sub-folder covers a
month.)  I would also manually refile whatever spams remained in the
inbox, and retrieve the occasional ham from the spam folder back to
the inbox.
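
The two-stage arrangement described above (a cheap filter first, the
expensive one only on what the cheap one calls ham) can be sketched
roughly as follows.  This is only an illustration, not the actual
script: the function name two_stage_filter and the two classifier
callables are hypothetical, and in real use they would be wrappers
around bogofilter and spamassassin, which are not shown here.

```python
from typing import Callable, Iterable, List, Tuple

# A classifier takes one raw message and returns True if it looks like spam.
Classifier = Callable[[str], bool]


def two_stage_filter(
    messages: Iterable[str], cheap: Classifier, expensive: Classifier
) -> Tuple[List[str], List[str]]:
    """Return (ham, spam); `expensive` runs only on what `cheap` calls ham."""
    ham: List[str] = []
    spam: List[str] = []
    for msg in messages:
        # `or` short-circuits, so the expensive check is skipped for
        # anything the cheap filter has already flagged as spam.
        if cheap(msg) or expensive(msg):
            spam.append(msg)
        else:
            ham.append(msg)
    return ham, spam
```

The saving comes entirely from the short-circuit: every message the
cheap filter catches is one the slow filter never has to look at.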

I noticed that when a message went to SA, SA took a lot longer to
analyze it than bogofilter did.  Maybe 2 or 3 seconds per message
instead of 2 or 3 messages per second.  This was on a 2 GHz AMD XP
3500.  Disk speed probably didn't matter much, but it is an IDE
drive, so that's relatively slow in modern disk technology.

And since I hadn't found Maia and all the terrific advice from Robert
and David, and others, I didn't know to use sa-learn daily to update
SA rules, or even what rules SA was using (it's just whatever gets
installed, and maybe updated if gentoo "emerge world" happens to
have a new SA with updated rules when I do that at 6-12 week
intervals.)  The SARE rules from openprotect seem to do a great job at
eliminating those last few false negatives.  And using Bayesian with
SA helps quite a bit, too.  Although, if SA didn't even see what the
bogofilter Bayesian could pre-filter, wow, lots of time would be saved
poring through rules that really didn't matter anyway.  Well, at
least that's what I'm thinking.

In August 2006 I started using the postfix-amavis-SA-clamav suite for
other users, and I knew they were going to take more resources than
the 700 MHz AMD K7 with 40 GB IDE drive computer I was using for that
e-mail system could provide.  I thought a 3+ GHz 64-bit Pentium with
SATA would be more than adequate, though.  And, without Maia, it was,
as slow as SA is.  However, I still hadn't discovered sa-learn or
openprotect and SARE, or how to make Bayesian and AWL do anything
intelligent, so the efficiency was still horrid.  It was doing
quarantining, but there were no utilities to show what was
quarantined for each user, or to allow users to release false
positives from the quarantine, or to report anything to spamcop,
spamhaus, or anywhere.  It did look like a good SA threshold for
most users was about 2.9, although, unlike bogofilter, that still
had way too high a rate of false positives.  But the default 5.0
had even more false negatives.  There wasn't really a performance
problem, though, on the new hardware.  I had considered getting
dual processors and using mirroring for reliability with two SATA
drives, but not striping for performance, but the system priced
out triple the one I chose.

Because of the quarantine problem, I began looking for what I'd call
a "quarantine manager."  And pretty much the only one I found at all
was Maia Mailguard.  That was about November 2006.  I looked for about
6 weeks before I found Maia.  And the way Maia is supposed to work
seems to me just about ideal, although the users still find it more
complex and confusing than they'd like.  But the problem is, what
happened to the speed?

Sigh.

It was snappy for awhile.  Even after a few thousand messages had
passed through, although it was getting a little sluggish, it still
brought up web pages in under 5 seconds, and the postfix queue was
never more than a handful of messages.  But by the time it had gone
through a few tens of thousands of messages, although the current
store was still under 2,000 messages, the web pages were taking
30 seconds and up.  Now that it's pushing 80,000 messages, hardly
any web page ever comes up in less than a minute, the logs are
getting TIMED OUT messages, the queue has frequently been dozens
of messages dating back an hour or more, and once was about 1,200
that were up to 36 hours since initial delivery.  That time I had
to manually remove files from the /var/spool/postfix/* directories,
it was just spiraling into oblivion.

When I run psql and do simple select statements, what should be
practically instant response is 30 seconds and up.  The "ms" entries
in the log are up to several minutes, very rarely only 10 seconds
or so.  The "top" page says postgres is in biowai, perl is in biowai.
When I use the web interface to confirm/report/whatever, if I restart
postgres first, it can be pretty snappy, until the cron perl scripts
kick in, then it's deadly slow again.  And restarting postgres just
creates a backlog of incoming messages to process, too, so it's not
really a good thing to do.  It seems to corrupt the data, too, making
entries in the spam/ham lists show up with nothing but "no subject"
and the welcome page count of spam disagrees with the actual list
(it says there are 35, but clicking on the report button just drops
back to the welcome page because it can't find any to list, for
instance.)  I used the rc.shutdown script to stop postgres, and
the rc.local one to start it back up, yet something bad happened,
it appears, though not really causing any problem more serious than
counts that disagree, as far as I know.

So, I'm thinking, there's something wrong here.  This computer should
be able to handle thousands of messages per day, but 950 or so is
just crippling it to the point where nobody would tolerate how slow
it responds.  Oh, "top" also says the CPU idle is 90% or more, though
the load is sometimes around 4.  I don't see how multi-processors
would make this better.  Maybe it's I/O bound.  But, then, "top" says
I have 400 MB of free RAM, and I've tried to raise some of the small
postgres server defaults so postgres will get smarter about caching
data in RAM instead of disk, but it just keeps getting slower.
There's very little more total I/O required than without Maia,
certainly not double.  Maybe using sa-learn to add new rules has
increased SA processing time somewhat, but going from a few seconds to
several minutes of wall clock time per message seems outrageous to me.
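
To put rough numbers on that, here is a back-of-the-envelope sketch.
The 950 messages per day and the "few seconds" figure come from the
measurements above; the 180 seconds standing in for "several minutes"
and the count of 2 parallel scanner processes are assumptions picked
purely for illustration, as are the function names.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def capacity_per_day(workers: int, seconds_per_message: float) -> float:
    """Messages per day that `workers` parallel scanners can finish."""
    return workers * SECONDS_PER_DAY / seconds_per_message


def utilization(arrivals_per_day: float, workers: int,
                seconds_per_message: float) -> float:
    """Fraction of scanning capacity consumed by the incoming mail."""
    return arrivals_per_day / capacity_per_day(workers, seconds_per_message)
```

With those assumed values, capacity_per_day(2, 3) is 57,600 messages
per day, while capacity_per_day(2, 180) is only 960, which is barely
above the 950 that arrive.  At that utilization any burst leaves a
queue that takes hours to drain, which matches the backlog seen in
the postfix queue.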

I've tried to read up on postgres tuning.  I did discover some things
like raising the default 1024 in work_mem (I'm trying 16384 currently)
and the default 16384 in maintenance_work_mem (131072.)  And the shm
and sem kernel sizes, and using ipcs to check what's happening there.
I enabled the stat lines in /var/postgresql/data/postgresql.conf but I
haven't seen much about how to see what the stats are.  I did find a
perl script named pgtop (not easily, since "i /pgtop/" in my cpan
doesn't find it at cpan.utah.edu, but it is at cpan.pair.com.)  But
that doesn't tell me much, either.  I think there are counts of
operations and maybe something about timing in there, but I have no
clue how to ferret it out.  In phpPgAdmin, I see only "public"
functions, but in psql, if I do "\df" I see lots of pg_catalog (that's
a schema) functions, including many that begin "pg_stat_" but I have
no clue how to use those.  Maybe if I study the pgtop source, there's
a clue there.  The postgresql manual seems murky, although it has
several pages about stats.
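
For what it's worth, the pg_stat_ functions are usually read through
the statistics views built on top of them rather than called directly.
The sketch below holds two queries against the standard views
pg_stat_user_tables and pg_statio_user_tables as plain strings (paste
them into psql, or run them through whatever database driver is
handy; no driver is assumed here), plus a small helper for turning the
block counters into a cache hit ratio.  The constant and function
names are made up for this example.

```python
# Which tables are being read by sequential scan instead of an index?
TABLE_SCANS_SQL = """
SELECT relname, seq_scan, seq_tup_read, idx_scan
  FROM pg_stat_user_tables
 ORDER BY seq_tup_read DESC;
"""

# Which tables cause the most reads that miss PostgreSQL's buffer cache?
TABLE_IO_SQL = """
SELECT relname, heap_blks_read, heap_blks_hit
  FROM pg_statio_user_tables
 ORDER BY heap_blks_read DESC;
"""


def hit_ratio(blks_read: int, blks_hit: int):
    """Fraction of block requests served from the buffer cache.

    Returns None when the table has not been touched at all.
    """
    total = blks_read + blks_hit
    return blks_hit / total if total else None
```

A table with a huge seq_tup_read and few index scans, or a low hit
ratio on a table that is queried constantly, would be a likely place
to start looking.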

I'm still using Berkeley DB for SA Bayesian and AWL, because on
the system where I tried moving them to postgres, the system
actually got *slower* (a *lot* *slower*) and I had to revert
to Berkeley DB so it would keep up with incoming messages.  Of
course, on that system, there's no free RAM and it's using as
much swap as RAM, so maybe the experience would be opposite
on the faster system.  I'm just plain afraid to try it!  Plain
old fear.

So I'm kind of stymied again.  This is not going to hack it for
production, though, that's obvious, and I'm thinking it's not because
I don't have a 4-way CPU and 4-way stripe on my disks.  You guys
are talking about heartbeat and DRBD for redundancy, and I'm seeing
horrendous resource consumption.  Why are we not only not on the
same page, we're not in the same book, maybe not the same library??

I think it would be faster to do the bogofilter first, SA only on
what bogofilter thinks is ham, though I'm not sure how that would
translate into scores that SA itemizes so well.  SA in C instead of
Perl might help by a factor of 10 or more.  The Maia scripts in
C instead of Perl might be somewhat faster.  The Maia web site
in C instead of PHP would be faster, but maybe that's dwarfed by
the rest of the slowness.  But somehow there's resource consumption
here that just is orders of magnitude worse than it should be.
Maybe I just haven't found the right postgres conf item to tweak,
but it shouldn't be this difficult to find how to make it faster,
and it just shouldn't be this slow anyway.

I think postgres is probably at the heart of this, though I may
be wrong, since I don't really have anything but "top" and how
long "select ..." in psql takes to judge postgres' performance.
And it could be that the perl scripts are dominating the I/O
system so much that postgres can't get its foot in the door
(although I'd think those scripts use postgres quite a lot, too.)

This reminds me of both Java and bittorrent.  I've tried both, a
little.  They're both way too slow to use for anything useful, in my
opinion, although I admit I haven't tested them all that thoroughly.
I do see that plenty of other people do share my opinion of Java,
however.  Yet lots of people swear by them.  Are those people just
finding some magic potion I never see, are they tolerant of abysmal
performance, am I particularly stupid, what's going on here?


_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users

Re: musings on performance
United States
2007-02-14 13:28:00
On Wed, 2007-02-14 at 11:51 -0700, Dale Carstensen wrote:

> I think postgres is probably at the heart of this, though I may
> be wrong, since I don't really have anything but "top" and how
> long "select ..." in psql takes to judge postgres' performance.

I think you are on the right track here. I had similar problems with
MySQL, and when I increased the InnoDB buffer size, performance
improved dramatically. I suspect Postgres may also be memory-starved.
How much RAM do you have on the machine running Postgres? If you use
"top" and hit the M key (to sort by amount of memory used), how much
RAM is Postgres using? I'd say you either don't have enough RAM on
your system, or Postgres isn't making use of what you do have. That is
certainly the problem I was having with MySQL.

--Greg



Re: musings on performance
Australia
2007-02-14 15:31:34
On Wed, 2007-02-14 at 11:51 -0700, Dale Carstensen wrote:
> Maybe maia-users isn't the right place to post this, but I just have
> tremendous pressure building to somehow get the thoughts I've been
> having available to somebody who might care, or especially might
> point out what stupid mistake I've made.
> 
<lots of stuff snipped>

Just as a data point, we have a small install processing between 5,000
and 10,000 messages a day. Maia runs on a single AMD Opteron 252
processor with a 2-drive SATA II mirror and 2 GB RAM. If it makes any
difference, the server runs Red Hat Enterprise Linux.

Load hovers at around 0.2 and occasionally peaks at about 3.5 when
there's a flurry of activity.

The separate MySQL database server is the same hardware with slightly
higher load.

Maia web pages render quickly regardless of the number of mails in
quarantine, and SA processing time ranges from 3,000 ms to 13,000 ms.

While I can't help you with postgres tuning, I can agree with previous
comments that database tuning, and in our case memory and key usage
tuning, made the most difference.

-- 
Karl Latiss <karl.latissatvert.com.au>
Atvert Systems

Re: musings on performance
United States
2007-02-14 21:19:13
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Feb 14, 2007, at 12:51 PM, Dale Carstensen wrote:
> I noticed that when a message went to SA, that SA took a lot longer
> to analyze it than bogofilter did.  Maybe 2 or 3 seconds instead of
> 2 or 3 messages per second.  This on a 2 GHz AMD XP 3500 -- probably
> disk speed didn't matter much, but it is an IDE, so that's relatively
> slow in modern disk technology.
>

It's possible that the way you were calling spamassassin was spawning
a new process for every message; there's a heavy amount of
initialization for spamassassin, and that's what makes amavisd-*
better - it only initializes once.
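
As a toy illustration of that difference (not SpamAssassin or amavisd
code; the Scanner class, its two "rules" and both function names are
invented for this sketch), compare building the scanner once per
message with building it once and reusing it.  The counter stands in
for the expensive start-up work of loading and compiling rules.

```python
class Scanner:
    """Stand-in for a spam scanner with an expensive start-up."""

    inits = 0  # counts how many times the start-up cost was paid

    def __init__(self) -> None:
        Scanner.inits += 1  # imagine rule files being parsed here
        self.rules = ["free money", "act now"]

    def is_spam(self, msg: str) -> bool:
        text = msg.lower()
        return any(rule in text for rule in self.rules)


def scan_spawning(messages):
    """One fresh scanner per message, like one process per message."""
    return [Scanner().is_spam(m) for m in messages]


def scan_persistent(messages):
    """One long-lived scanner, like a daemon that initializes once."""
    scanner = Scanner()
    return [scanner.is_spam(m) for m in messages]
```

Both functions return the same verdicts; the only difference is that
the first pays the start-up cost once per message and the second pays
it once in total.  SpamAssassin's own spamd daemon with the spamc
client follows the same pattern.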




> or so.  The "top" page says postgres is in
biowai, perl is in biowai.
> When I use the web interface to
confirm/report/whatever, if I restart
> postgres first, it can be pretty snappy, until the cron
perl scripts
> kick in, then it's deadly slow again.  And restarting
postgres just

This is indicating that the system is waiting on hard drives to
provide data.  The question then is why...

> creates a backlog of incoming messages to process, too, so it's not
> really a good thing to do.  It seems to corrupt the data, too, making
> entries in the spam/ham lists show up with nothing but "no subject"
> and the welcome page count of spam disagrees with the actual list
> (it says there are 35, but clicking on the report button just drops
> back to the welcome page because it can't find any to list, for
> instance.)  I used the rc.shutdown script to stop postgres, and
> the rc.local one to start it back up, yet something bad happened,
> it appears, though not really causing any problem more serious than
> counts that disagree, as far as I know.

Corruption isn't good, but it shouldn't be the cause of the problems,
just a symptom.  The maintenance scripts should clean it out.

You *are* running the process-quarantine and expire-quarantine-cache
scripts, right?

>
> So, I'm thinking, there's something wrong here.  This computer should
> be able to handle thousands of messages per day, but 950 or so is
> just crippling it to the point where nobody would tolerate how slow
> it responds.  Oh, "top" also says the CPU idle is 90% or more, though
> the load is sometimes around 4.  I don't see how multi-processors
> would make this better.  Maybe it's I/O bound.  But, then, "top" says
>

Yes, it looks IO bound


> I have 400 MB of free RAM, and I've tried to raise some of the small
> postgres server defaults so postgres will get smarter about caching
> data in RAM instead of disk, but it just keeps getting slower.
> There's very little more total I/O required than without Maia,
> certainly not double.  Maybe using sa-learn to add new rules has
> increased SA processing time somewhat, but going from a few seconds
> to several minutes of wall clock time per message seems outrageous
> to me.
>

sa-learn doesn't add rules; it trains the Bayes database (which the
process-quarantine script does for you).


> I've tried to read up on postgres tuning.  I did discover some things
> like raising the default 1024 in work_mem (I'm trying 16384 currently)
> and 16384 maintenance_work_mem (131072.)  And the shm and sem kernel
> sizes, and using ipcs to check what's happening there.  I enabled
> the stat lines in /var/postgresql/data/postgresql.conf but I haven't
Unfortunately, I know nothing about postgresql tuning.  I think in
general we've had slightly better reports under mysql, but we've also
had a handful of sluggish mysql servers too.
>
> I'm still using Berkeley DB for SA Bayesian and AWL, because on
> the system where I tried moving them to postgres, the system
> actually got *slower* (a *lot* *slower*) and I had to revert

Ack!  The Berkeley DB stuff is very slow, because it locks the whole
file and prevents parallel usage.  When using pgsql for Bayes, be
sure to use the PostgreSQL-specific Bayes storage engine, as it is
considerably faster than the generic one.

> to Berkeley DB so it would keep up with incoming messages.  Of
> course, on that system, there's no free RAM and it's using as
> much swap as RAM, so maybe the experience would be opposite
> on the faster system.  I'm just plain afraid to try it!  Plain
> old fear.

Heck, for kicks, try turning off bayes and awl in your local.cf
file...  just for comparison.


> translate into scores that SA itemizes so well.  SA in C instead of
> Perl might help by a factor of 10 or more.  The Maia scripts in
> C instead of Perl might be somewhat faster.  The Maia web site
> in C instead of PHP would be faster, but maybe that's dwarfed by
> the rest of the slowness.  But somehow there's resource consumption

No, this is just a myth; there isn't really going to be much if any
change by using C.  You're IO bound, not CPU bound.  Even if it were
a CPU problem, C wouldn't do much better...  perl is champ at regular
expressions, and SA uses a lot of them.


> here that just is orders of magnitude worse than it should be.
> Maybe I just haven't found the right postgres conf item to tweak,
> but it shouldn't be this difficult to find how to make it faster,
> and it just shouldn't be this slow anyway.
>
> I think postgres is probably at the heart of this, though I may
> be wrong, since I don't really have anything but "top" and how
> long "select ..." in psql takes to judge postgres' performance.
> And it could be that the perl scripts are dominating the I/O
> system so much that postgres can't get its foot in the door
> (although I'd think those scripts use postgres quite a lot, too.)
>

I'm leaning this direction too.  More RAM in the box may help.
Giving the database more RAM to work with might help.


> This reminds me of both Java and bittorrent.  I've tried both, a
> little.
> They're both way too slow to use for anything useful, in my opinion,

Java, I've had mixed results, but BitTorrent usually flies. ;)


Other details you haven't provided:  What version of Maia?  1.0.2 had
a few sql tweaks in the web pages to make it faster...


David Morton
Maia Mailguard http://www.maiamailguard.com
mortondadgrmm.net



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iD8DBQFF09E0Uy30ODPkzl0RAreMAJ4wVb6QppSyMKuA9k+W9glTWHB/bgCg
zXH3
1tDNuclGgYDgiZlgXVasah0=
=EsKu
-----END PGP SIGNATURE-----

Re: musings on performance
Canada
2007-02-14 22:10:06
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

A few general points to add to this discussion:

Maia is fundamentally a database application, not a mail app.  We
describe it as a mail filter, but in a functional sense it's all about
storing and manipulating data in a SQL database.  Amavisd-maia receives
mail and stores it in a database, after consulting the database for
everything from user preferences and whitelists/blacklists to Bayes and
AWL lookups and additions.  The Perl maintenance scripts process the
mail items stored in the database, submitting them for Bayes training
and/or reporting, expiring them as they become outdated, and producing
digests from that data.  The PHP scripts provide a web GUI to let
administrators and end-users view and manage the data stored in the
database.

There can be no question, then, that database performance is crucial to
a Maia installation.  If your database doesn't have the resources it
needs in order to perform well, performance across the board is going to
be poor.  Amavisd-maia will take longer to scan mail items.
SpamAssassin will time out waiting for Bayes reads and writes to
complete.  The Perl maintenance scripts will slow your system to a
crawl.  The Web GUI will take ages to load and refresh
database-generated pages.

If you really want to improve the performance of your Maia installation,
focus your efforts first on making the database happy.  It may need more
RAM, it may need more disks to stripe its data across, it may need more
processor cycles to work with--or it might simply need to be tuned
better to suit the resources you already have.

Lest this degenerate into a PostgreSQL vs. MySQL flamewar, though, let's
be clear that both of these databases can do the job for sites large and
small.  What matters, ultimately, is how knowledgeable you are about
ways to tune your database software of choice, in order to get the most
out of it.  You'll find that most of the people subscribed to this list
are MySQL users, and much has been said on the topic of MySQL tuning
over the years here.  The PostgreSQL users are fewer in number, here,
but a few (e.g. Marc Fournier, Alexandre Ghisoli, etc.) have spoken up
from time to time to offer some useful advice on the subject.  I'm not
trying to talk you into switching to MySQL, but I /will/ admit that your
choice of PostgreSQL limits the availability of useful troubleshooting
advice on this list.  You may need to look to PostgreSQL-specific lists
for more help with tuning questions.

- From a hardware standpoint, I agree with David that you seem to be I/O
bound, which is a common enough scenario with databases.  This is where
striping the data across several disks greatly improves your
performance, since you make more drive heads available for reading and
writing the data.  It's just like adding more cashiers at a supermarket,
in terms of the effect on shrinking long customer queues.  If you're
concerned about reliability as well as performance, consider something
like RAID 5, or RAID 1+0.
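
The cashier analogy can be turned into a deliberately idealized
calculation.  The sketch below assumes every request is a random read
that touches exactly one disk, that requests spread perfectly evenly
across the stripe, and that each one takes a fixed service time; the
10 ms used in the example and the function name are illustrative
assumptions, not measurements.  Real arrays (and RAID 5 writes in
particular) will not scale this cleanly.

```python
import math


def batch_time_ms(requests: int, disks: int, service_ms: float) -> float:
    """Time to finish `requests` random reads spread evenly over `disks`."""
    per_disk = math.ceil(requests / disks)  # longest queue at any one disk
    return per_disk * service_ms
```

Under those assumptions, 100 pending reads at 10 ms each take
batch_time_ms(100, 1, 10) = 1000 ms on a single disk and
batch_time_ms(100, 4, 10) = 250 ms on a four-disk stripe.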

More RAM can also help in a variety of ways.  The most obvious impact is
seen in terms of the processes that can remain in physical memory rather
than living in the swap space.  Apart from processes that are
effectively "sleeping", you really don't want anything to be "active" in
the swap space.  Using hard drives as surrogate memory is like driving
on a half-sized spare tire when you get a flat--it's an emergency
measure designed to get you to the nearest service station, nothing
more.  If swapping prevents an all-out system crash during an occasional
surge of system activity, it's served its purpose, but if it's doing
that sort of thing on a regular basis you need more physical RAM.

A less-obvious use of RAM, incidentally, is as an I/O buffer for DMA.
It's right there in the name--Direct Memory Access--but we rarely
consider the implications.  With DMA enabled for your hard drives and
CD/DVD drives and other peripherals, the system uses some of its RAM to
speed up the transfer of data to and from these devices, instead of
having to bother the CPU.  For instance, this lets you copy a 700 MB
file from one hard drive to another without locking up your CPU for five
minutes.  The unexpected downside, of course, is that several hundred MB
of your system RAM may be eaten up as a buffer during the transfer, and
that can starve some of your other processes of available RAM.  Having
more RAM than you strictly /need/ means that all of your file
copies/moves/deletes that rely on DMA happen faster, because there's
plenty of buffer memory available at all times.  This makes your system
feel faster and more responsive.

Another point that you'll see mentioned in a lot of the
performance-related threads in the list archives is that distributing
your Maia installation across several machines is a good way to harness
the power of older or less-capable hardware.  If you can't afford to buy
a real powerhouse server to run everything you need, consider two
servers half as powerful, or four servers a quarter as capable.  By
doing so, you're effectively subdividing the load across multiple
processors, more total RAM, and more hard drives--a "poor-man's cluster"
can be quite effective if it's laid out properly.  Put the database
server on the fastest machine you've got, and let that be its only job.
Choose another machine to be your dedicated web server.  Then you can
use a third machine as your Postfix/ClamAV/SpamAssassin/amavisd-maia
box, and clone that box as many times as you need to in order to achieve
the throughput levels you're after.

- --
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFF090eGmqOER2NHewRAsntAKCt+W5uKHyNlVaUEiRjJ8wmE9HVsACa
Axsn
Gc3oIlcD6pL2G8a5jpBqe14=
=ZMzO
-----END PGP SIGNATURE-----

Re: musings on performance
United States
2007-02-15 10:40:38
> Maia is fundamentally a database application, not a mail app.  We
> describe it as a mail filter, but in a functional sense it's all about
> storing and manipulating data in a SQL database.  Amavisd-maia receives
> mail and stores it in a database, after consulting the database for
> everything from user preferences and whitelists/blacklists to Bayes and
> AWL lookups and additions.

  Maia has very similar goals/requirements to dbmail, and I've often
wondered if maia could benefit from changing its message store to
dbmail format.  I think the real potential advantages would be in
integrating the two, but it may add considerable complexity to maia to
support both the current "pass through" filtering mode and the
"scanning plus final delivery" mode the integration would provide.
One solution may be having two amavisd-maia scripts, one specifically
for dbmail.  There may be other advantages, though, in performance (if
say maia could benefit from cached message headers).  Robert: if
interested and you have a few minutes, take a look at the dbmail
schema some time; the project would add some tools to the mix, if say
you wanted an imap server interface to quarantine management or
something.  (It looks like their websvn is broken right now, but you
could grab
http://www.dbmail.org/download/2.2/dbmail-2.2.2.tar.gz
and look under the "sql/" subdirectory.)


> You may need to look to PostgreSQL-specific lists
> for more help with tuning questions.

  Take a look at the dbmail mailing lists for some postgres tips;
there's been considerable discussion there at times.


-- 
Jesse Norell - jessekci.net
Kentec Communications, Inc.


Re: musings on performance
2007-02-20 02:12:23
On Wednesday, 14 February 2007 at 11:51 -0700, Dale Carstensen wrote:

> I've tried to read up on postgres tuning.  I did discover some things
> like raising the default 1024 in work_mem (I'm trying 16384 currently)
> and 16384 maintenance_work_mem (131072.)  And the shm and sem kernel
> sizes, and using ipcs to check what's happening there.  I enabled
> the stat lines in /var/postgresql/data/postgresql.conf but I haven't
> seen much about how to see what the stats are.  I did find a perl
> script named pgtop (not easily, since "i /pgtop/" in my cpan doesn't
> find it at cpan.utah.edu, but it is at cpan.pair.com.)  But that
> doesn't tell me much, either.  I think there are counts of operations
> and maybe something about timing in there, but I have no clue how to
> ferret it out.  In phpPgAdmin, I see only "public" functions, but in
> psql, if I do "\df" I see lots of pg_catalog (that's a schema)
> functions, including many that begin "pg_stat_" but I have no clue
> how to use those.  Maybe if I study the pgtop source, there's a clue
> there.  The postgresql manual seems murky, although it has several
> pages about stats.

PostgreSQL is a DB system that needs to be tuned to fit your hardware.
From what I read, your I/O seems to be the limiting factor, so try
disabling the flushing of data to disk.

Edit your postgresql.conf, uncomment the fsync line, and set it to off:
fsync = off
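
To make concrete what that setting trades away, here is a small
sketch of the underlying operating-system call.  It is not PostgreSQL
code; the function names are invented for the example.  With fsync
on, the server waits for the operating system to confirm that data
has reached the disk, much like write_durably below.  With fsync off
it behaves more like write_fast, which is quicker on an I/O-bound
machine, but the PostgreSQL documentation warns that an operating
system crash or power failure can then leave the database
unrecoverably corrupted.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> None:
    """Write data and wait until the OS reports it is on stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to flush its cache to disk


def write_fast(path: str, data: bytes) -> None:
    """Write data but let the OS flush it to disk whenever it likes."""
    with open(path, "wb") as f:
        f.write(data)


def demo() -> bool:
    """Both writers store the same bytes; only the durability differs."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "durable")
        b = os.path.join(d, "fast")
        write_durably(a, b"quarantine row")
        write_fast(b, b"quarantine row")
        with open(a, "rb") as fa, open(b, "rb") as fb:
            return fa.read() == fb.read()
```

While the machine stays up the two are indistinguishable, which is
why the risk only shows after a crash.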

Hope this helps.

-- 
        Alexandre
