
Thread: Re: Postgresql Bayes problem




Re: Postgresql Bayes problem
Canada
2007-08-10 15:20:57

Marc G. Fournier wrote:
>
> ... I ran mysql at the start, figuring that it would be 'faster', since it was
> mostly READ operations ...

That may have been true a few years ago with amavisd-new, before SQL
quarantining became available and commonplace, and before SpamAssassin
introduced an SQL engine for its Bayes and AWL databases.  These days an
app like Maia still does a bit more reading than writing, but it's not
as one-sided as it used to be.

A consequence of this is that out-of-the-box configs for databases often
strongly favour read-only applications, on the theory that they'll be
used mainly for lookups and manually updated on occasion by a human
maintainer.  Obviously such a configuration will not fare very well with
an application like Maia that does a lot of frequent automated writes in
addition to the usual lookups.  Understanding this and making the
necessary configuration tweaks for your database software (MySQL or
PostgreSQL) is key to achieving good performance with Maia.


> it wasn't the speed that burnt me, it was repeatedly
> having to rebuild the mysql database after a server crash, and losing all of my
> history, and having to re-train the thing ...

No matter what database software you're using, you're tempting fate if
you don't have some sort of backup mechanism for your data.  Whether you
use the database's native replication tools to do one-way replication to
a backup server, or do filesystem mirroring with something like DRBD, or
use third-party backup software, or even just periodically rsync a copy
of the tablespace files to another location, it's a part of life.  If
your database is worth "rebuilding", it's worth backing up.


> When I say 'unstable', I mean data corruption ...

Yes, many of us have encountered corruption errors with MySQL at one
time or another.  This was more common with older versions of MySQL,
which were a lot buggier than the 4.x and 5.x series.  These days when
it happens it seems to be caused by hardware issues like faulty RAM,
disk failures, and power outages, rather than by bugs in MySQL itself.

Recovery in most cases is automatic and straightforward, with the
uncompleted transactions either rolled back or completed from the
transaction logs, if MySQL got a chance to write those before it got
shut down.

More serious corruption problems require more manual intervention to fix
things, and this is where "knowing your database" makes the difference
between a five-minute fix and five hours of banging your head against a
desk.  It's a bit like having to change a flat tire--something that
every driver should be taught how to do, but most experience for the
first time on a dark night in the middle of nowhere, trying to read
instructions on the side of a jack in fine print without a flashlight.

Most of the people who pick up MySQL or PostgreSQL aren't trained as
database administrators, and they don't familiarize themselves with the
"emergency procedures" documents until a crisis strikes.  Their opinion
of the database software is sharply influenced by how "painless" the
emergency recovery process goes--if it was intuitive and the
step-by-step docs were easy to follow, they may well shrug off the
incident, but if they had to chase down answers in a dozen forums and
blogs and technical documents in a panic they may very well emerge from
the incident with a bitter dislike for the database software.

I'll be the first to admit that MySQL is weak in this area--the
documentation is organized as a reference manual rather than as a
tutorial, and that discourages freshman admins from getting as familiar
with the software as they should.  There's no conveniently-organized "in
case of emergency, break glass" walk-through, which would be very handy
for panic-stricken newbies at 2am when they've got no one to call.  The
information is all there, it's just not organized in a way that's easy
to find and use.  As a result, most MySQL admins never configure their
databases beyond the defaults, and those that do rely on "recipes"
suggested by other users on blogs and forums, rather than understanding
what exactly they're tweaking and why.  Third-party guides and books are
definitely recommended to fill this void.


> I've had some clients running mysql finding tables that just ... disappear ...

I've never heard of that sort of thing happening, but I'm willing to
accept that it may have happened with the older, buggier versions (3.x).
Apart from that, I can only think of education issues--an inexperienced
admin who converts tables from MyISAM to InnoDB might wonder where his
table files went, since the data would have been rolled into a big
'ibdata' tablespace file rather than one file per table (unless the
InnoDB file-per-table option, innodb_file_per_table, was enabled).

--
Robert LeBlanc <rjlrenaissoft.com>
Renaissoft, Inc.
Maia Mailguard <http://www.maiamailguard.com/>

_______________________________________________
Maia-users mailing list
Maia-usersrenaissoft.com
http://www.renaissoft.com/mailman/listinfo/maia-users

Re: Postgresql Bayes problem
Switzerland
2007-08-13 02:55:06
On Friday, 10 August 2007 at 13:20 -0700, Robert LeBlanc wrote:

> > When I say 'unstable', I mean data corruption ...
> 
> Yes, many of us have encountered corruption errors with MySQL at one
> time or another.  This was more common with older versions of MySQL,
> which were a lot buggier than the 4.x and 5.x series.  These days when
> it happens it seems to be caused by hardware issues like faulty RAM,
> disk failures, and power outages, rather than by bugs in MySQL itself.


Since older MySQL releases were not quite "Enterprise Ready", they have
worked hard to make it more robust and scalable.

For example, MySQL has totally abandoned its own DB engine in order to
integrate others, like BerkeleyDB and InnoDB.

But while working on new features, you will introduce bugs.

Read this recent story, in which they say they will make the Enterprise
version harder to get, and admit that some of their releases "[weren't]
as robust as we thought, and created some instabilities."

http://linux.slashdot.org/article.pl?sid=07/08/09/2047231


> Most of the people who pick up MySQL or PostgreSQL aren't trained as
> database administrators, and they don't familiarize themselves with the
> "emergency procedures" documents until a crisis strikes.  Their opinion
> of the database software is sharply influenced by how "painless" the
> emergency recovery process goes--if it was intuitive and the
> step-by-step docs were easy to follow, they may well shrug off the
> incident, but if they had to chase down answers in a dozen forums and
> blogs and technical documents in a panic they may very well emerge from
> the incident with a bitter dislike for the database software.

That's the most relevant point, far beyond any MySQL vs ... war.
Any real database use needs a trained DBA who can tune both the SQL
queries and the DB itself.

That said, until MySQL makes real changes, and as far as I know, I will
keep using PostgreSQL for really heavy-load production sites (I've
gotten far more support from those guys, the code is much more
SQL-compliant, and performance under load scales more linearly).

Regards


-- 
        Alexandre
