|
List Info
Thread: Re: Why do we avoid auto-incrementing db columns?
|
|
| Re: Why do we avoid auto-incrementing
db columns? |
  United States |
2007-04-25 12:59:45 |
Hi Sammy,
There's only on thing I disagree on; the performance hit of
the mutex lock.
If you do not have a relatively large number of insertions,
than this is not
a large performance hit; however, if you do, this does
become a performance
hit. I admit, I'm on the extreme end of the spectrum as I
was inserting some
20 million nodes (which, expanded means 20 million nodes, 20
million
node_revisions, and 7-14 million other insertions in diff.
tables for the
application).
After implementing random-id generation with out locking the
table ever at
all, the performance increase was very noticeable, on the
order to 70-100%
more throughput of node creation.
I would think removing the bottle neck though would benefit
sites that
probably have a lot of comments being added and content, as
there is no
contention for the sequences table anymore.
-----Original Message-----
From: development-bounces drupal.org
[mailto:development-bounces drupal.org]
On Behalf Of Sammy Spets
Sent: Wednesday, April 25, 2007 12:54 PM
To: development drupal.org
Subject: Re: [development] Why do we avoid auto-incrementing
db columns?
Hi everyone,
I'm PostgreSQL maintainer for Drupal and since people are up
in the air
about what is going on down in the DB layer re autoincrement
and serial
fields I figured i'd shed some light. Many people have
already stated
this stuff in earlier mails and hopefully getting the info
in one big
pile will make it more concrete.
PostgreSQL does implement atomic sequencing with the serial
and
bigserial types. MySQL does the same with the auto_increment
field
property. These are used in Drupal to produce a guaranteed
unique ID
for tables using them. Implementing this in PHP will
certainly generate
non-unique IDs unless you lock tables thus introducing a
massive
performance penalty.
We use db_next_id() in Drupal to provide a db-independent
method of
getting the next unique ID into the application for use just
after the
insert operation on the DB. The performance hit of using a
sequence is
next to nothing considering all that's involved is a mutex
lock, read,
increment and mutex unlock. Making this more efficient is
simply not
possible.
Sure there are two different use cases for insertions: those
where you
need the ID afterwards (e.g node) and those you don't (e.g
watchdog).
If you want to write code without the use of db_next_id()
then do an
INSERT and omit the id column. I don't see anyone wanting to
spend
effort patching Drupal core to gain 0.01% efficiency across
the
application.
/me turns his torch off
--
Sammy Spets
Synerger Pty Ltd
http://synerger.com
On 25-Apr-07 10:15, Barry Jaspan wrote:
>
> Could someone either clue me in or point to a prior
discussion about
> why Drupal avoids using auto-incrementing database
columns? If it is
> database compatibility, which versions of which
databases that we
> support do not have it?
>
> Thanks,
>
> Barry
>
|
|
| Re: Why do we avoid auto-incrementing
db columns? |
  Germany |
2007-04-25 12:11:36 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Earnest Berry III schrieb:
> Hi Sammy,
>
> There's only on thing I disagree on; the performance
hit of the mutex lock.
> If you do not have a relatively large number of
insertions, than this is not
> a large performance hit; however, if you do, this does
become a performance
> hit. I admit, I'm on the extreme end of the spectrum as
I was inserting some
> 20 million nodes (which, expanded means 20 million
nodes, 20 million
> node_revisions, and 7-14 million other insertions in
diff. tables for the
> application).
>
> After implementing random-id generation with out
locking the table ever at
> all, the performance increase was very noticeable, on
the order to 70-100%
> more throughput of node creation.
http://drupal.org/node/5
5516
You'll find this interesting.
Cheers,
Gerhard
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFGL4vHfg6TFvELooQRAjpKAKCBPK+yW2CfpFGdTxmYyrFJdoL8egCf
enj8
vR8BKlJxyTdTyb0Aip1T1AU=
=HooF
-----END PGP SIGNATURE-----
|
|
| Re: Why do we avoid auto-incrementing
db columns? |

|
2007-04-25 13:58:54 |
|
Stored procedures would be nice to have in some cases, but remember that many people use drupal for small sites hosted on shared servers.
As far as I know you can't create mysql stored procedures via PHP (try that in phpmyadmin), so unless you have access to the database server via ssh or some other non-php interface, you won't be able to create any sp's.
Florian
|
| Re: Why do we avoid auto-incrementing
db columns? |
  Australia |
2007-04-30 01:25:02 |
Earnest,
Insertion of 20 million nodes in a short space of time
implies 10
million users are inserting 2 nodes each in a short space of
time. It
also implies many other combinations. Looking at the
plausibility that
this situation become true.... pretty much no chance you'll
get that
many insertions in the same table at the same time except
for the
watchdog table.
Sure, you've found a way to improve performance for an
unrealistic
boundary case and that's fantastic. Thanks for working it
out!
How does your technique of randomly allocating IDs avoid
duplicate ID
allocation?
Cheers,
--
Sammy Spets
Synerger Pty Ltd
http://synerger.com
On 25-Apr-07 13:59, Earnest Berry III wrote:
> Hi Sammy,
>
> There's only on thing I disagree on; the performance
hit of the mutex lock.
> If you do not have a relatively large number of
insertions, than this is not
> a large performance hit; however, if you do, this does
become a performance
> hit. I admit, I'm on the extreme end of the spectrum as
I was inserting some
> 20 million nodes (which, expanded means 20 million
nodes, 20 million
> node_revisions, and 7-14 million other insertions in
diff. tables for the
> application).
>
> After implementing random-id generation with out
locking the table ever at
> all, the performance increase was very noticeable, on
the order to 70-100%
> more throughput of node creation.
>
> I would think removing the bottle neck though would
benefit sites that
> probably have a lot of comments being added and
content, as there is no
> contention for the sequences table anymore.
>
>
>
> -----Original Message-----
> From: development-bounces drupal.org
[mailto:development-bounces drupal.org]
> On Behalf Of Sammy Spets
> Sent: Wednesday, April 25, 2007 12:54 PM
> To: development drupal.org
> Subject: Re: [development] Why do we avoid
auto-incrementing db columns?
>
> Hi everyone,
>
> I'm PostgreSQL maintainer for Drupal and since people
are up in the air
> about what is going on down in the DB layer re
autoincrement and serial
> fields I figured i'd shed some light. Many people have
already stated
> this stuff in earlier mails and hopefully getting the
info in one big
> pile will make it more concrete.
>
> PostgreSQL does implement atomic sequencing with the
serial and
> bigserial types. MySQL does the same with the
auto_increment field
> property. These are used in Drupal to produce a
guaranteed unique ID
> for tables using them. Implementing this in PHP will
certainly generate
> non-unique IDs unless you lock tables thus introducing
a massive
> performance penalty.
>
> We use db_next_id() in Drupal to provide a
db-independent method of
> getting the next unique ID into the application for use
just after the
> insert operation on the DB. The performance hit of
using a sequence is
> next to nothing considering all that's involved is a
mutex lock, read,
> increment and mutex unlock. Making this more efficient
is simply not
> possible.
>
> Sure there are two different use cases for insertions:
those where you
> need the ID afterwards (e.g node) and those you don't
(e.g watchdog).
> If you want to write code without the use of
db_next_id() then do an
> INSERT and omit the id column. I don't see anyone
wanting to spend
> effort patching Drupal core to gain 0.01% efficiency
across the
> application.
>
> /me turns his torch off
>
> --
> Sammy Spets
> Synerger Pty Ltd
> http://synerger.com
>
> On 25-Apr-07 10:15, Barry Jaspan wrote:
> >
> > Could someone either clue me in or point to a
prior discussion about
> > why Drupal avoids using auto-incrementing database
columns? If it is
> > database compatibility, which versions of which
databases that we
> > support do not have it?
> >
> > Thanks,
> >
> > Barry
> >
>
|
|
[1-4]
|
|