NYCPHP Meetup

Wed Oct 7 14:23:23 EDT 2009

On Tue, Oct 6, 2009 at 2:27 PM, tedd <tedd at sperling.com> wrote:
> At 2:06 PM -0400 10/6/09, Dan Cech wrote:
>>
>> tedd wrote:
>>  > Dan:
>>>
>>>  That's a good point, but I still like having a "separate from data"
>>>  auto_increment index for all records. Besides adding one field to a
>>>  record, can you see any harm in that?
>>>
>>>  Cheers,
>>>
>>>  tedd
>>
>> You also end up maintaining separate indexes to support the lookups you
>> actually want to do:
>>
>> Say you want to be able to do lookups in both directions, like:
>>
>> SELECT account_id FROM some_rel WHERE user_id=X
>>
>> and
>>
>> SELECT user_id FROM some_rel WHERE account_id=Y
>>
>> The most efficient system for doing this is:
>>
>> CREATE TABLE some_rel (
>>  user_id int(11) not null,
>>  account_id int(11) not null,
>>  PRIMARY KEY (user_id,account_id),
>>  KEY account_id (account_id)
>> );
>>
>> The primary key is used for lookups by user_id, and the second key for
>> lookups via account_id.
>>
>> If you add an auto_increment column, you then have to do:
>>
>> CREATE TABLE some_rel (
>>  some_rel_id int(11) not null auto_increment,
>>  user_id int(11) not null,
>>  account_id int(11) not null,
>>  PRIMARY KEY (some_rel_id),
>>  UNIQUE KEY user_id (user_id,account_id),
>>  KEY account_id (account_id)
>> );
>>
>> You need a unique index on user_id,account_id to enforce the
>> relationship rule.
>>
>> The account_id lookup cannot leverage the primary key, so you either
>> have to add the user_id field to that index or accept that it's going to
>> have to read the primary key and look up the user_ids from the table
>> itself.
>>
>> Dan
>
> Dan:
>
> Interesting consideration.
>
> As I said, I typically have a "separate from data" id that pertains to the
> individual record. A many:many record may not be unique, but may include a
> date field or other such distinguishing character. As such, I usually like
> getting/setting such records by their ids rather than using multiple fields.
> Of course, you get the id by using multiple fields, but then dealing with
> the record itself, you have a unique number and I find that simpler.
>
> Now does that led to a faster database lookup? Probably not, but that may be
> the price I pay for keeping things simple for me.
>
> Do you see any way for me to be happy and my db queries fast?
>
> Cheers,
>
> tedd
>
> --
> -------
> http://sperling.com  http://ancientstones.com  http://earthstones.com
> _______________________________________________
> New York PHP Users Group Community Talk Mailing List
> http://lists.nyphp.org/mailman/listinfo/talk
>
> http://www.nyphp.org/Show-Participation
>

I honestly wonder how significant the hit is between having an auto
increment primary key versus a composite primary key over two
integers...

I was always taught the primary key is something unique to each row,
and specific to that row (like a unique name, SSN, product_id, etc.).
The idea of auto increment came about when everyone was freaking out
about linux only supporting files up to 2GBs and mysql table data
going into one file. This was also a big deal back when RAM cost more
than cocaine.

Nowadays I am not so sure this approach is practical, and believe it
is simply ingrained as we were all taught to think this way.

OTOH most object/document databases generate an internal identifier
for scale, sharding and whatnot (like MongoDB). So there are valid
reasons for going the auto increment route, but I do believe too many
people just automatically design every table to rely upon it.

-- Mitch

NYCPHP Meetup

NYPHP.org

[nycphp-talk] Database, table, and column naming schemes