NYCPHP Meetup

NYPHP.org

[nycphp-talk] data modelling vs. db design (was: ER Diagram tool for MySQL/OS X)

Dave Callaghan dcallaghan at linuxmail.org
Tue Oct 4 09:30:51 EDT 2005


> Can anyone suggest reasons why it's better to have fewer tables, or to
> try and combine different types of objects into the same table?  Surely
> it's something to do with better performance (fewer joins, simpler
> queries), right?  Is this a common principle of design that I haven't
> learned yet?

Big tables (may) have big full table scans while multiple small tables (may) impact IO.

There is an impact with regard to full table scans depending on how many blocks your max row takes up. Of course, this will depend on how the columns are defined. Its easy to imagine a table with three column of BLOBS taking longer to scan then a table with 20 columns of single bytes.

There is also some simple math to estimate the impact of splitting the table out into multiple tables. You'll have 1+x primary keys (K) to maintain instead of just one. The height of your index (H) will also have an impact on IO. So, for every row retrieved, you can compare your baseline IO for a single table to K(H+1). Of course, this assumes that you are making this join frequently.

So, if you know that to actually describe the entity you will always need x attributes, and you split these x attributes into n tables where n > 1, you will likely suffer an unnecessary performance penalty. However, if to typically describe an entity you just need a subset of x (for example, employee contact info) and there is another subset y that is rarely accessed and would make for a large table scan (employee image), I might consider splitting them up.

Dave Callaghan
programmer
dcallaghan at linuxmail.org 

-- 
______________________________________________
Check out the latest SMS services @ http://www.linuxmail.org
This allows you to send and receive SMS through your mailbox.

Powered by Outblaze



More information about the talk mailing list