NYCPHP Meetup

NYPHP.org

[nycphp-talk] data modelling vs. db design (was: ER Diagram tool for MySQL/OS X)

inforequest 1j0lkq002 at sneakemail.com
Mon Oct 10 23:11:30 EDT 2005


Hans Zaunere lists-at-zaunere.com |nyphp dev/internal group use| wrote:

> Normalization is often one of the most misunderstood aspects of relational
>
>databases.  Of course, it's what makes a relational database relational, and
>we're all taught that you're not worth a cent unless you only work in the
>5th normalized form.
>
>Well, perhaps if you're a mathematician, then that's correct.  The reality
>is that over-normalization can kill a project just as much as
>under-normalization.  Don't forget - at least from the old SQL books I
>learned from - why normalization was being preached.  "Don't have duplicate
>data!" was often the mantra.  If you had more than one guy named "John",
>then you need to have a table of first names, with a lookup table relating
>first names with the other data you're storing.  Seems rediculous, right?
>
>The practice of heavy normalization was based on some old, bygone, concerns,
>namely "storage is expensive."  Add in academic chest-beating, and pretty
>soon you have a table for every letter of the alphabet and look-up tables to
>match.
>
>As I think we generally find, with the abundance of hardware resources,
>normalization is reduced to only an element of good design.  Sure,
>performance can still be a factor, but let's consider the performance of an
>entire system, and not just what's happening in the database.
>
>Not considering the overhead involved in joining tables (and thus resolving
>disparate blocks across the drive(s)), there's overhead in the application,
>and in a big way.  First is the performance of dealing with duplicate rows -
>you know, the throwing out of wasted rows from a one-many relationships.
>That not only incurs network traffic - which is often the slowest part of a
>modern internet system - it also incurs processing and memory waste on the
>application server.
>
>And what about the overhead of development itself?  Large complex queries,
>sub-queries, joins, and SQL query management in any language can make for
>unmaintable and tedious code.
>
>So it's a balancing act - as always.  The degree of normalization is
>generally more of an art than a science, and the "correct" amount is
>different for every project, and for every database architect.  Get three
>database architects, and you'll get four recommendations for the correct
>schema.
>
>
>---
>Hans Zaunere / President / New York PHP
>   www.nyphp.org  /  www.nyphp.com
>

Yeah. What he said.


-=john andrews
http://www.seo-fun.com




More information about the talk mailing list