[nycphp-talk] table structure for "friend" relationships

Paul A Houle paul at devonianfarm.com
Fri Jul 31 11:14:41 EDT 2009


Hans Zaunere wrote:
> In many modern web applications, relational pedantry has been fading.
> Denormalization is often a good thing, and very large sites are in fact
> highly denormalized.  This is due to the type of data they need to store
> (friend relationships is a classic one), and the level of throughput they
> need to achieve.
>   
    I think about 90% of people who scream "down with normalization" are 
the same people who were creating Microsoft Access databases 10 years 
ago with three phone number columns rather than creating a separate 
table for phone numbers.  (I spent about 6 months fixing those kinds of 
apps,  and that's enough for me...)

    In the "social media" sorts of applications,  I think it's wise to 
pursue an 'eventually consistent' strategy.  If you don't think about 
consistency at all,  the day will come when your database breaks 
and you find it's a pile of spaghetti that doesn't make any 
sense.  One strategy is to have a normalized 'core' and then denormalize 
the data to make views that are very fast.  I'm doing a lot of semantic 
web stuff these days and I'm finding that I need to create materialized 
views all the time.
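
    As a rough sketch of the "normalized core plus fast denormalized view" 
idea applied to friend relationships:  the table and function names below 
(users, friendships, friend_list, refresh_friend_list) are made up for 
illustration,  and SQLite is used only because it ships with Python.  A 
real site would more likely refresh the read table incrementally or from 
a queue than rebuild it wholesale.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a normalized core and a read table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

        -- Normalized core: one row per friendship, smaller id stored first.
        CREATE TABLE friendships (
            user_a INTEGER NOT NULL REFERENCES users(id),
            user_b INTEGER NOT NULL REFERENCES users(id),
            PRIMARY KEY (user_a, user_b),
            CHECK (user_a < user_b)
        );

        -- Denormalized read table: one row per direction, name copied in.
        CREATE TABLE friend_list (
            user_id     INTEGER NOT NULL,
            friend_id   INTEGER NOT NULL,
            friend_name TEXT NOT NULL,
            PRIMARY KEY (user_id, friend_id)
        );
    """)
    return conn


def add_friendship(conn, x, y):
    """Record a friendship in the normalized core only."""
    if x == y:
        raise ValueError("a user cannot befriend themselves")
    a, b = sorted((x, y))
    conn.execute("INSERT OR IGNORE INTO friendships VALUES (?, ?)", (a, b))


def refresh_friend_list(conn):
    """Rebuild the read table from the core (the 'eventually' in eventually consistent)."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM friend_list")
        conn.execute("""
            INSERT INTO friend_list (user_id, friend_id, friend_name)
            SELECT f.user_a, f.user_b, u.name
              FROM friendships f JOIN users u ON u.id = f.user_b
            UNION ALL
            SELECT f.user_b, f.user_a, u.name
              FROM friendships f JOIN users u ON u.id = f.user_a
        """)


def friends_of(conn, user_id):
    """Fast path: a single-table lookup with no joins."""
    rows = conn.execute(
        "SELECT friend_name FROM friend_list WHERE user_id = ? ORDER BY friend_name",
        (user_id,),
    )
    return [row[0] for row in rows]
```

    Writes only ever touch friendships,  so the read table can always be 
thrown away and rebuilt from the core if it drifts out of sync.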

    Myself,  I'm feeling torn between wanting schema flexibility and 
wanting to have richer schemas:  I want more data types and data 
dictionaries that record more about what data in the database means:  
with a rich schema,  many parts of your apps can "write themselves."
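
    To illustrate what "write themselves" might look like,  here is a small 
sketch in which one data dictionary drives both the table definition and 
the input validation.  The dictionary layout (PHONE_NUMBERS,  with its 
"label",  "required" and "choices" keys) is an assumption invented for 
this example,  not the format of any particular tool.

```python
# Hypothetical data dictionary: records what each column means, not just its type.
PHONE_NUMBERS = {
    "table": "phone_numbers",
    "columns": [
        {"name": "person_id", "type": int, "label": "Person", "required": True},
        {"name": "kind", "type": str, "label": "Kind", "required": True,
         "choices": ["home", "work", "mobile"]},
        {"name": "number", "type": str, "label": "Number", "required": True},
    ],
}

SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


def create_table_sql(spec):
    """Derive a CREATE TABLE statement from the data dictionary."""
    parts = []
    for col in spec["columns"]:
        not_null = " NOT NULL" if col.get("required") else ""
        parts.append(f'{col["name"]} {SQL_TYPES[col["type"]]}{not_null}')
    return f'CREATE TABLE {spec["table"]} ({", ".join(parts)})'


def validate(spec, row):
    """Derive input validation from the same dictionary; returns error messages."""
    errors = []
    for col in spec["columns"]:
        value = row.get(col["name"])
        if value is None:
            if col.get("required"):
                errors.append(f'{col["label"]} is required')
            continue
        if not isinstance(value, col["type"]):
            errors.append(f'{col["label"]} must be of type {col["type"].__name__}')
        elif "choices" in col and value not in col["choices"]:
            errors.append(f'{col["label"]} must be one of {", ".join(col["choices"])}')
    return errors
```

    The same dictionary could just as well drive form rendering or report 
headers;  the richer the schema,  the more of that can be generated.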

     I've also got a lot of fear that "schemaless" systems are going to 
be "futureless" systems.  Look at the sad story of Java serialization 
and object databases:  once an object gets persisted into a database,  
you can't make changes to it as easily as you can make changes to an 
ordinary object that lives in RAM.  Object database vendors haven't come 
up with a decent story for how to migrate databases over the long term.  
On the other hand,  I've seen relational databases in business 
applications that have survived 5, 10,  even 25 years of changing 
business requirements.  Relational databases seem to have struck about 
the right balance between making the schema easy and hard to change.

----------

     My dream database is something like the CycL system created for the 
Cyc project,  though I'd add full RDF capability and dump most of the 
Cyc ontology on the floor.  Add SPARQL and relational-mapped views... 


