XML vs. rel DBs [was: Re: [nycphp-talk] Many pages: one script]

Elliotte Harold elharo at metalab.unc.edu
Sat Aug 11 23:07:02 EDT 2007


csnyder wrote:

> If it's not too much trouble, could you give us some other use cases
> for an XML database? Because title and first paragraph, if that's
> something a system "routinely does" could easily be stored as
> relational data at the time of import.
> 

Storing books, web pages, and the like in a relational database has only 
two basic approaches: make it a blob or cut it into tiny little pieces. 
The first eliminates search capabilities; the second performs like a dog.

You're right that if grabbing the title and the first paragraph is all 
you need to do, then these two pieces could be stored separately (and 
normalization be damned). Now suppose the editor comes along and tells
you they really want to show the first two paragraphs of text to 
non-logged-in users instead of just the first one. If you know the
queries in advance, you can lay out the data to optimize them, but using
the data's own structure will give you much more flexibility for 
unexpected uses.
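
To make the "first paragraph becomes first two paragraphs" point concrete, here is a minimal sketch. It uses Python's standard xml.etree.ElementTree as a stand-in for a real XML database, and the sample document, the teaser() function, and its paragraphs parameter are all made up for illustration. The point is only that the editor's change becomes a different argument rather than a schema change.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; a real system would load stored XML.
DOC = """<html>
  <head><title>Some data wants to live in tables</title></head>
  <body>
    <p>First paragraph.</p>
    <div><p>Second paragraph, nested in a div.</p></div>
    <p>Third paragraph.</p>
  </body>
</html>"""


def teaser(xml_text, paragraphs=1):
    """Return the title and the text of the first `paragraphs` <p> elements."""
    root = ET.fromstring(xml_text)
    title = root.findtext("./head/title", default="")
    # iter() walks the tree in document order, so slicing the result
    # yields the first N paragraphs regardless of nesting depth.
    paras = ["".join(p.itertext()).strip() for p in root.iter("p")]
    return title, paras[:paragraphs]


if __name__ == "__main__":
    print(teaser(DOC, paragraphs=1))
    print(teaser(DOC, paragraphs=2))
```

Going from one paragraph to two is teaser(DOC, paragraphs=2); nothing about how the document is stored has to change.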

Here's another common use case: extract the links from a web site. Do a 
Google-like reverse index that finds all the pages linking to this one. 
The only way to make that happen in a relational DB is to chop the 
content into so many trivially small pieces that putting them back 
together again is prohibitively expensive. And even once you've done 
that, the SQL to pull the result is ungodly ugly. The XQuery is a lot 
simpler because it matches the natural structure of the documents rather 
than treating everything as a table. Some data wants to live in tables. 
Some doesn't.
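
As a rough illustration of the link case, the sketch below builds a reverse index over a handful of in-memory pages. Again this is Python's xml.etree.ElementTree standing in for an XML database, not XQuery itself, and the PAGES data and reverse_index() name are hypothetical. It assumes the pages are well-formed XML (XHTML, say); real-world tag soup would need a more forgiving parser.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical site: URL -> well-formed XHTML source.
PAGES = {
    "/a.html": '<html><body><p>See <a href="/b.html">B</a> and '
               '<a href="/c.html">C</a>.</p></body></html>',
    "/b.html": '<html><body><div><p><a href="/c.html">C</a></p></div>'
               '</body></html>',
    "/c.html": '<html><body><p>No links here.</p></body></html>',
}


def reverse_index(pages):
    """Map each link target to the set of pages that link to it."""
    backlinks = defaultdict(set)
    for url, xml_text in pages.items():
        root = ET.fromstring(xml_text)
        # One path expression finds every anchor that has an href,
        # however deeply it is nested in the document.
        for a in root.findall(".//a[@href]"):
            backlinks[a.get("href")].add(url)
    return backlinks


if __name__ == "__main__":
    for target, sources in sorted(reverse_index(PAGES).items()):
        print(target, "<-", sorted(sources))
```

The query follows the shape of the document: the anchors are found where they sit in the tree, with no need to shred the page into rows first and join them back together.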

-- 
Elliotte Rusty Harold  elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/


