NYCPHP Meetup

Mon Aug 13 13:03:55 EDT 2007

On 8/11/07, Elliotte Harold <elharo at metalab.unc.edu> wrote:

> Here's another common use case: extract the links from a web site. Do a
> Google-like reverse index that finds all the pages linking to this one.
> The only way to make that happen in a relational DB is to chop the
> content into so many trivially small pieces that putting them back
> together again is prohibitively expensive. And even once you've done
> that, the SQL to pull the result is ungodly ugly. The XQuery is a lot
> simpler because it matches the natural structure of the documents rather
> than treating everything as a table. Some data wants to live in tables.
> Some doesn't.
>

Ah, now _that's_ a great example, and something that CMS developers
often need to do after the fact (as in link-checking, or generating a
graph of sites you link to for SEO purposes).

My first instinct would be to look for XPath support in my relational
database, and indeed MySQL 5.1 provides it through the ExtractValue()
and UpdateXML() functions:
http://dev.mysql.com/tech-resources/articles/mysql-5.1-xml.html

But if a native-XML database can do it better or much more efficiently
for large datasets, then it is certainly worth investigating.
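
For what it's worth, here is a rough sketch of the reverse-index idea
using nothing but XPath-style queries over the documents themselves. It
is only an illustration, not how any particular database does it: the
sample pages, the PAGES dict and the build_reverse_index() helper are all
made up for this example, and it assumes every page is well-formed XHTML
without namespaces. Python's standard ElementTree only supports a subset
of XPath, but that is enough here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample pages (well-formed XHTML), keyed by their own URL.
PAGES = {
    "/a.html": '<html><body><p><a href="/b.html">B</a> and '
               '<a href="/c.html">C</a></p></body></html>',
    "/b.html": '<html><body><a href="/c.html">C</a></body></html>',
    "/c.html": '<html><body><p>No links here.</p></body></html>',
}


def build_reverse_index(pages):
    """Map each link target to the sorted list of pages that link to it."""
    index = defaultdict(set)
    for url, markup in pages.items():
        root = ET.fromstring(markup)
        # XPath subset: every <a> element, at any depth, that has an href.
        for anchor in root.findall(".//a[@href]"):
            index[anchor.get("href")].add(url)
    return {target: sorted(sources) for target, sources in index.items()}


reverse_index = build_reverse_index(PAGES)
print(reverse_index.get("/c.html"))  # ['/a.html', '/b.html']
```

The query follows the shape of the document (find the anchors, read
their href), so nothing has to be chopped into rows first. Whether this
scales to a large site is a separate question, and that is where a
native-XML store or an indexed table might earn its keep.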

-- 
Chris Snyder
http://chxo.com/

XML vs. rel DBs [was: Re: [nycphp-talk] Many pages: one script]