XML vs. rel DBs [was: Re: [nycphp-talk] Many pages: one script]
elharo at metalab.unc.edu
Wed Aug 8 20:38:41 EDT 2007
Kenneth Downs wrote:
>> Then consider that you want to be able to make queries like, "Find all
>> the paragraphs containing both the words 'Bush' and 'incompetent'" so
>> you can't just shove everything into a BLOB.
> Two words: text search.
Nope, not the same thing at all.
Index engines like FAST and Lucene can do part of this (though they
can't really take advantage of the structure of the documents they
index). However, those are *non-relational* systems. Of course, if you
want to search web-size collections, relational databases just can't
handle it. Index engines are the only proven technology that can.
Mark Logic claims their native XML database can search web-size
collections too, but I remain to be convinced of that point.
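
To make the "part of this" point concrete, here is a minimal sketch of the
kind of inverted index such engines are built around. It is a toy in Python,
not how FAST or Lucene is actually implemented; the names build_index and
search_all and the sample texts are made up for illustration.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercased word to the set of unit ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(doc_id)
    return index

def search_all(index, *words):
    """Return the ids of units that contain every one of the given words."""
    postings = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*postings) if postings else set()

# Hypothetical sample data: each unit is just a flat string with an id.
docs = {
    "p1": "Bush was called incompetent by critics.",
    "p2": "Bush gave a speech.",
    "p3": "The incompetent manager resigned.",
}
index = build_index(docs)
matches = search_all(index, "Bush", "incompetent")
```

Note that the index only knows about ids and words. Whether "p1" is a
paragraph, a chapter, or a whole book has to be decided before indexing;
the index itself has no notion of document structure.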
Some relational databases have added non-relational, full-text search
extensions to their products, just as some have added non-relational XML
extensions. These are adequate for simple uses, if you don't push them
too hard. However, they are completely incapable of carrying out queries
like, "Give me the title and first paragraph of every chapter of this
book" (something Safari routinely does) because they don't see the
structure of a document, only the text.
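
For comparison, here is a rough sketch of both queries run against the
document structure itself, using Python's standard xml.etree.ElementTree.
The element names (book, chapter, title, para) and the sample content are
assumptions for illustration only; a real book would use whatever
vocabulary it was written in (DocBook, for instance), and this is not a
claim about how Safari does it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names are assumed, not standard.
BOOK = """<book>
  <chapter>
    <title>Streams</title>
    <para>First paragraph of chapter one.</para>
    <para>Second paragraph of chapter one.</para>
  </chapter>
  <chapter>
    <title>Files</title>
    <para>First paragraph of chapter two.</para>
  </chapter>
</book>"""

def chapter_openers(xml_text):
    """Return (title, first paragraph) for every chapter of the book."""
    root = ET.fromstring(xml_text)
    result = []
    for chapter in root.findall("chapter"):
        title = chapter.findtext("title")
        first_para = chapter.findtext("para")  # first matching child only
        result.append((title, first_para))
    return result

def paragraphs_with_all(xml_text, *words):
    """Return the text of each para containing all the given words.

    Matching is a case-insensitive substring test, which is cruder than
    real word matching but enough to show the idea.
    """
    root = ET.fromstring(xml_text)
    wanted = [w.lower() for w in words]
    hits = []
    for para in root.iter("para"):
        text = "".join(para.itertext())
        if all(w in text.lower() for w in wanted):
            hits.append(text)
    return hits
```

Both functions depend on knowing where a chapter or paragraph begins and
ends, which is exactly the information a text-only index throws away.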
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!