NYCPHP Meetup

XML vs. rel DBs [was: Re: [nycphp-talk] Many pages: one script]

Elliotte Harold elharo at metalab.unc.edu
Wed Aug 8 20:38:41 EDT 2007


Kenneth Downs wrote:

>> Then consider that you want to be able to make queries like, "Find all 
>> the paragraphs containing both the words 'Bush' and 'incompetent'" so 
>> you can't just shove everything into a BLOB.
>>
>>
> 
> Two words: text search.
> 

Nope, not the same thing at all.

Index engines like FAST and Lucene can do part of this (though they 
can't really take advantage of the structure of the documents they 
index). However, those are *non-relational* systems. Of course, if you 
want to search web-size collections, relational databases just can't 
handle it. Index engines are the only proven technology that can.
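
To make the distinction concrete, here is a minimal sketch in Python of 
the paragraph-level query quoted above, next to a match that ignores 
structure. The element names (doc, para), the sample text, and both 
function names are hypothetical, made up for illustration; this is not 
how any particular index engine or database works internally.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document; the <doc>/<para> markup is an assumption.
SAMPLE = """<doc>
  <para>Bush spoke at length about the budget.</para>
  <para>Critics called Bush incompetent on this issue.</para>
  <para>The committee was described as incompetent.</para>
</doc>"""


def _tokens(text):
    """Lowercased set of word tokens in a string."""
    return {t.lower() for t in re.findall(r"\w+", text)}


def paragraphs_with_all(xml_text, words):
    """Return the text of each <para> that contains every word in `words`."""
    root = ET.fromstring(xml_text)
    hits = []
    for para in root.iter("para"):
        text = "".join(para.itertext())
        if all(w.lower() in _tokens(text) for w in words):
            hits.append(text)
    return hits


def document_has_all(xml_text, words):
    """Document-level match: roughly what a search that ignores structure reports."""
    root = ET.fromstring(xml_text)
    found = _tokens("".join(root.itertext()))
    return all(w.lower() in found for w in words)


print(paragraphs_with_all(SAMPLE, ["Bush", "incompetent"]))
```

A document whose two words fall in different paragraphs would pass 
document_has_all but return nothing from paragraphs_with_all, which is 
the gap between plain text search and a structure-aware query.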

Mark Logic claims their native XML database can search web-size 
collections too, but I remain to be convinced of that point.

Some relational databases have added non-relational, full-text search 
extensions to their products, just as some have added non-relational XML 
extensions. These are adequate for simple uses, if you don't push them 
too hard. However, they are completely incapable of carrying out queries 
like, "Give me the title and first paragraph of every chapter of this 
book" (something Safari routinely does) because they don't see the 
structure of a document, only the text.
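
As a rough illustration of that kind of structural query, here is a 
minimal Python sketch using the standard library's ElementTree. The 
book markup (book, chapter, title, para), the sample content, and the 
function name chapter_openers are all assumptions for the example; a 
real book schema and a real XML database would look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical book markup; element names and content are made up.
BOOK = """<book>
  <chapter>
    <title>Streams</title>
    <para>First paragraph of chapter one.</para>
    <para>Second paragraph of chapter one.</para>
  </chapter>
  <chapter>
    <title>Files</title>
    <para>First paragraph of chapter two.</para>
  </chapter>
</book>"""


def chapter_openers(xml_text):
    """Return (title, first paragraph text) for every <chapter> element."""
    root = ET.fromstring(xml_text)
    result = []
    for chapter in root.iter("chapter"):
        # findtext returns the text of the first direct <title> child.
        title = chapter.findtext("title", default="")
        # find returns the first direct <para> child, or None if absent.
        first_para = chapter.find("para")
        para_text = "".join(first_para.itertext()) if first_para is not None else ""
        result.append((title, para_text))
    return result


for title, para in chapter_openers(BOOK):
    print(title, "|", para)
```

The query depends entirely on knowing which element is a chapter, which 
is its title, and which paragraph comes first; none of that survives 
once the document is reduced to a bag of words.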


-- 
Elliotte Rusty Harold  elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/


