[nycphp-talk] XML Manipulation

Kenneth Downs ken at
Fri Aug 17 14:24:50 EDT 2007

Mark Armendariz wrote:
> csnyder wrote:
>> On 8/16/07, Dan Cech <dcech at> wrote:
>>> If you're looking for ideas, here is a proof-of-concept I put together
>>> back in 2004 for a system using a modified preorder traversal tree for
>>> threaded messages.
>> The nice thing about using some sort of tree (we called it a
>> nested-set model) for messages is that you get true threading--this is
>> a reply to that--rather than just a flat chronological list of
>> replies.
>> Very often, people just want the flat list as it is easier to follow.
> Nested-set grows rather hairy in no time, especially with data that 
> needs to be changed often.  After some reading (thanks to Elliotte's 
> compass - thank you sir), it seems to me XML could definitely prove a 
> better means of tracking hierarchal information, which especially 
> includes threaded conversation.  It seems to me that flattening tree 
> data is far easier than branching flat data (and meta fields in my rel 
> db makes me queasy).

The easiest approach is what is generally called the "adjacency list"
model: that's the record_id, record_id_par scenario you see in many
places, where each row stores the id of its parent.

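The bit of SQL you need to make it child's play is the recursive query
(WITH RECURSIVE in the SQL standard), which is sadly not widely supported:

WITH RECURSIVE thread AS (
   SELECT record_id, text_of_message FROM messages WHERE record_id = $x
   UNION ALL
   SELECT chd.record_id, chd.text_of_message
     FROM messages chd
     JOIN thread par ON chd.record_id_par = par.record_id
)
SELECT record_id, text_of_message FROM thread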

This construct does the Right Thing.  But as I said, it is almost 
completely unsupported.  It is a true black eye IMHO that nearly all big 
dbms vendors have ignored this incredibly useful extension.
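
For anyone who wants to try the recursive approach, here is a minimal sketch, assuming an engine that does implement WITH RECURSIVE (SQLite, via Python's standard sqlite3 module, is used purely for illustration). The table and column names follow the query above; the sample rows, the depth column, and the fetch_thread helper are my own assumptions, not part of any real schema.

```python
import sqlite3

# In-memory database with an adjacency-list table: each row points at its parent.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " record_id INTEGER PRIMARY KEY,"
    " record_id_par INTEGER REFERENCES messages(record_id),"
    " text_of_message TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, None, "root message"),
        (2, 1, "reply to root"),
        (3, 2, "reply to the reply"),
        (4, 1, "second reply to root"),
        (5, None, "unrelated thread"),
    ],
)

# The anchor row is the requested message; the recursive step keeps adding
# rows whose parent is already in the result. depth is an illustrative extra.
THREAD_SQL = """
WITH RECURSIVE thread(record_id, text_of_message, depth) AS (
    SELECT record_id, text_of_message, 0
      FROM messages
     WHERE record_id = ?
    UNION ALL
    SELECT chd.record_id, chd.text_of_message, par.depth + 1
      FROM messages chd
      JOIN thread par ON chd.record_id_par = par.record_id
)
SELECT record_id, text_of_message, depth FROM thread ORDER BY record_id
"""


def fetch_thread(conn, root_id):
    """Return (record_id, text_of_message, depth) for a message and all replies under it."""
    return conn.execute(THREAD_SQL, (root_id,)).fetchall()


rows = fetch_thread(conn, 1)
for record_id, text, depth in rows:
    print("  " * depth + f"[{record_id}] {text}")
```

The point of the sketch is that one query returns the whole thread, however deep the replies go, without the application looping over parent ids itself.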

Also, don't confuse the nature of the data with its presentation.  In 
terms of structure, all messaging systems are storing the same 
structure, be it emails, blogs with responses, usenet, or message 
boards.  All of them are storing a root message with replies and 
replies-to-replies and so forth.  It is a *hierarchy of like items*.

In terms of presentation you can go treeview, flattened (I think phpBB
does that), or indented (like some blogging systems), but the matter of
presentation can even be made a user option, and it's child's play to
support any presentation if the data is structured well to begin with.
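
To make the data-versus-presentation point concrete, here is a rough sketch that renders the same parent/child rows two ways. Everything in it is illustrative: the row layout, the posted_at ordering column, and the two render functions are assumptions of mine, not anything from an existing system.

```python
from collections import defaultdict

# Hypothetical rows: (record_id, record_id_par, posted_at, text_of_message)
ROWS = [
    (1, None, 1, "root message"),
    (2, 1, 2, "first reply"),
    (4, 1, 3, "second reply"),
    (3, 2, 4, "reply to first reply"),
]


def render_flat(rows):
    """Flattened view: every message in posting order, no nesting."""
    return [text for _, _, _, text in sorted(rows, key=lambda r: r[2])]


def render_indented(rows, indent="    "):
    """Indented view: each reply is listed under its parent, one level deeper."""
    children = defaultdict(list)
    for row in sorted(rows, key=lambda r: r[2]):
        children[row[1]].append(row)  # group rows by parent id

    lines = []

    def walk(parent_id, depth):
        for record_id, _, _, text in children.get(parent_id, []):
            lines.append(indent * depth + text)
            walk(record_id, depth + 1)

    walk(None, 0)  # top-level messages have no parent
    return lines


print("\n".join(render_flat(ROWS)))
print("\n".join(render_indented(ROWS)))
```

Both views read the same rows; switching between them is a display choice that could be left to the user, with no change to how the messages are stored.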

The best way to decide between files (as in XML, JSON, YAML, or
plaintext) and a table-based dbms is to look at overhead.  How much
down payment do you have to make to get the result?

> Mark

Kenneth Downs
Secure Data Software, Inc.
631-689-7200   Fax: 631-689-0527
cell: 631-379-0010
