NYCPHP Meetup

NYPHP.org

[nycphp-talk] Link to article about nested sets

csnyder chsnyder at gmail.com
Mon Feb 2 18:02:18 EST 2009


On Mon, Feb 2, 2009 at 4:14 PM, Dan Cech <dcech at phpwerx.net> wrote:
> Ahh yes, good old nested sets.  I still prefer the hybrid approach I
> took with phpGACL (http://phpgacl.sf.net) and my old clew demo that
> stores both both adjacency and nsm data.
>
> For a very small storage overhead you get the benefits of both
> approaches (try pulling immediate children from an nsm tree) and the
> ability to rebuild the nsm data from scratch if it gets especially
> messed up (not that there would ever be bugs...).
>
> I've been meaning to sit down and try to figure out a variation using
> intervals between the lft and rgt values so that inserts are cheaper,
> just haven't had a project that needed it enough to motivate me yet!

We use left+right+parentId, so that immediate children are easy to find.

It works pretty well to bulk-insert a bunch of empty nodes and then
use them up as you go (to cut down on insert time). That's not going
to be a good solution if your inserts are really spread out all over
the tree, but in the apps I've built so far the majority of inserts
happen at just a few nodes. Cron can be used to keep a ready supply of
fresh nodes in those locations, and you take nearly the same
performance hit whether you insert one or 50.

If you use NSM, definitely write a test script that bulks up the tree
so you can see what performance is like with 50k nodes vs. 500k nodes.
You're going to notice, and you need to be prepared for it.

Some kind of interval-based partitioning will probably be necessary if
you need to run on a standard server with > 100k nodes, or if you are
going to try to catch a lot of inserts at the same time.



More information about the talk mailing list