NYCPHP Meetup

Mon May 7 10:35:42 EDT 2012

Thank you for the thorough reply, Anthony. My replies are also inline:

On Mon, May 7, 2012 at 10:13 AM, Anthony Ferrara <ircmaxell at gmail.com> wrote:

> Hey Justin,
>
> Replies inline;
>
> On Mon, May 7, 2012 at 8:16 AM, Justin Demaris <justin.demaris at gmail.com>
> wrote:
> > Hello PHP Talkers,
> >
> > How do you guys feel about ORM systems and other database abstraction
> > layers?
>
> Personally, I do not like them.  I find that the ORM layers
> specifically violate SRP (Single Responsibility Principle) and are
> usually not worthwhile.  Instead, I prefer a lightweight Data Mapper
> layer ( http://martinfowler.com/eaaCatalog/dataMapper.html ).  I
> usually hard-code SQL in the mapper, but you *could* use an ORM there.
>  But the key is to keep it completely separated from the business
> objects.  That way your data model is free to evolve independently
> from your business objects...
>
> > Is there any ORM system out there
> > that just does it right? Namely, I'd be looking for things like:
> >
> > 1) When I instantiate an object by its ID multiple times, it doesn't
> > bother to hit up the database after the first time, but just keeps
> > giving me copies of the same object
>
> Well, do you really want this all of the time?  I can see cases where
> you do want to re-query the database (especially in sensitive
> areas)...
>

My issue is that we can't always expect everyone on the team (it's a
pretty large team) to deeply understand the proper way of doing
everything. I want this feature to help prevent people from shooting
themselves in the foot, so yes, I do want this all of the time. I have
seen a lot of code that treats ORM objects like any other variable and
instantiates them over and over to make the logic of a for loop simpler.
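
To make the behavior concrete, here is a rough sketch of what I mean: an
identity map sitting in front of the loader. It is in Python only to keep
it short, and the same shape works in PHP. `IdentityMap`, `fake_load_row`
and the "user" type are made-up names for illustration, not any ORM's API,
and this version hands back the same instance rather than a copy.

```python
from typing import Any, Callable, Dict, Tuple


class IdentityMap:
    """Hypothetical per-request cache: one object per (type, id)."""

    def __init__(self, load_row: Callable[[str, int], Dict[str, Any]]):
        self._load_row = load_row  # assumed to be the call that hits the database
        self._objects: Dict[Tuple[str, int], Dict[str, Any]] = {}

    def get(self, kind: str, obj_id: int) -> Dict[str, Any]:
        key = (kind, obj_id)
        if key not in self._objects:  # only the first lookup runs a query
            self._objects[key] = self._load_row(kind, obj_id)
        return self._objects[key]


# Stand-in for a real query; it records how often it is called.
queries = []


def fake_load_row(kind: str, obj_id: int) -> Dict[str, Any]:
    queries.append((kind, obj_id))
    return {"id": obj_id, "kind": kind}


users = IdentityMap(fake_load_row)
for _ in range(1000):  # the "instantiate it inside a for loop" case
    user = users.get("user", 42)
```

With this in place the loop above costs one query instead of a thousand.
The trade-off Anthony raises still applies: sensitive code paths would need
some way to bypass or clear the map.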

>
> > 2) Lazy load the object values. There are a number of patterns where I've
> > seen people instantiate a bunch of objects and then only use a small
> > subset of them. It would be nice if the object only loaded the data when
> > we try to reference one of its non-ID properties.
>
> Why?  Your business objects should have all the data they need at
> creation time.  Otherwise you can wind up in the situation where
> objects representing the same state have different states.  Which is
> not a good thing...
>

See above. Retraining everyone is not an option, and I've seen a lot of
people assume that if they aren't accessing the data, then creating the
object should be very cheap, so they instantiate it frivolously. We can all
argue that you need to know your platform in detail to really use it right,
and that certain things will be better in optimal cases, but these are the
variables I have to work with. The other issue is that at small scale the
performance is usually still quite acceptable, but as the product and the
scale grow, these cases become more obvious. This kind of lazy load lets me
pass objects around to functions that may need them, and only incur the
database hit if the function does end up needing them.
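
A minimal sketch of the lazy load I have in mind, again in Python for
brevity. `LazyRecord` and `fake_load_row` are hypothetical names, and the
loader is assumed to be whatever actually runs the query:

```python
from typing import Any, Callable, Dict, Optional


class LazyRecord:
    """Hypothetical object that knows its ID and fetches the rest on demand."""

    def __init__(self, obj_id: int, load_row: Callable[[int], Dict[str, Any]]):
        self.id = obj_id
        self._load_row = load_row
        self._row: Optional[Dict[str, Any]] = None

    def __getattr__(self, name: str) -> Any:
        # Python calls this only when normal attribute lookup fails,
        # so reading .id never triggers a load.
        if name.startswith("_"):
            raise AttributeError(name)
        if self._row is None:  # first non-ID access: hit the database once
            self._row = self._load_row(self.id)
        try:
            return self._row[name]
        except KeyError:
            raise AttributeError(name) from None


# Stand-in for a real query; it records which IDs were loaded.
loads = []


def fake_load_row(obj_id: int) -> Dict[str, Any]:
    loads.append(obj_id)
    return {"name": "Justin", "email": "justin@example.com"}


records = [LazyRecord(i, fake_load_row) for i in range(100)]
ids = [r.id for r in records]   # no queries so far
first_name = records[0].name    # exactly one query, for record 0
```

Creating a hundred of these is cheap, and only the ones whose data gets
read cost a query. The downside is the one Anthony points out: two objects
for the same row can end up loaded at different times.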

At the same time, I do welcome best practice recommendations for myself and
for the rare cases when I get complete control of the architecture and
implementation :)

>
> > 3) Ability to tweak the back end to work with other database systems
> > (especially Riak, Mongo and Cassandra)
>
> Unless you're building a project that you want to distribute to
> others, I find this a bad feature.  Sure, it's nice to know you can
> switch to another storage with a config change.  But practically, how
> often do you do that?  And if you do that, you're going to want to
> tune your data model specifically towards the strengths of the target
> backend.  Otherwise you're constantly stuck with a sub-optimal
> solution, and little way out when you need to scale...
>

I'm not thinking of this as "completely extensible", or as something I
would want other people to extend my platform for. I want it because I
actually use hybrid MySQL/Mongo/Redis, MySQL/Cassandra, and
MySQL/Riak/Redis environments for my projects, and I need it to work that
way now. If Doctrine is awesome for all things MySQL but very painful to
add Cassandra support to, then it wouldn't be my choice for that project.
The fact that it has some Mongo support already is wonderful news!

>
> > I have had really good luck in the past working with Yii and integrating
> > with Redis to use their Active Record structure, but I'm not sure of the
> > performance there. Also, I've been hearing a lot about Doctrine 2 lately
> > and the necessity of having an extra Data Mapper layer in the middle that
> > separates the classes and properties from the fields and tables that
> > store the data.
>
> That mapper layer is a good thing...  It allows both to vary
> independently of each other.  Which avoids un-needed coupling between
> the layers.
>
> As far as Doctrine 2 goes, IMHO it's the best layer out there.  I use
> it when I have projects that dictate requirements that it helps fill.
> But usually, I just use the database specific layer (MySQLi or
> MongoDB).
>

When working with Mongo, do you just pass around the stdClass / array
representation of the document to do work on it? Do you do anything at the
code level to force a standard attribute list for it?

>
> I hope that helps,
>

That does help, quite a bit :-D Thanks for the time spent on the reply.

>
> Anthony
> _______________________________________________
> New York PHP User Group Community Talk Mailing List
> http://lists.nyphp.org/mailman/listinfo/talk
>
> http://www.nyphp.org/show-participation
>

[nycphp-talk] Database Abstraction / ORM