NYCPHP Meetup

NYPHP.org

What Does ORM Mean? Re: [nycphp-talk] ORM vs SQL: the ultimate showdown

Kenneth Downs ken at secdat.com
Sat Sep 15 13:05:36 EDT 2007


ORM is normally meant to mean a comprehensive strategy for treating 
tables as objects, and to recast relational concepts in object oriented 
terms.

The examples in the past few posts just ain't ORM, there is no 
showdown.  The examples are necessary, clever, and intelligent examples 
of simple abstractions of the interface of PHP to a  SQL database.

They don't display ORM, so they can't be a showdown between ORM and SQL.

A better title might be "Manual coding vs library routines."

Ben Sgro (ProjectSkyLine) wrote:
> Hello,
>
>    INSERT INTO users
>       (first_name,last_name,organization,address1, ... 20 more fields
> ..., birthday, favorite_color)
>    VALUES
>       (?,?,?,?,... 20 more question marks ...,?,?)
> ",array($first_name,$last_name,$organization,$address1,... 20 more
> fields...,$favorite_color,$birthday);
>
> Why not just do:
>
> SET ...
> first_name = $firstname,
> last_name = $lastname, ...
>
> Which is *basically* the same as what your saying ORM is useful for in 
> this case...
>
> - Ben
>
> ----- Original Message ----- From: "Paul Houle" <paul at devonianfarm.com>
> To: "NYPHP Talk" <talk at lists.nyphp.org>
> Sent: Saturday, September 15, 2007 12:50 PM
> Subject: [nycphp-talk] ORM vs SQL: the ultimate showdown
>
>
>>    I think of ORM as a partial solution to the problems of building 
>> database-backed web apps.  People call database applications "CRUD" 
>> apps, short for
>>
>> Create
>> Recall
>> Update
>> Delete
>>
>>    ORM is good for Create,  and for many cases of Update and Delete.  
>> It can be used to do Recall,  but can be disastrously slow for 
>> generating many kinds of reports.
>>
>>    Just the other day I was working with a Java Servlet-based 
>> webapp.  I entered a text string with a single quote,  which caused 
>> an error.  The cause was code that would have looked like this in PHP:
>>
>> $conn->Execute("INSERT INTO users (first_name,last_name,organization) 
>> VALUES ('$first_name','$last_name','$organization'");
>>
>>    This would have been OK if they'd written something like
>>
>> $first_name=addslashes($_POST["first_name"]);
>> $last_name=addslashes($_POST["last_name"]);
>> $organization=addslashes($_POST["organization"]);
>>
>>    But they didn't.  Neither do most developers.  Now,  there are 
>> tools that help with this:  for instance,  some database API's will 
>> automatically quote things if you use placeholders.  This style of 
>> doing things has it's own problems:  let's say the user table has a 
>> lot of fields...
>>
>> $conn->Execute("
>>    INSERT INTO users
>>       (first_name,last_name,organization,address1, ... 20 more fields 
>> ..., birthday, favorite_color)
>>    VALUES
>>       (?,?,?,?,... 20 more question marks ...,?,?)
>> ",array($first_name,$last_name,$organization,$address1,... 20 more 
>> fields...,$favorite_color,$birthday);
>>
>>    Notice the error?  I didn't get the order of the fields quite 
>> right , so the 'birthday' and 'favorite color' fields will get 
>> scrambled.  Imagine how hard it would have been to have noticed the 
>> error in the middle of the field list!  I recently caught this one in 
>> a Cold Fusion application --  this kind of code gets hard to maintain.
>>
>>    Enter ORM.  Syntax varies,  but this kind of insert is simple and 
>> maintainable with an ORM system:
>>
>> $t=$db->users;
>> $r=$t->new_record();
>> $r->first_name=$_POST["first_name"];
>> $r->last_name=$_POST["last_name"];
>> $r->organization=$_POST["organization"];
>> ...
>> $r->insert();
>> $user_id=$r->user_id;
>>
>>    The ORM system takes care of quotes and bad data for you.  You're 
>> not likely to introduce bugs when you add and remove fields,  and 
>> they'll be transparent if you do.  If you want to make life really 
>> easy,  you might write a subroutine that uses the ORM's introspection 
>> capability,  so you can just write
>>
>> populate_from_post($r,$_POST);
>>
>>    that scans through the database fields,  and automatically assigns 
>> post variables down to fields.  Talk about zero maintainance!
>>
>> -------
>>
>>    ORM and other SQL-phobic approaches can also get you into big 
>> trouble. I once inherited a PHP app,  using MS SQL server as the back 
>> end that had an administrative interface that listed all of the users 
>> in the system. It worked just fine in testing,  but got REALLY slow 
>> when a few hundred users joined the system...  It could take more 
>> than 100 seconds to produce the screen!
>>
>>    The problem was that the system did between 10 and 20 queries to 
>> produce each output row.  It first did a query that retrieved a few 
>> hundred rows,  and then it did more queries to get all the details.  
>> Doing 5000 queries took a really long time.
>>
>>    We changed the application to use prepared statements and found 
>> that the query took 10 seconds.  This was acceptable,  but still 
>> pretty bad.
>>
>>    Although this system didn't use an ORM,  the general approach was 
>> that encouraged by most ORM systems -- I don't think any ORM system 
>> would have been smart enough to consolidate the multiple queries into 
>> a smaller number,  because the individual-row queries were 
>> implementing quite complicated business rules that involved counting 
>> rows with certain attributes,  preparing thresholds,  etc.
>>
>>    An ORM system might or might not be smart enough to generate 
>> prepared statements.  The difference between a 'smart' and 'dumb' ORM 
>> would be a factor of 10 in performance in this case.
>>
>>    Later on I wrote a 'pure SQL' query that calculated everything in 
>> one complex query.  It used a number of subselects...  I thought it 
>> was pretty straightforward,  but many people aren't comfortable with 
>> queries that are this complex.  It did the job in 0.1 seconds!!  
>> That's a factor of 100x better than I got from running separate 
>> statements.
>>
>> -----------------------
>>
>>    Similar issues turn up with UPDATEs and DELETEs.  ORM is quite 
>> efficient if you only want to update one or two records at a time,  
>> but imagine you want to update 100,000 rows.  (Turning off service 
>> for people who didn't pay their bills,  reset scores in an online 
>> game to zero, ...) It can be thousands of times faster to do 
>> something like:
>>
>> UPDATE user SET score=0;
>>
>> than to write some loop that runs hundreds of thousands of queries.  
>> Once you add in the issues of concurrency and transactions,  the 
>> 'pure SQL' solution looks a lot better -- it's automatically 
>> protected by the transactional integrity of the database.  Probably 
>> 80% of people working with an ORM system will forget to put the whole 
>> thing in a transaction: so it won't be reliable.  The 20% of the 
>> people who do put it in a transaction will hold locks on the database 
>> for hundreds or thousands of times longer than they need to...  Which 
>> slows down the site for everybody else.
>>
>> --------------------------
>>
>>    The gold standard is use an ORM system that's tightly integrated 
>> with your framework when it's appropriate:  when you're manipulating 
>> a few rows at a time.  You'll get big gains in maintainability.  When 
>> you're updating 1000+ rows or generating complex reports,  you need 
>> different tools.  I've built a few systems that integrate data grid 
>> display on the client with a prebuilt set of data fields that can be 
>> incorporated into the query, using subselects to 'join' data from 
>> other tables -- this gets the silver...  The gold medal would go to a 
>> system that uses the database metadata from an ORM system to help you 
>> build queries.  Hook this up to an AJAX data grid,  say the one from 
>> ext,  and you've got a rails killer...
>> _______________________________________________
>> New York PHP Community Talk Mailing List
>> http://lists.nyphp.org/mailman/listinfo/talk
>>
>> NYPHPCon 2006 Presentations Online
>> http://www.nyphpcon.com
>>
>> Show Your Participation in New York PHP
>> http://www.nyphp.org/show_participation.php 
>
> _______________________________________________
> New York PHP Community Talk Mailing List
> http://lists.nyphp.org/mailman/listinfo/talk
>
> NYPHPCon 2006 Presentations Online
> http://www.nyphpcon.com
>
> Show Your Participation in New York PHP
> http://www.nyphp.org/show_participation.php


-- 
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com    www.andromeda-project.org
631-689-7200   Fax: 631-689-0527
cell: 631-379-0010




More information about the talk mailing list