NYCPHP Meetup

Tue Jan 18 23:13:56 EST 2005

> > Those who know me know I don't favor DB abstraction, so this will be biased, and
> frankly I don't know that much about PEAR::DB.  Can it support prepared statements?
> Can it support cursors (server side)?  Can it support collation? etc...
> 
> Sounds funny coming from a MySQL guy. (rimshot)

Hmm, not sure how to read that one :)

> > While DB abstraction can be useful for simple apps, and is something worth
> learning about, large applications never use generic abstraction, since they are
> rarely deployed on multiple databases.
> 
> All I can tell you is that after working on sites that get 10 million
> visitors daily, had I coded anything to a native API the whole thing
> would have died a horrible death!  When you are really pushing the
> limits, native APIs are what kill you, not save you - as you end up
> worrying more about scaling and redundancy, where a more generic
> communication-presentation-application-data layered approach will save
> your bacon time after time. Something like:

That's an interesting take, that is, adding code to enforce proper architecture.  I see what you're saying, but at the end of the day the more code something executes, the slower it'll be.  Now that's not to say a minor abstraction for proper replication handling, for example, isn't a good thing.

>   Load Balancers <=> Webservers <=> Application Servers <=> Databases
> 
> As a matter of fact, the biggest site I ever worked on ended up going
> the java route, where database transparency was one of the main goals.
> (Not my decision as I had moved on, but you get the point.)

Quite true, and frankly, Java kind of has the best of both worlds.  JDBC drivers are "abstractions" but written particularly for that database, ie, they expose the same interface but are native calls themselves.  It's a totally different layer of abstraction than, let's say, PEAR::DB.

> If you implement an application that maxes out mysql's functionality
> for example (mitchy bites tongue), you are then going to lose a lot of
> effort if you decide to put SQL Relay in front of your database
> servers. Ok, maybe not logical for MySQL, but for something like
> Oracle (where OCI connections have lots of overhead) this progression
> happens frequently, and apps end up getting partially rewritten in
> order to decouple the logic from the database layer to accomodate the
> connection pool...

Agreed, but this is most often and best done with application layer DB abstraction.  Per above, implementing fault tolerance with replication requires some layer of abstraction, but I wouldn't (and couldn't) do it with a generic abstraction.  Fault tolerance between, let's say, MySQL and Oracle, just wouldn't work with a generic abstraction that tries to cover PostgreSQL, MSSQL, SQL Lite, Access, Sybase, and TimesTen.

> > That said, abstraction within the application - that is, "custom" abstraction
> written specifically in the application, and for their database - can add a lot of
> value, since it can expose the needed features for that db/app combo.  Lastly,
> knowing how a database works, and thus how to talk to it in the most efficient
> manner, is central in a successful and scalable application.
> 
> The first sentence is dead on, and Mambo's database class is a good
> example. Still native MySQL calls, but nice abstraction to make the
> interface simple and predictable for 3rd party developers.

Agreed 100%; this is application layer database abstraction, not generic.

> Knowing how databases work is *critical* to learning how to write
> useful software, as poorly-designed applications are also difficult to
> maintain, let alone scale.
> 
> Which does bring up a great debate - where do you draw the line
> between portability and native functionality?  If you have PostgreSQL
> or Oracle, then does it make sense to load up all of your DML into the
> database as stored procedures, for example? I used to think that way,
> but also was convinced at the time that there were limited options on
> the database server side; and now that is just not the case, with
> PostgreSQL and Firebird around (and yes, even MySQL is catching up
> with 5.0)   ;-)

That actually brings up an excellent point.  Stored procedures are the ultimate in database abstraction, and frankly makes the most sense.  At the same time, companies are founded and make millions on porting stored procedure code, so in the end, portability is often a marketing term.  No database implements the same stored procedure language, and although SQL2003's PSM language is considered standard these days (which MySQL is implementing from scratch after learning the mistakes of everyone else), there's still a porting process to the standard.

DB abstraction can happen at various levels, and should.  But overly abstracting database access is often more of a hassle than advantage, especially as the complexity of the application increases.

And I smell the seeds of a presentation here :)

---
Hans Zaunere
President, Founder
New York PHP
http://www.nyphp.org

Gmail: The 1gb spam catcher 

NYCPHP Meetup

NYPHP.org

[nycphp-talk] if you were teaching PHP...