NYCPHP Meetup

NYPHP.org

[nycphp-talk] LARGE mysql database performance

Hans Zaunere lists at zaunere.com
Tue Jan 31 23:27:39 EST 2006



edward potter wrote on Tuesday, January 31, 2006 4:31 PM:
> My understanding is all the GOOG adsense, adwords delivery back-end
> also uses MySQL?  Can't get any bigger than that!

Well, some of that is true.  :)

> GOOG earnings at 4:30.  The street awaits!  :-)  ed

Uh oh...

> > > So I am writing a "security camera" app at work, and using mysql/php.
> > > I have the cameras uploading snapshots via ftp to the server.
> > > 
> > > Now for dealing with the images, and indexing them for easy
> > > searching, I want to throw the info into a mysql database. I wouldn't
> > > mind also sticking the image data in the table too, so that you HAD
> > > to go through the app to view the images instead of direct url
> > > linking, but was

As mentioned already, preventing direct links to files can be easily done
without storing them in a database.
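For example, one common approach is a small gateway script with the image files kept outside the web root. This is a minimal sketch, not code from the thread; the script name, session key, and archive path are all hypothetical:

```php
<?php
// Hypothetical gateway (e.g. image.php?img=cam1-1138752000.jpg).
// The snapshots live outside DocumentRoot, so direct URL linking is
// impossible; every request must pass this check.

session_start();

// Require a logged-in user before serving anything.
if (empty($_SESSION['user_id'])) {
    header('HTTP/1.0 403 Forbidden');
    exit;
}

// basename() strips any path components, blocking directory traversal
// attempts like ?img=../../etc/passwd.
$name = isset($_GET['img']) ? basename($_GET['img']) : '';
$path = '/var/camera-archive/' . $name;   // outside the web root

if ($name === '' || !is_file($path)) {
    header('HTTP/1.0 404 Not Found');
    exit;
}

header('Content-Type: image/jpeg');
header('Content-Length: ' . filesize($path));
readfile($path);   // stream the file straight from disk
```

The files themselves stay on the filesystem; only access control goes through PHP.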

> > > wondering, how does mysql react with a 20-30gb database? would this
> > > be very bad? I have run a forum that used vbulletin with about a
> > > 500mb database, but that did not have any issues at all.... but
> > > 20-30gb is quite the large database :p

Just as Google's ad database could be considered large, 30GB could be
considered large, too.

MySQL's performance can vary greatly depending on traffic patterns.  For
instance, read-heavy applications can perform well, while mixed read/write
applications can perform poorly.  Mixing storage engines can help, though:
putting mostly read-only information (like images; a snapshot is taken once
and then only read) in a MyISAM table (potentially on a different disk),
and putting read/write data (metadata like view counts) in InnoDB, can
maximize MySQL's performance.
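That split might be sketched like this; the table and column names are mine, not from the thread:

```sql
-- Write-once snapshot records: MyISAM, cheap reads, and DATA DIRECTORY
-- can place the table files on a separate disk.
CREATE TABLE snapshots (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    camera_id  INT UNSIGNED NOT NULL,
    taken_at   DATETIME NOT NULL,
    file_path  VARCHAR(255) NOT NULL,
    INDEX (camera_id, taken_at)
) ENGINE=MyISAM;

-- Frequently updated metadata: InnoDB, so concurrent view-count
-- updates get row-level locking instead of MyISAM's table locks.
CREATE TABLE snapshot_stats (
    snapshot_id INT UNSIGNED NOT NULL PRIMARY KEY,
    view_count  INT UNSIGNED NOT NULL DEFAULT 0
) ENGINE=InnoDB;
```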

And perhaps the most important factor is what constitutes "large."  Are
there a few very large blocks of data, or are there many smaller blocks of
data?  These types of considerations have an impact on any data storage
strategy.

> > > Any thoughts or experiences?

I'm generally a fan of storing large static data pieces on the filesystem.
Huge numbers of files can be handled, and throughput can be maximized.

When storing this type of data in a relational database, bits have to be
read from disk, passed through the database engine, sent over the network
to a language like PHP, and then pushed out over the network again via
something like Apache.  Compare this with Apache grabbing a file directly
from disk and pushing it onto the network.  Sure, caching techniques can
help with a database, but they still won't match the filesystem cache.
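For contrast, the database route looks roughly like this (a hypothetical sketch; the connection details, table, and column names are mine):

```php
<?php
// Hypothetical sketch of serving an image out of a BLOB column.
// Every byte travels disk -> mysqld -> PHP -> Apache -> client,
// instead of Apache streaming the file straight off disk.

$db = mysqli_connect('localhost', 'user', 'pass', 'cameras');

$id = isset($_GET['id']) ? (int) $_GET['id'] : 0;

$stmt = mysqli_prepare($db, 'SELECT image_data FROM snapshots WHERE id = ?');
mysqli_stmt_bind_param($stmt, 'i', $id);
mysqli_stmt_execute($stmt);
mysqli_stmt_bind_result($stmt, $blob);

if (mysqli_stmt_fetch($stmt)) {
    header('Content-Type: image/jpeg');
    echo $blob;   // the whole image is buffered in PHP memory first
} else {
    header('HTTP/1.0 404 Not Found');
}
```

Note the extra hop: the entire image sits in PHP's memory before a single byte reaches the client, which is exactly the overhead readfile() avoids.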

At the end of the day, large static data is best suited to disk in most
architectures.  In extreme cases, Apache won't cut it either.  The large
image caches of the Internet use very basic HTTP servers optimized for raw
data delivery, versus something like Apache/PHP that is optimized for
functionality - but that's a different story.  :)



---
Hans Zaunere / President / New York PHP
   www.nyphp.org  /  www.nyphp.com




