[nycphp-talk] Caching, proxies, sharding and other scaling questions

Jake McGraw jmcgraw1 at
Fri Jul 24 19:29:29 EDT 2009

On Fri, Jul 24, 2009 at 6:45 PM, Ajai Khattri<ajai at> wrote:
> On Fri, 24 Jul 2009, Jake McGraw wrote:
>> What's your data size like? How many requests per second do you plan on
>> handling?
> It's a very big site. Last year, we handled a total of 945 million page
> views. And we expect those numbers to go up of course :-)
>> a relational database to a key/value store (memcache is nice,
>> personally, I'm becoming a big fan of Redis) is to set up a single
>> instance and see how it handles the load.
> Yes, my thoughts exactly. (BTW, I also looked at Redis earlier today, but
> I have yet to see a comparison with memcache). Any thoughts?

Memcache is a proven product with a long (in web terms) history. Redis
is brand new; the RC for version 1.0 was put out fairly recently.
The things I like about Redis are:

Data Persistence (not just in memory)
* Very easy to take a snapshot of your entire data store, just back up
the data dump dir.
* Very easy to prime a new data store. Let's say part of your scaling
strategy includes mirroring your data, that is, you'll have multiple
cache servers with the same data. Simply take a snapshot of your data
dir, move the files to a new server and start redis.
* If your server goes down you can still recover information from the
last active state.

Redis is not just a key/value store, it also provides lists of values
under a single key. You can push, pop, get the length, get an
arbitrary value within a list and a bunch of other features. Doing all
of this computation within the server provides two benefits: 1. No round
trip and (de)serialization, 2. Atomic transactions.
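To make the two benefits concrete, here is a rough sketch in Python. It does not talk to Redis or memcache at all: ToyStore is a made-up in-memory stand-in, and round_trips, append_client_side and append_server_side are hypothetical names used only for illustration. It compares appending to a list kept as a serialized blob under a plain key with appending through a list-aware push.

```python
import json
import threading


class ToyStore:
    """In-memory stand-in for a cache server. NOT a real Redis or memcache client."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()
        self.round_trips = 0  # counts simulated client/server exchanges

    # Plain key/value interface: values are opaque strings to the store.
    def get(self, key):
        with self._lock:
            self.round_trips += 1
            return self._data.get(key)

    def set(self, key, value):
        with self._lock:
            self.round_trips += 1
            self._data[key] = value

    # List interface: the store itself understands the structure.
    def rpush(self, key, item):
        with self._lock:
            self.round_trips += 1
            self._data.setdefault(key, []).append(item)
            return len(self._data[key])

    def llen(self, key):
        with self._lock:
            self.round_trips += 1
            return len(self._data.get(key, []))


def append_client_side(store, key, item):
    """Read-modify-write: two round trips plus (de)serialization.

    Another client could write between the get and the set, so the
    update is not atomic.
    """
    raw = store.get(key)
    items = json.loads(raw) if raw is not None else []
    items.append(item)
    store.set(key, json.dumps(items))


def append_server_side(store, key, item):
    """One round trip; the append happens under the store's own lock."""
    store.rpush(key, item)
```

In this toy, each client-side append costs two exchanges and a JSON decode/encode, while the list-aware push costs one exchange and cannot interleave with another writer.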

KEYS command for wildcard matching on key names.

Though I haven't played around with sets yet, they look pretty powerful.

In general, I think the KEYS and List commands make the whole
key/value thing a lot easier to use when coming from an RDBMS
background. For performance information, check out this post:

- jake

>> For example, with modern
>> hardware, a value lookup from a single, untaxed instance of memcache
>> should take around 1ms. At a certain point, based almost entirely on
>> traffic, that'll go up. When it gets to an undesirable level, throw in
>> another memcache instance and hash the keys to spread the load (or
>> allow your memcached client to hash the keys for you). Continue this
>> until some other bottleneck rears its head.
> We know where the bottlenecks are, so right now it's a case of selecting
> some solutions to test with.
> --
> Aj.
> _______________________________________________
> New York PHP User Group Community Talk Mailing List
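
The "hash the keys to spread the load" step in the quoted text above could look roughly like the following. This is a minimal Python sketch under stated assumptions, not what any particular memcached client does internally: pick_server and the server addresses are hypothetical, and plain modulo hashing is used for simplicity.

```python
import hashlib


def pick_server(key, servers):
    """Map a cache key to one server from a fixed, ordered list.

    Uses a stable hash (MD5) so every web node picks the same server
    for the same key, regardless of process or machine.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# Hypothetical server addresses, for illustration only.
SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]

if __name__ == "__main__":
    for key in ("user:42", "page:home", "session:abc"):
        print(key, "->", pick_server(key, SERVERS))
```

One caveat with modulo hashing: adding or removing a server changes the index for most keys, so a large share of the cache goes cold at once. Consistent hashing is the usual way to limit that.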
