[nycphp-talk] Caching, proxies, sharding and other scaling questions

Jake McGraw jmcgraw1 at gmail.com
Fri Jul 24 19:29:29 EDT 2009


On Fri, Jul 24, 2009 at 6:45 PM, Ajai Khattri<ajai at bitblit.net> wrote:
> On Fri, 24 Jul 2009, Jake McGraw wrote:
>
>> What's your data size like? How many requests per second do you plan on
>> handling?
>
> It's a very big site. Last year, we handled a total of 945 million page
> views. And we expect those numbers to go up of course :-)
>
>> a relational database to a key/value store (memcache is nice,
>> personally, I'm becoming a big fan of Redis) is to set up a single
>> instance and see how it handles the load.
>
> Yes, my thoughts exactly. (BTW, I also looked at Redis earlier today, but
> I have yet to see a comparison with memcache). Any thoughts?
>

Memcache is a proven product with a long (in web terms) history. Redis
is brand new; the release candidate for version 1.0 came out only
recently. The things I like about Redis are:

Data Persistence (not just in memory)
* Very easy to take a snapshot of your entire data store: just back up
the data dump dir.
* Very easy to prime a new data store. Let's say part of your scaling
strategy includes mirroring your data, that is, you'll have multiple
cache servers with the same data. Simply take a snapshot of your data
dir, move the files to a new server and start Redis.
* If your server goes down you can still recover information from the
last active state.
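
A rough sketch of the "prime a new data store" step, in Python. It
assumes the snapshot is a single file named dump.rdb (the Redis default
file name; a given setup may use another) and that the new instance is
not running yet while the file is put in place. The function name and
the directory names are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def prime_from_snapshot(src_dir, dst_dir, filename="dump.rdb"):
    """Copy a snapshot file into the data dir of a not-yet-started instance.

    `filename` defaults to dump.rdb, an assumption about the setup.
    """
    src = Path(src_dir) / filename
    dst_root = Path(dst_dir)
    dst_root.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves timestamps, which helps tell snapshots apart.
    return Path(shutil.copy2(src, dst_root / filename))


if __name__ == "__main__":
    # Stand-in directories so the sketch runs without a Redis install.
    with tempfile.TemporaryDirectory() as tmp:
        old_data_dir = Path(tmp) / "cache1"
        old_data_dir.mkdir()
        (old_data_dir / "dump.rdb").write_bytes(b"placeholder snapshot")
        copied = prime_from_snapshot(old_data_dir, Path(tmp) / "cache2")
        print("copied snapshot to", copied)
```

In practice the copy would go across machines (scp, rsync or similar)
rather than between local directories; the point is only that priming
is a file copy followed by a server start.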

Lists
Redis is not just a key/value store, it also provides lists of values
under a single key. You can push, pop, get the length, get an
arbitrary value within a list and a bunch of other features. Doing all
of this computation within the server provides two benefits: 1. no
round trip and (de)serialization, 2. atomic operations.
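
To illustrate why server-side list operations matter, here is a toy
in-process model in Python. It is not a Redis client and does not talk
to a server: the class ToyListStore is hypothetical, its method names
only mirror the RPUSH, LPOP, LLEN and LINDEX commands, and the return
values are choices made for this sketch. The lock stands in for the
server handling one command at a time.

```python
import threading
from collections import defaultdict, deque


class ToyListStore:
    """In-process stand-in for a server that keeps lists under keys."""

    def __init__(self):
        self._lists = defaultdict(deque)
        self._lock = threading.Lock()  # one operation at a time

    def rpush(self, key, value):
        """Append to the tail of the list; returns the new length."""
        with self._lock:
            self._lists[key].append(value)
            return len(self._lists[key])

    def lpop(self, key):
        """Remove and return the head of the list, or None if empty."""
        with self._lock:
            items = self._lists.get(key)
            return items.popleft() if items else None

    def llen(self, key):
        with self._lock:
            return len(self._lists.get(key, ()))

    def lindex(self, key, index):
        """Return the element at `index`, or None if out of range."""
        with self._lock:
            items = self._lists.get(key, ())
            if -len(items) <= index < len(items):
                return items[index]
            return None


def push_many(store, key, count):
    for i in range(count):
        store.rpush(key, i)


if __name__ == "__main__":
    store = ToyListStore()
    workers = [
        threading.Thread(target=push_many, args=(store, "jobs", 1000))
        for _ in range(4)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(store.llen("jobs"))  # no pushes are lost
```

With a plain key/value store the same append would be a get, a
deserialize, a modify, a serialize and a set, and two clients doing
that at once can overwrite each other's update.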

KEYS command for wildcard matching on key names.
http://code.google.com/p/redis/wiki/KeysCommand
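
A small Python sketch of what wildcard key lookup gives you. It
approximates glob-style matching with the standard fnmatch module over
an ordinary dict; the helper name keys and the sample key names are
invented for this example, and the exact pattern rules of the real
command may differ in detail.

```python
from fnmatch import fnmatchcase


def keys(store, pattern):
    """Return the keys of `store` (a dict) matching a glob-style pattern."""
    # Every key is visited, so the cost grows with the size of the store.
    return sorted(k for k in store if fnmatchcase(k, pattern))


if __name__ == "__main__":
    store = {
        "user:1:name": "ajai",
        "user:2:name": "jake",
        "session:abc": "active",
    }
    print(keys(store, "user:*"))
    print(keys(store, "user:?:name"))
```

This is the part that feels familiar coming from SQL: a pattern such
as user:* plays roughly the role of a LIKE 'user:%' filter.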

Sets
Though I haven't played around with sets yet, they look pretty powerful.

In general, I think the KEYS and list commands make the whole
key/value thing a lot easier to use when coming from an RDBMS
background. For performance information, check out this post:

http://groups.google.com/group/redis-db/browse_thread/thread/0c706a43bc78b0e5/455dd41883d90101#455dd41883d90101

- jake

>> For example, with modern
>> hardware, value look up from a single, untaxed instance of memcache
>> should take around 1ms. At a certain point, based almost entirely on
>> traffic, that'll go up. When it gets to an undesirable level, throw in
>> another memcache instance and hash the keys to spread the load (or
>> allow your memcached client to hash the keys for you). Continue this
>> until some other bottleneck rears its head.
>
> We know where the bottlenecks are, so right now it's a case of selecting
> some solutions to test with.
>
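
The "hash the keys to spread the load" idea quoted above can be
sketched in a few lines of Python. This is a minimal modulo scheme, not
what any particular memcached client does internally; the function name
pick_server and the host names are hypothetical.

```python
import hashlib


def pick_server(key, servers):
    """Map a cache key to one of `servers` by hashing the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an unsigned integer.
    bucket = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[bucket]


if __name__ == "__main__":
    # Hypothetical instances; replace with real host:port pairs.
    servers = ["cache1:11211", "cache2:11211", "cache3:11211"]
    for key in ("page:/index", "page:/about", "user:42"):
        print(key, "->", pick_server(key, servers))
```

One caveat with plain modulo hashing: adding an instance changes the
bucket for most keys, so much of the cache goes cold at once. Many
clients offer consistent hashing to soften that, which is worth
checking before relying on a hand-rolled scheme.
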
>
> --
> Aj.