NYPHP.org

[nycphp-talk] PHP caching (not opcode)

bzcoder bzcoder at bzcode.com
Fri Jul 18 11:54:20 EDT 2008


Patrick May wrote:
> Hello,
>
> I'm working on a PHP cache library, and I wanted to check in and see 
> what folks thought.
>
> Usually, you have many different front end views that you will want to 
> cache.  These may pull from multiple datasources in odd ways.  Far 
> away from the front end, there is a backend CMS, either a tool like 
> Wordpress or Drupal, or a custom tool.  Generally, I find myself 
> managing caches by:
>
> 1. Putting the code that generates the cache into the front end, since 
> it is the design that dictates which pieces of information should be 
> grouped together.
> 2. Putting a very small amount of code in the back end which causes 
> the cache to be deleted on the appropriate CRUD operations.  CMSs are 
> complicated enough as they are, and I don't want to clutter them up 
> more than necessary.
>
> The hard part is creating the association between the front and back 
> end.  What I want to do is to use *tags* to create this association, 
> and to break caches.

Part of this has to do with your system.  For example, Joomla is made up 
of Templates, Components, Modules, and Plugins all of which go into 
generating a page.

A component will generate HTML content for the main section of the 
website (think displaying articles, displaying a directory, etc).
A module will generate a small block of HTML for specific purposes (think 
"most popular articles" list, "advertising banners", "number of users 
online").
A template is used to put all of that together into the layout of the page.

Finally, a plugin will take that page and do a bunch of regular 
expression find-and-replaces (so in your content, if you put {contact 
author}, the article author's contact information is placed on the 
page).

Therefore, Joomla has multiple levels of caching.  As I recall, you can 
have it cache the output of specific modules and you can have it cache 
the complete page output. 

Now, compare that to the Smarty template system.  In Smarty, you can 
cache the results of applying a template to generate HTML.   Smarty also 
has an invalidate function, so whenever something is changed you can run 
the invalidate function and kill the cached object.


My feeling is it really needs to be taken to the next level.  So, when a 
cached object is generated, that object needs to record somewhere what 
underlying data points make up the cache. 

So, for example, say you're looking at a listing of all articles relating 
to PHP and Caching.

As a first step, if an article is added, most systems just do a generic 
"invalidate all article cache files" - but a more specific function 
would say "look up all cache files that were tagged with X when they 
were cached and invalidate them".
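
To make the tagging idea concrete, here is a rough sketch of what I mean.
Everything in it is made up for illustration: the TagCache class, its
method names, and the key/tag strings are hypothetical, and it keeps
everything in memory where a real library would write to files or
memcached.

```php
<?php
// Hypothetical sketch: a cache that records which tags each entry depends on.
class TagCache
{
    private $items = array();     // cache key => cached value
    private $keysByTag = array(); // tag => array(cache key => true)

    // Store a value and record the tags (underlying data points) it was built from.
    public function set($key, $value, array $tags)
    {
        $this->items[$key] = $value;
        foreach ($tags as $tag) {
            $this->keysByTag[$tag][$key] = true;
        }
    }

    // Return the cached value, or null on a miss.
    public function get($key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : null;
    }

    // Drop every entry tagged with $tag and return the keys that were dropped.
    // Dropped keys may still be listed under their other tags; that is
    // harmless here because unsetting a missing key does nothing.
    public function invalidateTag($tag)
    {
        $keys = isset($this->keysByTag[$tag]) ? array_keys($this->keysByTag[$tag]) : array();
        foreach ($keys as $key) {
            unset($this->items[$key]);
        }
        unset($this->keysByTag[$tag]);
        return $keys;
    }
}

$cache = new TagCache();
$cache->set('list:php-caching', '<ul>caching articles</ul>', array('articles', 'topic:caching'));
$cache->set('list:php-sessions', '<ul>session articles</ul>', array('articles', 'topic:sessions'));

// An article about caching was added: only entries tagged with that topic go away.
$dropped = $cache->invalidateTag('topic:caching');
```

The back end only needs to know the tag name when it calls
invalidateTag(); it never has to know which front end views exist.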

A smarter step would be to refresh the cache if possible.  So instead of 
deleting the existing cached files, generate new ones.  Considering that 
create/update operations are already expected to be slow, making them a 
few milliseconds slower is not a big deal.  The editor is not a user you 
need to worry about getting upset at site slowness and moving on to a 
competitor's site, whereas the reader is the person you have to be quick 
for.
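
A refresh-on-write version could look something like the sketch below.
Again, RefreshingCache, remember() and refreshTag() are hypothetical
names, and I am assuming each entry can keep the callback that built it.
That only works in memory within one request (and the closures need
PHP 5.3 or later); a file-based cache would have to map keys to named
functions instead.

```php
<?php
// Hypothetical sketch: entries remember how they were built so they can be
// regenerated on a write instead of being deleted.
class RefreshingCache
{
    private $entries = array(); // key => array('value', 'tags', 'builder')

    // Return the cached value, building and storing it on the first call.
    public function remember($key, array $tags, $builder)
    {
        if (!isset($this->entries[$key])) {
            $this->entries[$key] = array(
                'value'   => call_user_func($builder),
                'tags'    => $tags,
                'builder' => $builder,
            );
        }
        return $this->entries[$key]['value'];
    }

    // Rebuild every entry tagged with $tag; returns how many were rebuilt.
    public function refreshTag($tag)
    {
        $refreshed = 0;
        foreach ($this->entries as $key => $entry) {
            if (in_array($tag, $entry['tags'], true)) {
                $this->entries[$key]['value'] = call_user_func($entry['builder']);
                $refreshed++;
            }
        }
        return $refreshed;
    }
}

// Stand-in for the articles table.
$articles = array('Caching with Smarty');
$build = function () use (&$articles) {
    return implode(', ', $articles);
};

$cache = new RefreshingCache();
$cache->remember('list:caching', array('topic:caching'), $build);

// The editor saves a new article: pay the rebuild cost here, on the slow path.
$articles[] = 'Tagging cache files';
$count = $cache->refreshTag('topic:caching');

// The reader's next request is served from the already-refreshed entry.
$html = $cache->remember('list:caching', array('topic:caching'), $build);
```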

However, I will say the main issue is that developers have to USE the 
cache.  It does no good to have a method of storing data in a cache 
unless the developers of the components trigger the flush/refresh 
mechanisms that are already there and that most CMS systems have (similar 
to the logging facility: it seems every framework has one, yet how often 
do people actually add a bunch of extra code to log information except 
for failures?).
