Caching with PHP5
Question: What's the best way to improve the performance of your website?
Answer: Get rid of it, and stop worrying.
Unfortunately, that's not always practical (livelihoods etc). So let's have a look at caching instead. Rather miraculously, I've just written and published some nifty caching code.
The code is PHP5 only, and built with a static class mindset. This kinda uses the OOP system as namespaces, though it also uses inheritance to reuse common code. The code is separated out into three classes: Cache, OutputCache and DataCache. Groups and unique IDs are used to identify individual cached content. This comes in handy if you have to clear just a certain section of the cached data.
The Cache class is the base class, and contains common code for generating filenames, and reading and writing data files. Most of the code here is protected, as you shouldn't be interfacing with this class directly except in one instance, which is enabling or disabling the cache.
Output Cache
The OutputCache class is used for caching the generated output of your scripts, or certain sections of them. It has Start and End methods, and is used like this:
<?php
if (!OutputCache::Start("myGroup", "myID", 600)) {
    // Generate some output (as you do)...
    OutputCache::End();
}
?>
So what's happening here? Well, first off, the call to Start() passes the group, unique ID and the TTL (Time To Live) for this particular bit of caching. So the data will be uniquely identified on disk by the group/id combo, and will be considered stale after the TTL number of seconds has passed. This function returns true if the data is found in the cache, and also prints the data to the screen. This means the code inside the if() block is skipped (thanks to the not (!) operator), and so the data isn't printed twice.
If, however, the given combo of group/id isn't found in the cache, the Start() method will return false. When this happens, output buffering is turned on to record the output. The code inside the if() block will then run (again, the not (!) operator), generate the output (which gets buffered), and then call the End() method. This method stops output buffering, saves the data to disk, and then prints it.
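For the curious, here is a rough sketch of how a Start()/End() pair like this could be built on PHP's output buffering. It is not the published class: the name SketchOutputCache, the $store property, the "sketch_" filename prefix and the mtime-plus-TTL staleness check are all assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for OutputCache; NOT the published class.
class SketchOutputCache
{
    public static $store = null;         // cache directory (assumed name)
    private static $pending = array();   // files waiting for their End() call

    // Assumed filename layout, for illustration only.
    private static function fileFor($group, $id)
    {
        $dir = self::$store !== null ? self::$store : sys_get_temp_dir();
        return rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . 'sketch_' . $group . '_' . md5($id);
    }

    public static function Start($group, $id, $ttl)
    {
        $file = self::fileFor($group, $id);
        clearstatcache();
        // Hit: a copy younger than $ttl seconds exists, so print it and return true.
        if (is_file($file) && filemtime($file) + $ttl > time()) {
            echo file_get_contents($file);
            return true;
        }
        // Miss: buffer whatever the caller prints until End() is called.
        ob_start();
        self::$pending[] = $file;
        return false;
    }

    public static function End()
    {
        $file = array_pop(self::$pending);
        $output = ob_get_clean();             // stop buffering, grab the output
        file_put_contents($file, $output, LOCK_EX);
        echo $output;                         // this visitor still gets the page
    }
}
```

It is called exactly like the example above, just with the stand-in class name. Keeping the pending files on a stack means nested Start()/End() pairs would unwind in the right order.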
Elegant, efficient, and sexy. What more could you ask for?
Data Cache
The DataCache is used to cache data structures, as opposed to script output. This allows you to cache the creation of large arrays, for example, or the results of slow queries. This is helpful if your pages are rather dynamic, though some areas aren't. Or, in a situation I recently experienced: you have one central DB server and multiple front-end webservers. A common setup. If the load is getting high on the database, you might want to move some portion of the query work (ORDER BY RAND() is a good example) to the webservers instead of the database server. Thus randomisation (e.g. using shuffle()) happens on one of 5 webservers, instead of on your single resource-limited database server. Anyway, some code:
<?php
if (!$result = DataCache::Get("myGroup", "myOtherID")) {
    $result = $db->query("SELECT BIG_ASS_QUERY()");
    DataCache::Put("myGroup", "myOtherID", 600, $result);
}
// Do something useful with $result
?>
So in this example (very similar to output caching), if the data is cached, it's assigned to $result and the if() block is skipped. If not, the if() block runs, and the data is cached by the Put() call.
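As a rough illustration of the Get()/Put() idea, here is a minimal sketch that serializes a value to disk along with its expiry time. The class name SketchDataCache, the $store property, the "sketchdata_" prefix and the slowLookup() function are all made up for this example; the published DataCache may well work differently.

```php
<?php
// Hypothetical stand-in for DataCache; the published class may differ.
class SketchDataCache
{
    public static $store = null; // cache directory (assumed name)

    private static function fileFor($group, $id)
    {
        $dir = self::$store !== null ? self::$store : sys_get_temp_dir();
        return rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . 'sketchdata_' . $group . '_' . md5($id);
    }

    // Returns the cached value, or false on a miss or a stale entry.
    public static function Get($group, $id)
    {
        $file = self::fileFor($group, $id);
        if (!is_file($file)) {
            return false;
        }
        $entry = unserialize(file_get_contents($file));
        if (!is_array($entry) || $entry['expires'] <= time()) {
            return false;
        }
        return $entry['data'];
    }

    public static function Put($group, $id, $ttl, $data)
    {
        $entry = array('expires' => time() + $ttl, 'data' => $data);
        file_put_contents(self::fileFor($group, $id), serialize($entry), LOCK_EX);
    }
}

// Usage, with a made-up function standing in for the slow query.
function slowLookup()
{
    return array('apple', 'banana', 'cherry');
}

if (!$result = SketchDataCache::Get('myGroup', 'myOtherID')) {
    $result = slowLookup();
    SketchDataCache::Put('myGroup', 'myOtherID', 600, $result);
}
shuffle($result); // randomise on the webserver rather than with ORDER BY RAND()
```

Two things to watch with this pattern: a cached value that is itself falsy (an empty array, 0) looks like a miss to the if() test, and database result handles generally can't be serialized, so fetch the rows into a plain array before caching them.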
Miscellaneous Bits
There are a few configuration bits and bobs you can twiddle with if you like twiddling. setPrefix(), as you can well imagine, sets the prefix used in the cache data filenames. This defaults to "cache_". setStore() sets where the data files themselves are stored. This defaults to "/dev/shm/", since this is a convenient way to store the data files in shared memory. If you don't have this, try changing the path to "/tmp/". The path must be given with a trailing slash.
And last, and least (so as not to be a corny ass), there's the static variable Cache::$enabled. That's how you refer to static class variables, in case you didn't know. This is a boolean which enables or disables the cache. Surprising that.
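To make the knobs concrete, here is a small sketch of how they might fit together. SketchCache is a hypothetical stand-in, not the published Cache class, and the filename layout (store, prefix, group, underscore, md5 of the ID) is a guess based on a filename that turns up in an error message in the comments.

```php
<?php
// Hypothetical stand-in for the Cache base class, only to show the settings.
class SketchCache
{
    public static $enabled = true;          // same name as in the post
    protected static $prefix = 'cache_';    // default from the post
    protected static $store = '/dev/shm/';  // default from the post

    public static function setPrefix($prefix)
    {
        self::$prefix = $prefix;
    }

    public static function setStore($store)
    {
        self::$store = $store;
    }

    // ASSUMED layout: <store><prefix><group>_<md5 of id>
    public static function fileName($group, $id)
    {
        return self::$store . self::$prefix . $group . '_' . md5($id);
    }
}

SketchCache::setStore('/tmp/');      // note the trailing slash
SketchCache::setPrefix('mysite_');
SketchCache::$enabled = false;       // e.g. while debugging

echo SketchCache::fileName('myGroup', 'myID'), "\n";
```

In the real class the filename code is protected; it is public here only so the result can be printed.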
C'est tout. Get the code here.
Posted: 5th April 2005 22:24
> You are a true GURU !
Ta very much.
Posted: 8th September 2005 19:43
> You get this month's paw-print of approval from
> me, Gormless George and the Three Suspects.
Is that a good thing? ;-)
Posted: 31st October 2005 17:24
Posted: 13th February 2006 02:25
Shouldn't
if (!$data = DataCache::Get("myGroup", "myOtherID")) {
be
if (!$result = DataCache::Get("myGroup", "myOtherID")) {
Thanks for the code. Works great!!
Posted: 18th February 2006 12:21
Mail me if you want me to send it to you, I will be glad to. In a few days I will connect it to my HTTP-cache class so we could have both server and browser caching out-of-the-box.
Posted: 7th April 2006 16:25
PS: On April 9 Amazon will ship you something from your wishlist.
Thanks again.
Posted: 25th July 2007 15:00
Posted: 3rd October 2008 06:41
One problem I ran into (probably every caching technique has the same flaw) was that when an item expires, all the users on the site try to recache it at the same time.
I have an extremely big site: big queries, tons of requests.
So I introduced randomness :)
For example, the TTL varies from 540 to 600 seconds (usually by 10%).
It really saved my ass, because my site was going up and down like a rabbit.
So instead of 10 second blackouts, there were 0.5 second blackouts, and only for one person.
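One way to read the random TTL trick is that each request rolls its own lifetime when it checks the entry, so the first unlucky visitor to roll a short one rebuilds the cache while everyone else still sees it as fresh. The sketch below assumes that reading; the function name and its parameters are invented for illustration and are not the commenter's code.

```php
<?php
// Decide whether THIS request should treat a cache entry as stale.
// $createdAt is the Unix time the entry was written. The lifetime is
// rolled afresh on every check, between $minTtl and $maxTtl seconds.
// (Hypothetical helper; name and signature are assumptions.)
function isStaleWithJitter($createdAt, $maxTtl, $minTtl, $now = null)
{
    $now = ($now === null) ? time() : $now;
    $ttl = mt_rand($minTtl, $maxTtl);
    return ($now - $createdAt) >= $ttl;
}

// An entry written 550 seconds ago is stale for some requests and
// still fresh for others, which spreads the rebuilds out.
$createdAt = time() - 550;
$rebuild = isStaleWithJitter($createdAt, 600, 540);
```

Below 540 seconds nobody rebuilds, and from 600 seconds onwards everybody would, so the window in between is where a single request normally picks up the work.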
The other thing was file lookups. Storing hundreds of thousands of files in one dir on disk (let's say you can't use memory) is not funny at all.
So I created an md5 hash from the filename and spread the files across dirs.
For example:
md5==9e107d9d372bb6826bd81d3542a419d6
-->
/tmp/9/e/[cache name].cache
or
/tmp/9/e/9e107d9d372bb6826bd81d3542a419d6.cache
This gives you 16*16 = 256 dirs, with approx. 400 files per dir if the total file count is 100K.
md5 is only chosen so that filenames distribute evenly over the dirs.
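The directory layout described here can be sketched in a few lines. The function name shardedPath() is made up for this example, and the input string is simply one that happens to hash to the md5 shown above.

```php
<?php
// Build a two-level sharded path from the md5 of the cache name:
// the first hex character picks the outer dir, the second the inner one.
// (Hypothetical helper, for illustration only.)
function shardedPath($baseDir, $cacheName)
{
    $hash = md5($cacheName);
    return rtrim($baseDir, '/') . '/' . $hash[0] . '/' . $hash[1] . '/' . $hash . '.cache';
}

$path = shardedPath('/tmp', 'The quick brown fox jumps over the lazy dog');
echo $path, "\n"; // prints /tmp/9/e/9e107d9d372bb6826bd81d3542a419d6.cache

// The directories have to exist before the first write, for example:
// if (!is_dir(dirname($path))) { mkdir(dirname($path), 0777, true); }
```

With two hex characters there are 16 * 16 = 256 directories, so 100,000 files work out to roughly 390 per directory.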
I could compute the md5 every time or precalculate it (just be careful not to use the same hash twice), i.e.
Cache::Start("cache_name", 600, 540, ''); //auto md5
Cache::Start("cache_name", 600, 540, '9e107d9d372bb6826bd81d3542a419d6'); //precalc
This cache system uses two files per cache entry: "blaa" and "blaah.timestamp". I was also thinking about using filemtime() instead, but then it would stop working with memcache if I tried to keep my "random TTL caching method".
Posted: 30th December 2008 00:14
Posted: 6th March 2009 11:17
I am curious as to why you're touching the cache files into the future instead of checking their ages and leaving the proper file dates. Is there a particular reason for this logic?
Posted: 6th March 2009 11:27
> wow this is old but still very helpful, probably
> the simplest class to use. Its prefect and
> working wonderfully.
Glad to hear it. Feel free to buy a license... :-)
> I am curious as to why your touching the cache
> files into the future instead of checking their
> ages and leaving the proper filedates. Is there
> a particular reasoning for this logic ?
So it can be easily checked as to whether it's stale or not. The function isCached() uses it to test whether the file is cached and within the TTL.
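A minimal sketch of that idea, assuming the file's modification time is simply set to the moment the entry should expire. The function names writeWithExpiry() and isFresh() are made up here; this is not the code of the real isCached().

```php
<?php
// Write the data, then push the file's mtime forward to its expiry time.
// (Hypothetical helper names; a sketch, not the published code.)
function writeWithExpiry($file, $data, $ttl)
{
    file_put_contents($file, $data, LOCK_EX);
    touch($file, time() + $ttl);
}

// Fresh means: the file exists and its mtime is still in the future.
function isFresh($file)
{
    clearstatcache(); // filemtime() results are cached within a request
    return is_file($file) && filemtime($file) > time();
}
```

The upshot is that the reader doesn't need to know the TTL at all: one filemtime() call answers both "is it there?" and "is it still good?".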
Posted: 26th March 2009 00:55
When I try it, it gives me this error:
fopen(/dev/shm/cache_lol_ffe553694f5096471590343432359e02) failed to open stream: No such file or directory in D:\Servidor\rnd\Cache.php
Does anyone have any idea what's wrong? I also tried it on a Linux server and got the same error.
Posted: 26th March 2009 05:32
> When I try it it give's me this error:
>
>
> fopen(/dev/shm/cache_lol_ffe553694f5096471590343432359e02)
> failed to open stream: No such file or directory
> in D:\Servidor\rnd\Cache.php
>
> Does anyone have any idea what's wrong? I also
> tryed it on a linux server and got the same
> error.
Well, you're using a Unix path on a Windows box. Try setting the store to C:\windows\temp
E.g.:
Cache::setStore('C:\\windows\\temp\\');
Comments
Posted: 4th April 2005 18:19
Best Regards,
Dumitru MIHAI