Why should you cache your PHP website?

Most web servers can handle “normal” traffic, and there are plenty of websites that don’t get much traffic at all. So maybe you ask yourself: why should you cache your PHP-powered website? The Apache web server can serve many, many files at the same time, but only if these files are static. A PHP script is parsed by the web server and the generated HTML is then sent to the client (the web browser). While this happens, the server needs much more memory than when it simply sends a static file to a web client. Imagine what happens when you run/parse a page built with WordPress…

The web has not only human visitors!

If your WordPress or PHP site gets a few visitors per hour, a web server should be able to serve ALL pages to your visitors without any problems. So far so good, but what if your site gets accessed by some bot? The worst-case scenario is that this “unnatural” access slows down or even takes down your site, and all the other sites hosted on the same server as well!

WordPress Super Cache, a required WP plugin

Even if your blog doesn’t have a lot of blog posts or comments, you should install the WordPress Super Cache plugin. This plugin works on most servers and can save your blog’s life! WordPress needs a lot of database queries to show a single page to your visitors. Each database connection needs some memory and uses some CPU. With this cache plugin, a “normal” page view doesn’t touch the database at all and your server can handle much more traffic.

Cache functions for custom PHP websites

There are many ways to cache your website: several cache modules are available, or it’s possible to create a cached version of each page with some PHP code. Which one is best for your situation depends on the application and the type of hosting you’re using.

The eAccelerator project

If you’re able to configure your web server (you need root access), you should try the eAccelerator project. It works as a PHP extension and caches a compiled version of your PHP scripts. I have installed and updated eAccelerator on two web servers now and I like how it works. Before you start, check the requirements; you might also like to read my notes about the eAccelerator installation.

Custom PHP file caching (tutorial)

If you’re looking for a way to cache single pages from your website, try this tutorial. The simple code snippet “downloads” the HTML code from the selected page and stores it as a static page. The following code checks, reads/writes and outputs the cached version (see the comments inside the code).

<?php
// a function to retrieve a URL and write the data into a file
function get_and_write($url, $cache_file) {
	$string = file_get_contents($url);
	$f = fopen($cache_file, 'w');
	fwrite($f, $string, strlen($string));
	fclose($f);
	return $string;
}
 
// a function that opens a file and reads the data into a single variable
function read_content($path) {
	$f = fopen($path, 'r');
	$buffer = '';
	while (!feof($f)) {
		$buffer .= fread($f, 2048);
	}
	fclose($f);
	return $buffer;
}
 
$cache_file = '/home/user/public_html/cache/cache.page.php';
$url = 'http://www.domain.com/page.php';
 
if (file_exists($cache_file)) { // is there a cache file?
    $timedif = time() - filemtime($cache_file); // how old is the file?
    if ($timedif < 3600*24) { // younger than 24 hours? use the cached copy
        $html = read_content($cache_file); // read the content from the cache
    } else { // the cache is stale, create a new cache file
        $html = get_and_write($url, $cache_file);
    }
} else { // no file? create a cache file
    $html = get_and_write($url, $cache_file);
}
echo $html;
exit;
?>

The code above is pretty simple but not dynamic. To get this example working, you need to create a list of the cache files and the URLs. In the next example we use some mod_rewrite rules to match our file structure.
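A “list” like that could be a simple PHP array that maps each page to its source URL, plus a small helper to build the cache file name. Just a sketch; the slugs, URL and directory below are placeholder values:

```php
<?php
// map each page slug to the URL that generates its HTML (placeholder values)
$pages = array(
    'great-post-about-scripts' => 'http://www.domain.com/page.php?page=great-post-about-scripts',
    'php-upload-tutorial'      => 'http://www.domain.com/page.php?page=php-upload-tutorial',
);

// build the cache file path for a page slug
function cache_path($cache_dir, $slug) {
    return $cache_dir . '/cache-' . $slug . '.php';
}
```

A cron job could loop over this array and call get_and_write() for every entry to pre-generate all cache files, instead of waiting for the first visitor.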

In our case we take a part of the URL and pass this string as a variable via the query string. Here are some example URLs:

http://domain.com/page/great-post-about-scripts

http://domain.com/page/php-upload-tutorial

http://domain.com/page/jquery-plugin-review

Inside our .htaccess file we use this rule:

RewriteEngine on
RewriteRule ^page\/([a-z\-]*)$ /page.php?pageurl=$1 [L]

The rewrite engine passes anything after “page/” to the file page.php as a query string. Inside page.php we need some code to automate the request and generate the cached version.

if (!empty($_GET['pageurl']) && preg_match('/^[a-z\-]*$/', $_GET['pageurl'])) {
    $cache_file = '/home/user/public_html/cache/cache-'.$_GET['pageurl'].'.php';
    $url = 'http://www.domain.com/page.php?page='.$_GET['pageurl'];
} else {
    // invalid request: redirect to the home page
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: http://www.domain.com/');
    exit;
}
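The cache above is refreshed every 24 hours. If a page changes earlier, you could delete its cache file so the next request regenerates it. A small sketch; the directory is a placeholder, and invalidate_cache() is a hypothetical helper you would call after saving new content:

```php
<?php
// remove the cached copy of a page after its content was updated,
// so the next visitor triggers a fresh cache file (hypothetical helper)
function invalidate_cache($cache_dir, $pageurl) {
    $cache_file = $cache_dir . '/cache-' . $pageurl . '.php';
    if (file_exists($cache_file)) {
        unlink($cache_file);
    }
}
```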

Place this IF/ELSE statement where the variables $cache_file and $url were defined in the first example. The code in this tutorial is just an example; you need to adapt it to your own situation. Download the tutorial code for a quick start. If you have questions, comments or suggestions, please post them below, thanks!

Comments

  1. Awesome tutorial… This caching technique will definitely reduce server load on a custom cms

    Thanks..

  2. I think your PHP approach is very simplistic. The difficult part about caching isn’t caching itself, but cache invalidation. If your application can’t handle a single, well-behaved bot (like the Google indexer), that means it’s incorrectly configured or poorly designed.

    If your pages by design take very long to generate (i.e. you need to parse a lot of data or do remote API calls), you should pre-cache at intervals. Generating on demand will simply fail when more than one client tries to open the page at the same time.

    If your pages change rarely (which in itself is increasingly rare because of comments etc.) it’s better to invalidate cache on change instead of at intervals.

  3. Hello Pies,

    thanks for your comment!
    You’re right, the code example is very simple (maybe too simple). The first reason I wrote this article was to mention that PHP caching is important even for low-traffic sites.

    I agree about caching API calls, because they can slow down your site even more. Your suggestion to update the cache when the content changes is interesting, that makes sense. Every CMS should do that (even WordPress). My example was for a basic website without a CMS, but one that uses some PHP/MySQL code to show dynamic content.

    You wrote:

    Generating on demand will simply fail when more than one client tries to open the page at the same time.

    Are you saying I need to lock the file while the server is updating the cached version?

  4. Hello Sven, thanks for sharing this link, interesting results btw. ;)

  5. I took a slightly different approach to reading cache files,
    using Apache and C++ to avoid PHP,
    check it out:
    http://sven.webiny.com/advanced-cache-mechanism-using-php-cpp-and-apache/

  6. Nice article. One other thing that can really help if you have root access (a VPS or dedicated server) is a front-end/proxy cache like Varnish. This totally bypasses loading WordPress/PHP as well as Apache.

    One problem I do have with WordPress caching solutions: if you use WordPress comments and allow anonymous commenting, then as soon as someone comments, that user is no longer served cached pages. That’s due to the auto-fill feature of the comment form. One solution for WordPress would be to remove that feature from the PHP side and use a purely JavaScript approach (this is how Drupal does it).

  7. Hello Jamie,

    Thanks for the suggestion. On the Varnish “about” page I found this information:

    Varnish stores web pages in memory so the web servers don’t have to create the same web page over and over again.

    Doesn’t that mean you need more memory instead?

  8. Nice and helpful article. I have also built a custom PHP dba caching mechanism. You can find it at
    https://github.com/gjerokrsteski/php-dba-cache
    The php-dba-cache uses the database (dbm-style) abstraction layer to cache your objects, strings, integers or arrays.
    Even instances of SimpleXMLElement can be put into the cache.
    The advantage of dba caching is that you don’t have to worry about the size of the file; it only depends on your free disk space. So try it out and mail me your experience.

  9. Hello Gjero,

    thanks for the information. I have never used the dba functions before; do you have any resources on your website about how to install the PHP extension?

  10. @Olaf:
    I have no documentation about how to install the dba extension in PHP. But I think these links will help you:

    Installing extensions in PHP
    http://www.php.net/manual/en/install.php

    DBA extension installation
    http://www.php.net/manual/en/dba.installation.php

  11. Eee, eAccelerator? APC is better.

  12. Hi Matipl,
    does APC work with suPHP?

  13. Hi,

    Nice article, good read.

    The only one I’d add is PEAR Cache_Lite – it reduces your dynamic pages to flat files, deals with the file locking issue mentioned above and has an option to delete a cache file (for use in CMSs). Additionally, you can cache blocks of content too.

    Very easy to set up and use as well.

    Rob

  14. Hello Rob,

    thanks for the tip! I’m struggling with the lock problem. Let’s say I change the function to:


    function get_and_write($url, $cache_file) {
        $string = file_get_contents($url);
        $f = fopen($cache_file, 'c'); // 'c' does not truncate before the lock is acquired
        flock($f, LOCK_EX);
        ftruncate($f, 0);
        fwrite($f, $string, strlen($string));
        flock($f, LOCK_UN);
        fclose($f);
        return $string;
    }

    Would this solve the problem?
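    Or maybe writing to a temporary file first and then renaming it would avoid the locking completely? Just a sketch (note: rename() only replaces the file atomically when both files are on the same file system):

```php
function get_and_write($url, $cache_file) {
    $string = file_get_contents($url);
    // write to a temporary file, then move it into place;
    // readers never see a half-written cache file
    $tmp_file = $cache_file . '.tmp';
    file_put_contents($tmp_file, $string);
    rename($tmp_file, $cache_file);
    return $string;
}
```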

  15. Thanks for sharing! I just installed the WordPress plugin on my site and am waiting to see the results :)

  16. Another reason why caching is a good idea, not mentioned yet: your data may come from a remote API. For example, if you are serving content from the Amazon API, it can really help server load to cache all the data once and then refresh it only every 7 days, for example.

  17. Hello Lucas,

    you’re right, using an API without caching the responses will slow down your site, and maybe you’ll run into problems with your data provider too ;)
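    The file cache from the tutorial above would work for API responses too, just with a longer lifetime. A quick sketch; the URL and path are placeholders and cached_api_call() is a hypothetical helper:

```php
<?php
// return the API response from a local cache file and only fetch the
// remote URL again when the cache is older than $max_age seconds
function cached_api_call($url, $cache_file, $max_age = 604800) { // 7 days
    if (!file_exists($cache_file) || time() - filemtime($cache_file) > $max_age) {
        file_put_contents($cache_file, file_get_contents($url));
    }
    return file_get_contents($cache_file);
}
```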

  18. Great post;

    This is a nice, simple starting point. To those looking seriously into caching PHP, I recommend looking into database caching. File caching will be difficult under high load, and most database servers will keep your information in memory (if there’s enough load to make caching worthwhile, that is!)

    I’m adding a trackback to this post in my blog entry tonight on AdvancedCodingConcepts.blogspot.com as a weekly blog entry of note, attached to my article about application plugin interface development. My articles are language-independent when possible, even though my application platform itself is in PHP.

  19. Very nice post, thanks. It’s easy to forget how much caching affects a website’s responsiveness (and a computer’s, for that matter), which ultimately leads to bad end-user experiences and loss of sales… you get the picture.

  20. Yup,

    More than agree. I’ve experienced downtime errors when using AdWords to attract more visitors to my site. It was fine when traffic was quiet.

    I just got Super Cache installed, but am looking at W3 Total Cache now; I’ve heard that it works better, but am uncertain of the reason. Any idea?

  21. Super Cache is very easy to configure. W3 Total Cache is more popular and maybe better? Today I started to serve my static files using MaxCDN and they provide a plugin too (that could be a reason to switch).

  22. Great!!!
    A very nice blog post that uses very simple PHP code. Thanks for sharing.

  23. This is an interesting post, but I worry that the use of caching overlooks the issue of badly written code. Whilst I agree that 3rd party feeds and connections are obvious candidates for caching of the external data I would argue that this is just a sensible approach to coding in the first place.

    I am not really sure why you should need to cache standard pages served using dynamic database lookups. As long as the code that generates the page is efficient and the database is indexed in the correct way then the serving of the page shouldn’t be an issue.

    The workaround using PHP/Htaccess to look for a cached file (or to regenerate it) seems a bit cumbersome when simply rewriting the underlying code could resolve all the issues in a much better way.

  24. Hello Simon,
    thanks for your comment. I started this article because I needed a solution to stop some bots very quickly. Sure, the PHP/htaccess solution is very basic, and I think it’s the wrong method for many sites. In my own case I have started to rebuild the website where this code is active right now :(

    On the other hand, I think it was important to mention that “caching” matters even for sites with little traffic.

  25. Hi Olaf,

    Sorry, it wasn’t meant to be a criticism of you, rather a warning for people who may avoid looking for problems in their code by trying to use caching to increase speed rather than looking at the root of the problem.

    I was inspired by your article to write a blog on simple file caching for 3rd party APIs etc. A bit more of a narrow remit, but I think your readers may be interested in it. I reference you in the article with a link to your blog, so was wondering if you would mind me adding a link here? The blog is at http://www.bluelinemedia.co.uk/blog/howtospeedupthirdpartyscriptsonphpwebsites. There is an associated demo and class for people to use too.

    Thanks
    Simon

  26. Hello Simon,
    Sorry for the late reply here, but I was very busy the last few days :)
    It’s funny, you’re the second reader to comment that a website owner should care about his code more than “just” having a cached version of his website. I’ve been building websites for such a long time that I consider that a normal situation, so I didn’t mention it in this article, hehe.

    Your class is written very well and I think even the PHP beginner will understand the concept, thanks for sharing!

  27. Do you know of a good way to test the page rendering time to help decide if you should build in some caching functions? The ease of doing it in wordpress makes it an easy choice, but on a custom back-end, it would require a little more dev time.

  28. Hi,
    you can use this code to test the page rendering time:

    // put that code at the top of the script
    function microtime_float() { 
       list($usec, $sec) = explode(" ", microtime()); 
       return ((float)$usec + (float)$sec); 
    } 
    $time_start = microtime_float();
     
    // place here your other code
     
    $time_end = microtime_float();
    echo $duration = $time_end - $time_start;
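    By the way, since PHP 5 the helper function isn’t needed anymore; microtime() returns a float directly when you pass true:

```php
$time_start = microtime(true);

// place your other code here

$time_end = microtime(true);
echo $time_end - $time_start;
```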

    You should cache your site not only for speed; it’s much better for the overall performance of your web server.

Because of all the spam attempts I've decided to close the comment form at this time. If you have any questions or comments, please post them using Google+ or Twitter (the links to my profiles are located at the top of this page).