PHP Page Caching – Serving Static Files

Just yesterday I updated my app to cache pages and serve them as static files until updates arrive. It took me several hours to finalize, and now it is working quite well. Using Kohana v3, the app saves a page into a static HTML file and lets Apache serve that file the next time the page is requested. This improves performance and makes it less likely that my hosting account gets suspended.

The Process

The entire modification process is pretty simple and easy to implement. Since Kohana is so flexible and easy to extend, it took me only a few minutes to come up with the first draft. It was later refactored and optimized, so I will be posting the final version. We can summarize the process this way:

  1. Saving cached pages
  2. Apache rewrite
  3. Cleaning up

The Kohana side – saving cached pages

First, I created a directory structure for the cached pages. I decided to put it under the cache directory, inside the application directory:

application
  cache
    <kohana cache files>
    page
      <page cache files>
    .htaccess
  <the rest of the structure>

As you can see, I am going to save the cached pages under application/cache/page. I also added an .htaccess file that makes Apache send caching headers, so that browsers cache the HTML files for a given lifetime in seconds.

Next, I created the caching class, called Pagecache. The file is saved directly under the classes directory, which results in a short class name because I’m lazy. Here is the full content:

application/classes/pagecache.php

<?php defined('SYSPATH') or die('No direct script access.');

class Pagecache
{
	const CACHE_PATH = 'cache/page';
	
	/**
	 * File name
	 *
	 * @var string
	 */
	protected $_file;
	
	/**
	 * Factory pattern for creating page cache
	 *
	 * @param string $uri
	 * @return Pagecache
	 */
	public static function factory($uri)
	{
		return new self($uri);
	}
	
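	/**
	 * Empties the page cache directory: removes every cached file and
	 * subdirectory but keeps the directory itself
	 *
	 * @return boolean
	 */
	public static function cleanup()
	{
		$path = APPPATH . self::CACHE_PATH;
		// empty the directory but keep the directory itself
		return self::_delete_all($path, true);
	}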
	
	/**
	 * Deletes files and directories recursively
	 *
	 * @param string $directory		target dir
	 * @param boolean $empty		whether to delete the dir or just empty it
	 * @return boolean
	 */
	protected static function _delete_all($directory, $empty = false)
	{
		// always check since we could accidentally delete root
		if ($directory == '/')
		{
			return false;
		}
		
		// remove trailing slash
		if(substr($directory,-1) == "/")
		{ 
			$directory = substr($directory,0,-1); 
		} 
		
		// should be a valid dir
		if(!file_exists($directory) || !is_dir($directory))
		{ 
			return false; 
		}
		
		// dir should be readable
		if(!is_readable($directory))
		{ 
			return false; 
		}
		
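		$directoryHandle = opendir($directory); 
		if ($directoryHandle === false)
		{
			return false;
		}
	
		// compare against false explicitly: an entry named "0" is falsy
		while (false !== ($contents = readdir($directoryHandle)))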
		{ 
			if($contents != '.' && $contents != '..')
			{ 
				$path = $directory . "/" . $contents; 
	
				if(is_dir($path))
				{ 
					self::_delete_all($path); 
				}
				else
				{
					unlink($path);
				} 
			} 
		}
	
		closedir($directoryHandle); 
	
		if($empty == false)
		{ 
			if(!rmdir($directory))
			{ 
				return false; 
			} 
		} 
	
		return true; 
	}
	
	/**
	 * __construct()
	 *
	 * @param string $uri
	 * @return void
	 */
	protected function __construct($uri)
	{
		$this->_init_file($uri);
	}
	
	/**
	 * Initializes the file based on the uri
	 *
	 * @param string $uri
	 * @return $this
	 */
	protected function _init_file($uri)
	{
		$paths = explode('/', $uri);
		$base = APPPATH . self::CACHE_PATH;
		
		// create base path under the cache dir
		if (!is_dir($base))
		{
			mkdir($base, 0777);
			chmod($base, 0777);
		}
		
		// create the path to uri except for index.html
		$path = $base;
		foreach ($paths as $sub)
		{
			$path .= "/$sub";
			if (!is_dir($path))
			{
				mkdir($path, 0777);
				chmod($path, 0777);
			}
		}
		
		// cached page
		$this->_file = "$path/index.html";
		if (!file_exists($this->_file))
		{
			// Create the cache file
			file_put_contents($this->_file, '');
			
			// Allow anyone to write to cache files
			chmod($this->_file, 0666);
		}
		
		return $this;
	}
	
	/**
	 * Writes to cache
	 *
	 * @param string $data
	 * @return $this
	 */
	public function write($data)
	{
		file_put_contents($this->_file, $data);
		return $this;
	}
	
	/**
	 * Deletes a cached page
	 *
	 * @return boolean
	 */
	public function delete()
	{
		return unlink($this->_file);
	}
}

There’s nothing special in this code. The main methods responsible for caching are:

  • factory() – because this class uses the factory pattern to initialize a cached page
  • _init_file() – protected, initializes the page such as creating subdirectories depending on the URL
  • write() – writes the page to cache
  • cleanup() – removes all cached pages
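The mapping from a URI to a cache file that _init_file() performs can also be sketched as a standalone function. This is only an illustration under a few assumptions: page_cache_file() is a hypothetical name that is not part of the class above, it targets PHP 7 or later, it uses mkdir()'s recursive flag instead of a loop, and it rejects "." and ".." segments as an extra precaution.

```php
<?php
/**
 * Hypothetical helper (not part of the Pagecache class above): maps a
 * request URI to its cache file and creates any missing directories.
 */
function page_cache_file(string $base, string $uri): string
{
    // Drop empty segments so '', '/' and 'post//1/' behave predictably
    $segments = array_filter(explode('/', $uri), 'strlen');

    // Refuse '.' and '..' so a URI can never point outside $base
    foreach ($segments as $segment) {
        if ($segment === '.' || $segment === '..') {
            throw new InvalidArgumentException("Unsafe URI: $uri");
        }
    }

    $dir = rtrim($base, '/');
    if ($segments) {
        $dir .= '/' . implode('/', $segments);
    }

    // The third argument asks mkdir() to create nested directories at once
    if (!is_dir($dir) && !mkdir($dir, 0777, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create $dir");
    }

    return "$dir/index.html";
}
```

With a base of APPPATH . 'cache/page', the URI post/123 maps to post/123/index.html under that base, and an empty URI maps to index.html directly under it.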

The next step is to modify the behavior of our controller. In Kohana, it is pretty easy to capture the whole response via the after() method. I created a new template controller that automatically caches the page, and the controllers whose pages need caching extend it. Child controllers do not even need to know that their response is cached.

application/classes/controller/cached.php

<?php defined('SYSPATH') or die('No direct script access.');

/**
 * Cached pages
 *
 */
abstract class Controller_Cached extends Controller_Site
{	
	public function after()
	{
		parent::after();
		
		Pagecache::factory($this->request->uri)
			->write($this->request->response);
	}
}

It extends my template controller, which handles the main design of all front-end pages. It calls the parent’s after() method first so that the response is fully generated. Once the response is ready, we save it by calling Pagecache’s write() method. The response is saved into the cache directory, ready to be served by Apache.
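Outside Kohana, the same capture-then-save idea can be sketched with PHP's output buffering. This is a minimal sketch, not how Kohana builds its response: capture_to_file() is a hypothetical helper, it assumes PHP 7 or later, and the write-to-temp-then-rename step is an addition of mine so that Apache is unlikely to ever read a half-written page.

```php
<?php
/**
 * Hypothetical helper: runs $render, captures everything it echoes and
 * stores the result in $file. Not part of Kohana or the class above.
 */
function capture_to_file(callable $render, string $file): string
{
    ob_start();
    try {
        $render();
        $html = ob_get_clean();
    } catch (Throwable $e) {
        ob_end_clean(); // discard partial output and cache nothing
        throw $e;
    }

    // Write to a temporary file first, then rename it into place, so the
    // web server does not see a partially written page
    $tmp = $file . '.' . uniqid('tmp', true);
    if (file_put_contents($tmp, $html) === false || !rename($tmp, $file)) {
        throw new RuntimeException("Cannot write $file");
    }

    return $html;
}
```

The function returns the captured HTML, so the caller can still send it to the browser for the current request.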

The next step is to modify the controllers that need caching. We simply extend Controller_Cached and we’re done.

// main page
class Controller_Index extends Controller_Cached {}

// individual post page
class Controller_Post extends Controller_Cached {}

// about page
class Controller_About extends Controller_Cached {}

// and the rest

Controller actions and other methods need not be modified. Controllers that do not need caching work as usual. So what happens to those cached pages?

Apache rewrite

The rest of the job is for Apache. I had to modify Kohana’s recommended .htaccess and write some regular expressions, which I really dislike and still have not properly learned. I searched and found a rewrite rule that works in a Zend Framework tutorial on page caching.

.htaccess

# Turn on URL rewriting
RewriteEngine On

# Installation directory
RewriteBase /

# Protect hidden files from being viewed
<Files .*>
	Order Deny,Allow
	Deny From All
</Files>

# BEGIN Page cache

RewriteRule ^/(.*)/$ /$1 [QSA]
RewriteRule ^$ application/cache/page/index.html [QSA]
RewriteRule ^([^.]+)/$ application/cache/page/$1/index.html [QSA]
RewriteRule ^([^.]+)$ application/cache/page/$1/index.html [QSA]

# END Page cache

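# Protect application and system files from being viewed,
# except the page cache under application/cache/page.
# A negative lookahead is used here: square brackets would form a
# character class, not an exception.
RewriteRule ^(?:application(?!/cache/page/)|modules|system)\b - [F,L]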

RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d 

RewriteRule ^.*$ - [NC,L]
RewriteRule ^.*$ index.php [NC,L]

It simply serves the cached version of the requested URL if a non-empty cache file exists in the page cache directory; otherwise the request falls through to index.php. Under application/cache I put another .htaccess so that those HTML files are cached by browsers.
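To see which URLs end up at which cache file, the three cache rules can be replayed in PHP with preg_replace(). This is only a rough simulation: simulate_cache_rewrite() is a hypothetical helper, it assumes the path has already lost its leading slash (as is the case for rules in an .htaccess file), and it ignores flags, conditions and the other rules in the file.

```php
<?php
/**
 * Hypothetical helper that mimics the three cache rules: each rule is
 * applied in order to the result of the previous one, as mod_rewrite
 * does when no [L] flag stops processing.
 */
function simulate_cache_rewrite(string $path): string
{
    $rules = [
        '#^$#'         => 'application/cache/page/index.html',
        '#^([^.]+)/$#' => 'application/cache/page/$1/index.html',
        '#^([^.]+)$#'  => 'application/cache/page/$1/index.html',
    ];

    foreach ($rules as $pattern => $target) {
        $path = preg_replace($pattern, $target, $path);
    }

    return $path;
}
```

Because [^.]+ refuses any path containing a dot, a request such as media/style.css is left alone, and a path that was already rewritten to .../index.html is not rewritten a second time by the later rules.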

application/cache/.htaccess

# BEGIN supercache

<IfModule mod_headers.c>
  Header set Cache-Control 'max-age=300, must-revalidate'
</IfModule>
<IfModule mod_expires.c>
  ExpiresActive On
  ExpiresByType text/html A300
</IfModule>

# END supercache

The lifetime is kept short (300 seconds) because feeds are updated every 10 minutes.

Cleaning up

This is a design decision: I delete all cached pages when new feeds are imported. After some searching, I came up with a solution that deletes a directory’s contents recursively. It is included in the Pagecache class as a static method, and it empties the application/cache/page directory.

I call it in the import process, so the cache is cleared whenever feeds are imported.

if ($importer->import_count() > 0)
{
	// clear rss cache
	Model_Feed::clear_rss($this->_user_id);
	
	// clear page cache
	Pagecache::cleanup();
}
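For comparison, the same emptying step can be sketched with PHP's SPL iterators instead of opendir() and readdir(). This is an alternative I am suggesting, not the code used on the site: empty_directory() is a hypothetical function name and it assumes PHP 7 or later.

```php
<?php
/**
 * Hypothetical alternative to Pagecache::_delete_all(): removes everything
 * inside $directory but keeps $directory itself.
 */
function empty_directory(string $directory): bool
{
    // Same safety check as the original: never touch the filesystem root
    if (!is_dir($directory) || realpath($directory) === '/') {
        return false;
    }

    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($directory, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::CHILD_FIRST // children before their parent dir
    );

    $ok = true;
    foreach ($items as $item) {
        // Real directories are removed with rmdir(); files and symlinks
        // are unlinked
        if ($item->isDir() && !$item->isLink()) {
            $ok = rmdir($item->getPathname()) && $ok;
        } else {
            $ok = unlink($item->getPathname()) && $ok;
        }
    }

    return $ok;
}
```

The CHILD_FIRST mode matters here: a directory can only be removed with rmdir() once everything inside it is gone.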

You can view the site here: http://ff2fb.lysender.co.cc/
