00966: Aggressive static page cache with mod_rewrite
Summary: Aggressive static page cache with mod_rewrite
Created: 2007-08-11 08:46
Status: Closed (Cookbook.FastCache)
Category: Feature
From: Thomas Bley
Assigned:
Priority: 544
Performance: Attach:with_static_cache.txt Attach:without_static_cache.txt
See Cookbook.FastCache for a possible implementation of this.
-- EemeliAro September 17, 2007, at 07:58 AM
Version: 2.2.0 beta
OS: Apache / mod_rewrite / PHP 5.2.3
Description:
Web servers perform best when they serve plain HTML without invoking PHP/Perl/Java.
A static cache driven by mod_rewrite bypasses PHP entirely and serves pages much faster.
It also speeds up dynamic pages, because the server has more resources left for them.
Apache's mod_rewrite can make a rule conditional on a file existing (RewriteCond ... -f).
E.g. when a file like Pagename,cache.html exists in a cache directory,
the request is rewritten to the cache file; otherwise it goes to "pmwiki.php".
Usually some pages of a site are dynamic and others are static.
A way to enable/disable the cache per page would be a directive similar to (:title:):
(:static-cache on:)
or, if the static cache is on by default:
(:static-cache off:)
or (more cache control, with expiry):
(:static-cache active=on expires=day/hour/month:)
or (more cache control, with dependency checking):
(:static-cache active=on dependencies=pagename1,pagename2,@group1,@group2:)
or (more cache control, with dependencies given as a regexp):
(:static-cache active=on dependencies_regexp=pagename?|group?\..+:)
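As a rough illustration of how the directive variants above could be read, here is a minimal sketch of an argument parser. The function name parse_static_cache_args and the returned array shape are hypothetical; only the option names (active, expires, dependencies, dependencies_regexp) come from the proposal, and treating a directive without "active=" as enabled is an assumption.

```php
<?php
// Hypothetical helper: parse the argument string of a (:static-cache ...:) directive.
// Function name and return shape are assumptions made for illustration.
function parse_static_cache_args(string $args): array
{
    $opts = [
        'active' => true,              // assumption: the directive enables caching unless told otherwise
        'expires' => null,             // e.g. "day", "hour", "month"
        'dependencies' => [],          // page names, or "@group" for a whole group
        'dependencies_regexp' => null, // raw pattern, not validated here
    ];
    $args = trim($args);
    // Bare form: "(:static-cache on:)" or "(:static-cache off:)".
    if ($args === 'on' || $args === 'off') {
        $opts['active'] = ($args === 'on');
        return $opts;
    }
    // key=value form; values are assumed to contain no whitespace.
    foreach (preg_split('/\s+/', $args, -1, PREG_SPLIT_NO_EMPTY) as $pair) {
        if (strpos($pair, '=') === false) {
            continue; // ignore tokens that are not key=value
        }
        list($key, $value) = explode('=', $pair, 2);
        if ($key === 'active') {
            $opts['active'] = ($value === 'on');
        } elseif ($key === 'expires' || $key === 'dependencies_regexp') {
            $opts[$key] = $value;
        } elseif ($key === 'dependencies') {
            // Split the comma-separated list and drop empty entries.
            $opts['dependencies'] = array_values(array_filter(explode(',', $value), 'strlen'));
        }
    }
    return $opts;
}
```

Inside PmWiki itself the existing argument-parsing helpers would probably be a better fit; the standalone version is only meant to show the intended semantics.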
I'm already using this on www.simple-groupware.de and it's working great.
Automatic cache expiry can also be done by adding time variables to the filename
in the mod_rewrite rule. The available variables are:
TIME_YEAR, TIME_MON, TIME_DAY, TIME_HOUR, TIME_MIN, TIME_SEC, TIME_WDAY, TIME
=> E.g. to cache for 1 day:
RewriteRule ^([^/a-z].*) cache.s/cms/$1_%{TIME_DAY}_%{TIME_MON}_%{TIME_YEAR}_.html [QSA,L]
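For the 1-day rule above to find anything, the PHP side has to write the cache file under a name with the same date suffix. A minimal sketch, assuming mod_rewrite's zero-padded two-digit day and month and four-digit year, and assuming Apache and PHP use the same time zone. The function name dated_cache_name is hypothetical.

```php
<?php
// Hypothetical helper: build the file name that the 1-day rewrite rule looks for,
// i.e. "<page>_<TIME_DAY>_<TIME_MON>_<TIME_YEAR>_.html".
function dated_cache_name(string $pagename, int $timestamp): string
{
    // 'd' = zero-padded day, 'm' = zero-padded month, 'Y' = four-digit year.
    // Assumption: PHP's default time zone matches the server's local time.
    return $pagename . '_' . date('d_m_Y', $timestamp) . '_.html';
}

// Name to use for a page rendered right now:
$name = dated_cache_name('Main/HomePage', time());
```

Files from previous days are never matched again, so they would need to be deleted separately (e.g. by a cron job).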
The code would be something like this (parts in <...> are pseudocode):
function HandleBrowse($pagename, $auth = 'read') {
  global $EnablePathInfo;
  ...
  $FmtV['$PageText'] = MarkupToHTML($pagename, $text, $opt);
  // with path info enabled, Group.Page is cached as Group/Page
  $cachename = $EnablePathInfo ? str_replace(".", "/", $pagename) : $pagename;
  // don't cache POST requests, sessions or ?action=xyz
  $active = empty($_POST) && empty($_SESSION)
    && (empty($_GET) || array_keys($_GET) == array("n"));
  if (<page_has_static_param_on> && $active && <page doesn't require authentication> &&
      <lastmod_pagename greater than lastmod_cache.s/$cachename,cache.html>
  ) {
    <foreach dependencies as $dep_pagename { unlink "cache.s/$dep_pagename,cache.html" }>
    if ($EnablePathInfo) mkdirp(dirname("cache.s/$cachename,new"));
    // mode "x" fails if the file already exists, so only one request writes it
    if ($fp = @fopen("cache.s/$cachename,new", "x")) {
      fwrite($fp, <full_page_content>);
      fclose($fp);
      // rename so that Apache never serves a half-written cache file
      rename("cache.s/$cachename,new", "cache.s/$cachename,cache.html");
    }
  }
  ...
}
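The write step of the code above (create a ",new" file exclusively, then rename it) and the staleness check can be sketched as standalone PHP. This is only an illustration under assumptions: the function names static_cache_is_stale and write_static_cache are hypothetical, and the cache directory is passed in rather than hard-coded as cache.s.

```php
<?php
// Sketch only: function names and parameters are assumptions, not PmWiki API.

// True when the cache file is missing or older than the page's last modification.
function static_cache_is_stale(string $cacheFile, int $pageLastMod): bool
{
    return !is_file($cacheFile) || filemtime($cacheFile) < $pageLastMod;
}

// Write $html to "$cacheDir/$pagename,cache.html" via a temporary ",new" file.
function write_static_cache(string $cacheDir, string $pagename, string $html): bool
{
    $tmp = "$cacheDir/$pagename,new";
    $final = "$cacheDir/$pagename,cache.html";
    $dir = dirname($tmp);
    // Create Group/ subdirectories as needed (the role mkdirp plays in the ticket's code).
    if (!is_dir($dir) && !mkdir($dir, 0755, true) && !is_dir($dir)) {
        return false;
    }
    // Mode "x" fails if the file exists, so two requests cannot write the same file.
    $fp = @fopen($tmp, 'x');
    if ($fp === false) {
        return false;
    }
    $ok = (fwrite($fp, $html) === strlen($html));
    fclose($fp);
    if (!$ok) {
        unlink($tmp); // do not leave a partial file behind
        return false;
    }
    // The finished file replaces any old cache file in a single step.
    return rename($tmp, $final);
}
```

One caveat of this scheme: if a request dies between fopen and rename, the leftover ",new" file blocks later writes until it is removed, so a cleanup of old ",new" files may be needed.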
# Some example mod_rewrite rules:
# If charset needs to be UTF-8
AddCharset UTF-8 .html
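# using $EnablePathInfo = 0;
# only cache GET requests whose query string is exactly n=Pagename (no action=edit etc.)
RewriteCond %{REQUEST_METHOD} ^GET$
# RewriteRule patterns never see the query string, so it is matched here instead
RewriteCond %{QUERY_STRING} ^n=([^/a-z&][^&]*)$
RewriteCond %{DOCUMENT_ROOT}/cms/cache.s/%1,cache.html -f
# the trailing "?" drops the query string from the rewritten URL
RewriteRule ^pmwiki\.php$ cache.s/%1,cache.html? [L]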
# use pmwiki.php if no cache is available / dynamic page
RewriteRule ^([^/a-z].*) pmwiki.php?n=$1 [QSA,L]
# using $EnablePathInfo = 1;
# don't cache if a query is given (e.g. action=edit); only cache GET requests
RewriteCond %{REQUEST_METHOD} ^GET$
RewriteCond %{QUERY_STRING} ^$
# REQUEST_URI includes the /cms prefix, so the cache file lives under cache.s/cms/
RewriteCond %{DOCUMENT_ROOT}/cms/cache.s%{REQUEST_URI},cache.html -f
RewriteRule ^([^/a-z].*) cache.s/cms/$1,cache.html [QSA,L]
# use pmwiki.php if no cache is available / dynamic page
RewriteRule ^([^/a-z].*) pmwiki.php?n=$1 [QSA,L]
From martin:
An option to clear the static folder cache (cache.s/*) using: ?action=clearStaticPages
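A minimal sketch of what such a clearing step could look like, assuming the cache files all end in ",cache.html" as in the code above. The function name clear_static_cache is hypothetical. In PmWiki it would presumably be wired up as a custom action (e.g. through $HandleActions['clearStaticPages']) and restricted to admins, which is not shown here.

```php
<?php
// Hypothetical helper for ?action=clearStaticPages: delete all generated cache
// files below $cacheDir and return how many were removed.
function clear_static_cache(string $cacheDir): int
{
    if (!is_dir($cacheDir)) {
        return 0;
    }
    $suffix = ',cache.html';
    $targets = [];
    $it = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($cacheDir, FilesystemIterator::SKIP_DOTS)
    );
    // Collect first, delete afterwards, so the directory is not modified while iterating.
    foreach ($it as $file) {
        if ($file->isFile() && substr($file->getFilename(), -strlen($suffix)) === $suffix) {
            $targets[] = $file->getPathname();
        }
    }
    $removed = 0;
    foreach ($targets as $path) {
        if (@unlink($path)) {
            $removed++;
        }
    }
    return $removed;
}
```

Only files with the cache suffix are touched, so anything else stored in the directory (an .htaccess file, for instance) is left alone.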
Definition of normal PageCache:
- should handle markups dealing with the user's identity ($Author, $Authid, ReadProtectedPage, etc.)
- should handle time and date dependent markup (now, today, tomorrow, etc.)
- should handle randomized markup (rand, captcha, etc.)
- should handle markup like (:title:) correctly
- always active unless mod_rewrite has already served the request from the static cache
Definition of static PageCache used by this ticket:
- active if (:static ... :) is used
- should only handle markup for anonymous (unauthenticated) users
- should handle time and date dependent markup once for building the cache
- should handle randomized markup once for building the cache (rand, captcha, etc.)
- should handle markup like (:title:) once for building the cache
- a user who doesn't want the static PageCache simply turns (:static ... :) off
Regards,
Thomas