00966: Aggressive static page cache with mod_rewrite
Summary: Aggressive static page cache with mod_rewrite
Created: 2007-08-11 08:46
Status: Closed (Cookbook.FastCache)
Category: Feature
From: Thomas Bley
Assigned:
Priority: 544
Performance: Attach:with_static_cache.txt Attach:without_static_cache.txt
See Cookbook.FastCache for a possible implementation of this.
-- EemeliAro September 17, 2007, at 07:58 AM
Version: 2.2.0 beta
OS: Apache / mod_rewrite / PHP 5.2.3
Description: Web servers perform best when they serve plain HTML without invoking PHP, Perl, or Java. A static cache driven by mod_rewrite bypasses PHP and delivers pages much faster. It also speeds up dynamic pages, because the server has more resources left for them.

Apache's mod_rewrite supports conditional file-exists rules. For example, when a file such as Pagename,cache.html exists in a cache directory, the request is rewritten to the cache file; otherwise it goes to "pmwiki.php".

Often some web pages are dynamic and others are static. One way to enable or disable the cache per page would be a directive similar to (:title:):

    (:static-cache on:)

or, if static-cache is always on:

    (:static-cache off:)

or, for more cache control with expiry:

    (:static-cache active=on expires=day/hour/month:)

or, for more cache control with dependency checking:

    (:static-cache active=on dependencies=pagename1,pagename2,@group1,@group2:)

or, for more cache control with dependencies given as a regular expression:

    (:static-cache active=on dependencies_regexp=pagename?|group?\..+:)

I'm already using this on www.simple-groupware.de and it's working great.

Automatic cache expiry can also be done by adding time variables to the filename in the mod_rewrite rule. The available variables are TIME_YEAR, TIME_MON, TIME_DAY, TIME_HOUR, TIME_MIN, TIME_SEC, TIME_WDAY, and TIME. For example, to cache for one day:

    RewriteRule ^([^/a-z].*) cache.s/cms/$1_%{TIME_DAY}_%{TIME_MON}_%{TIME_YEAR}_.html [QSA,L]

The code would be something like this:

    function HandleBrowse($pagename, $auth = 'read') {
      ...
      $FmtV['$PageText'] = MarkupToHTML($pagename, $text, $opt);
      if ($EnablePathInfo) $pagename = str_replace(".", "/", $pagename);

      // don't cache POST requests or ?action=xyz
      $active = false;
      if (empty($_POST) and empty($_SESSION) and
          (empty($_GET) or array_keys($_GET) == array("n"))) $active = true;

      if (<page_has_static_param_on> and $active and
          <page doesnt require authentication> and
          <lastmod_pagename greater than lastmod_cache.s/$pagename,cache.html>) {
        <foreach dependencies as $dep_pagename { unlink "cache.s/$dep_pagename,cache.html" }>
        if ($EnablePathInfo) mkdirp(dirname("cache.s/$pagename,new"));
        // write to a temporary file first, then rename it into place
        if ($fp = @fopen("cache.s/$pagename,new", "x")) {
          fwrite($fp, <full_page_content>);
          fclose($fp);
          rename("cache.s/$pagename,new", "cache.s/$pagename,cache.html");
        }
      }
    }

    # Some example mod_rewrite rules:

    # If charset needs to be UTF-8
    AddCharset UTF-8 .html

    # using $EnablePathInfo = 0;
    # don't cache if a query is given (e.g. action=edit), only cache GET requests
    RewriteCond %{REQUEST_METHOD} ^GET$
    RewriteCond %{DOCUMENT_ROOT}/cms/cache.s%{QUERY_STRING}.html -f
    RewriteRule pmwiki.php?n=([^/a-z].*) cache.s/$1,cache.html [QSA,L]
    # use pmwiki.php if no cache is available / dynamic page
    RewriteRule ^([^/a-z].*) pmwiki.php?n=$1 [QSA,L]

    # using $EnablePathInfo = 1;
    # don't cache if a query is given (e.g. action=edit), only cache GET requests
    RewriteCond %{REQUEST_METHOD} ^GET$
    RewriteCond %{QUERY_STRING} ^$
    RewriteCond %{DOCUMENT_ROOT}/cms/cache.s%{REQUEST_URI},cache.html -f
    RewriteRule ^([^/a-z].*) cache.s/cms/$1.html [QSA,L]
    RewriteRule ^([^/a-z].*) pmwiki.php?n=$1 [QSA,L]

From martin: An option to clear the static folder cache (cache.s/*) using: ?action=clearStaticPages

Definition of normal PageCache:
- should handle markup dealing with the user's identity ($Author, $Authid, ReadProtectedPage, etc.)
- should handle time- and date-dependent markup (now, today, tomorrow, etc.)
- should handle randomized markup (rand, captcha, etc.)
- should handle markup like (:title:) correctly
- always active if mod_rewrite has not already finished the request

Definition of static PageCache used by this ticket:
- active if (:static ... :) is used
- should only handle markup for anonymous (unauthenticated) users
- should handle time- and date-dependent markup once, when building the cache
- should handle randomized markup once, when building the cache (rand, captcha, etc.)
- should handle markup like (:title:) once, when building the cache
- if the user doesn't want the static PageCache, they simply turn (:static ... :) off

Regards, Thomas
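
For illustration, the cache-writing step in the HandleBrowse sketch above (open a ",new" file in "x" mode, write it, then rename it to ",cache.html") could be isolated roughly as follows. This is only a minimal sketch under stated assumptions: write_static_cache() and its parameters are hypothetical names, not part of PmWiki, and PHP's built-in recursive mkdir() stands in for mkdirp() so that the snippet runs on its own.

```php
<?php
// Hypothetical helper (not a PmWiki function): stores $html as
// "$cacheDir/$pagename,cache.html" by way of a temporary ",new" file.
function write_static_cache($cacheDir, $pagename, $html)
{
    $final = "$cacheDir/$pagename,cache.html";
    $tmp   = "$cacheDir/$pagename,new";

    // With $EnablePathInfo the page name contains "/", so the
    // group subdirectory may have to be created first.
    $dir = dirname($tmp);
    if (!is_dir($dir) && !@mkdir($dir, 0777, true) && !is_dir($dir)) {
        return false;
    }

    // Mode "x" fails if the file already exists, so two concurrent
    // requests cannot both write the same temporary file.
    $fp = @fopen($tmp, 'x');
    if ($fp === false) {
        return false;
    }
    $ok = fwrite($fp, $html) !== false;
    fclose($fp);

    if (!$ok) {
        unlink($tmp);
        return false;
    }

    // Renaming afterwards means the web server never sees a
    // half-written cache file under the final name.
    return rename($tmp, $final);
}
```

A call such as write_static_cache('cache.s', 'Main/HomePage', $fullPageHtml) would then correspond to the fopen/fwrite/rename lines of the sketch, with $fullPageHtml as a placeholder for the rendered page. One caveat of this approach: a ",new" file left behind by an aborted request blocks later writes until it is removed.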