SimplePageCache
Note: The recipes here are for PmWiki versions 0.6 and 1.0 only. For PmWiki 2.0 recipes, see Cookbook.
Goal
Provide caching functionality for Wiki pages, so that the HTML output does not have to be rebuilt each time.
Solution
Download cache.php.
Installation
Put cache.php in your local/ directory and add
include_once("local/cache.php");
to your local.php file. Temporarily change the permissions on the directory containing pmwiki.php to 2777, then use your browser to visit the homepage (or any other page) of the Wiki. This will create a wiki.cache directory with the right permissions. Change the permissions back afterwards.
Discussion
This simple script was inspired by the discussion on Development.WikiCaching. To avoid rebuilding the HTML of a page each time it is read, the output is captured and stored in a cache directory the first time the page is requested. Every later request is served the cached version directly from disk.
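The capture-and-replay idea can be sketched as follows. This is only an illustration, not the code in cache.php: the function name render_cached(), the $render callback, and the md5-based file naming are all hypothetical.

```php
<?php
// Hypothetical sketch of output capture; not the actual cache.php code.
// $render is any callable that prints the HTML for the given page name.
function render_cached($cacheDir, $pagename, $render) {
    $file = $cacheDir . '/' . md5($pagename) . '.html'; // assumed naming scheme
    if (is_file($file)) {
        return file_get_contents($file);   // cache hit: nothing is rebuilt
    }
    ob_start();                            // start capturing printed output
    call_user_func($render, $pagename);    // build the page as usual
    $html = ob_get_clean();                // fetch the buffer, stop capturing
    file_put_contents($file, $html);       // store for later requests
    return $html;
}
```

The caller echoes the returned string; on the second and later calls the callback is never invoked.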
To avoid serving stale versions of a page, the entire cache is invalidated each time a "post", "postattr" or "postupload" action is performed (or any other action that begins with "post", in order to allow WikiAdministrators to add post functionality of their own without changing the code). Also, the special action "resetcache" is provided to invalidate the cache without modifying files.
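The prefix rule described above might look roughly like this. The helper name action_invalidates_cache() is hypothetical, and the sketch only covers the value of $action; it ignores posts that arrive through other actions.

```php
<?php
// Hypothetical helper: true for "post", "postattr", "postupload", any other
// action whose name begins with "post", and the special "resetcache" action.
function action_invalidates_cache($action) {
    return strncmp($action, 'post', 4) === 0 || $action === 'resetcache';
}
```

Matching on the prefix rather than on a fixed list is what lets an administrator add a custom "post..." action without touching the cache code.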
The cache is also invalidated if any file in the "local/" subdirectory is changed. This way, one can develop code without having to invalidate the cache manually each time.
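One way to express both invalidation rules is to compare modification times. The sketch below assumes a stamp file that is touched on every invalidation and a flat local/ directory; the function names and parameters are hypothetical and not taken from cache.php.

```php
<?php
// Hypothetical sketch: newest modification time among the invalidation stamp
// file and every regular file directly inside the local/ directory.
function last_change_time($stampFile, $localDir) {
    $latest = is_file($stampFile) ? filemtime($stampFile) : 0;
    $paths = glob($localDir . '/*');
    foreach ($paths ? $paths : array() as $path) {
        if (is_file($path)) {
            $latest = max($latest, filemtime($path));
        }
    }
    return $latest;
}

// A cached copy is usable only if it is newer than the last change.
function cache_is_fresh($cacheFile, $stampFile, $localDir) {
    return is_file($cacheFile)
        && filemtime($cacheFile) > last_change_time($stampFile, $localDir);
}
```

With this scheme nothing has to be deleted on invalidation: touching the stamp file or editing a file in local/ makes every older cache file stale at once.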
You can use the browser directive [[nocache]] to disable caching for an individual page. You can also choose to enable caching only for a certain group or page by including the script in the configuration file for that group or page.
Note that you probably won't need this script unless you are running on an old machine or one shared among many users, use complex markup, or have long pages. On a modern machine, PmWiki will generally serve a simple page in less than 0.1 seconds without any caching support.
Implementation Details
The caching mechanism creates a wrapper around the HandleBrowse() function that uses ob_start() and its companion functions to capture the results. Because HandlePost() is called from HandleEdit(), the post actions have to be enumerated explicitly, based on $action and $HTTP_POST_VARS.
Configuring the Script
The caching mechanism can be enabled and disabled by setting or clearing the $EnableCache flag. If you set the $DebugCache flag, the page title will include the text [Cached] if a cached version has been served.
A third flag is $CacheAuthCheck. Normally, the caching mechanism assumes that all pages are readable by everybody, and would therefore serve cached pages even to readers that are not authenticated. By setting $CacheAuthCheck, authentication is strictly enforced (even for included pages), at the cost of a small overhead. Given that most Wikis do not have read-protected pages, the flag is off by default to get maximum performance. A compromise between speed and security is to only include the script in groups that do not have read-protected pages.
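As a configuration sketch, the three flags could be set in local.php as shown below. The flag names come from the text above; the values are only an example, and whether the assignments belong before or after the include of cache.php depends on how the script initializes its defaults, which is not stated here.

```php
<?php
// local.php fragment (sketch). Values are illustrative assumptions.
$EnableCache = 1;     // turn the caching mechanism on (0 to disable)
$DebugCache = 1;      // show [Cached] in the title of pages served from cache
$CacheAuthCheck = 1;  // enforce read authentication, at a small cost
```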
Important: If you are using the scripts/sessionauth.php script, you have to include it before including cache.php, because cache.php hooks itself into the authentication mechanism and will not know about sessionauth.php if it is included later.
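In local.php, the required order would look like this (paths as given in this recipe):

```php
<?php
// local.php fragment: load session-based authentication first, so that
// cache.php can hook into it when it is included afterwards.
include_once("scripts/sessionauth.php");
include_once("local/cache.php");
```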
Contributors
- Reimer Behrends
Comments
I found that after .touched was created its filemtime never changed. This problem was fixed when I altered InvalidateCache() to write to the file. Apparently under Darwin (Mac OSX), unlike Linux, merely clobbering an empty file does not update its modification time. I also changed InvalidateCache() to use $CacheTimeStamp:
function InvalidateCache() {
  //global $CacheDir;
  global $CacheTimeStamp;
  //$fp = fopen("$CacheDir/.touched", "w+");
  $fp = fopen("$CacheTimeStamp", "w+");
  fwrite($fp, '.');
  fclose($fp);
}
--Fred Henle -> mailto:henlef [snail] mercersburg [period] edu
Actually, why not simply use touch() to touch .touched?
function InvalidateCache() {
  global $CacheTimeStamp;
  touch("$CacheTimeStamp");
}
--Fred Henle -> mailto:henlef [snail] mercersburg [period] edu
I've changed cache.php to use touch().
-- Reimer Behrends
Great! I have a suggestion for a slight change in how pages are served after the cache is invalidated. The reason I want caching is that my most complex page takes 13 seconds to generate and serve. Before I started using SimplePageCache, the browser would start rendering the (partial) page immediately, and update it continually until it was done. With SimplePageCache, when that page has to be regenerated, the browser doesn't see anything for 13 seconds, then the whole thing appears when ready. I think this is because the output buffer isn't flushed until ob_end_flush() is called at shutdown. I see two potential solutions:
- Use ob_implicit_flush() to send the page to the browser as it is generated. I tried putting in a call to ob_implicit_flush() but I got the infamous "headers already sent" error so I must have been doing something wrong.
- Serve the (possibly slightly out of date) cache page with an immediate refresh/redirect to the regenerated page. That way the old page appears almost instantaneously, to be replaced by the new page whenever it's ready. There must be several ways to manage this.
I'm willing to try implementing the second option if there's no easy way to get the first option to work....
Okay, I have a small diff for cache.php which seems to work for me:
90,91c90,92
< if ($cachetime > $lastchange)
< {
---
> if (file_exists("$CacheFile.redo")) {
> unlink("$CacheFile.redo");
> } else {
108a110,113
> if ($cachetime < $lastchange) {
> touch("$CacheFile.redo");
> $contents = str_replace("</head>", "<meta http-equiv='Refresh' content='0; URL=$PHP_SELF' /></head>", $contents);
> }
It serves the invalidated cache page with an immediate refresh to the regenerated page. I don't know if there's a better way to do this....
I just noticed a problem with my patch, which is that the cached page displays before authentication. I'll have to try to figure out how to prevent that....
The following patch should handle incremental output better. It flushes the output once per input line. I'll fold it into the main file as soon as it has seen some more extensive testing.
38a39,48
> $DoubleBrackets['/^/e'] = 'IncrementalFlushCache()';
> $CacheStoreIncremental = '';
>
> function IncrementalFlushCache()
> {
> global $CacheStoreIncremental;
> $CacheStoreIncremental .= ob_get_contents();
> ob_flush();
> }
>
66c76,77
< $CacheSavedAuth, $CacheReadAuthPages, $CacheFailedAuth;
---
> $CacheSavedAuth, $CacheReadAuthPages, $CacheFailedAuth,
> $CacheStoreIncremental;
125c136,137
< "included" => $CacheReadAuthPages, "html" => ob_get_contents());
---
> "included" => $CacheReadAuthPages,
> "html" => $CacheStoreIncremental . ob_get_contents());
-- Reimer Behrends
I had a problem where, in cached pages, the page name was written just before the <DOCTYPE .. declaration. After searching for a while, I found that the following line seemed to be the problem:
// Send our own headers, not the PHP default.
PrintFmt("headers:", $pagename);
I'm not sure what is done here, but commenting out this line helps. Used with pmwiki 0.6.14.
-- Svogel
Things might actually work even faster if you set up PmWiki as an Apache ErrorDocument handler, so that it only gets called at all if the cached page doesn't exist. See FourOhFourCache for more information about this technique. --EvanProdromou
This recipe works for 0.6.17 except for one thing: the Search function is hijacked and rendered useless when SimplePageCache is enabled. I had to add [[nocache]] to the top of the Search page. -- TreverMiller
pmwiki-2.3.38 -- Last modified by Arrowman