SpamFilters
Description
Automatic blocking of some spambots.
The recipe offers a way to block a number of spambots (programs posting spam on wikis and forums). Six methods are used: a honeypot, blocking raw HTML links, analysis of edit summaries, a post size check, an edit password for empty groups, and unlinking recently deleted pages.
Honeypot: some of the spambots will try to fill all fields of the edit form. We will add two hidden form fields, invisible to a human user. When the form is submitted, if the fields are filled or modified, then this is very likely a spambot trying to post, so we refuse to save the form.
Blocking HTML links: if the posted text contains raw HTML links like <a href=...> which are not escaped with [=...=] or [@...@], then the edit form is blocked. Some spambots try to post raw HTML; even if it wouldn't work in a wiki page, cleaning it would be annoying, so we just block it and issue a message for a human user on how to escape the HTML in order to save the page.
Edit summaries: some spambots will fill the "edit summary" field with random uppercase and lowercase characters (examples: [1], [2], [3]). We will block most of these posts if the edit summary doesn't look like a word or a sentence: mixed uppercase and lowercase letters, or too many consonants in a row without a vowel. Note that sometimes a real user may be blocked, with a message to change the edit summary, and sometimes a spambot may post successfully, but this filter works in most cases. The code proposed below will allow $MixedCaseVariables and `EscapedText; that is, if the filter blocks your edit summary, insert a backtick ` before the words, functions or variables that do not look like language.
Post size: some spambots will deface a page, replacing the content with a short paragraph with links. We can block the saving of the posted content if it is less than half the size of the previous content and more than 200 characters were removed. Note that this may be annoying for real users/admins who try to refactor, clean up or delete some pages, so we enable it only for specific pages which are often defaced.
Empty groups: some spambots create pages in new wikigroups, and in order to clean up the mess, one has to delete the spam pages, delete the group's RecentChanges page, and possibly clean up the Site.AllRecentChanges page. We can conditionally set an "edit" password, even an open or community-known one, for groups that do not have a *.RecentChanges page (usually empty groups).
Unlink recently deleted pages: some spambots follow links from *.*RecentChanges pages, and by default, recently deleted pages have direct links to the edit form. This filter deactivates such links.
Variants of these filters have been used on pmwiki.org for several years.
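The raw HTML link check described above can be tried outside PmWiki with a minimal sketch like the one below. The function names are hypothetical, and strip_escaped() is only a simplified stand-in for PmWiki's MarkupEscape(): it assumes that removing the [=...=] and [@...@] spans is close enough for illustration.

```php
<?php
// Simplified stand-in for MarkupEscape() (an assumption for this sketch):
// escaped spans [=...=] and [@...@] are simply removed from the text.
function strip_escaped(string $text): string {
    return preg_replace('/\[([=@]).*?\1\]/s', '', $text);
}

// Hypothetical helper: true if an unescaped HTML anchor remains in the text.
function has_raw_anchor(string $text): bool {
    return preg_match('/<a +href=/i', strip_escaped($text)) === 1;
}

var_dump(has_raw_anchor('Buy <a href="http://example.com">pills</a>'));  // bool(true)
var_dump(has_raw_anchor('Write [=<a href="...">=] to show the markup')); // bool(false)
```

A human author who really wants to show HTML source only has to wrap it in one of the two escape markups for the check to pass.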
Installation
To set the two hidden honeypot fields, edit the wikipage Site.EditForm and insert the following line before (:input end:):
  (:input hidden code1 7264:)%comment%Enter code: (:input text code2:)%%
If your skin uses a different edit form, edit the skin's edit form instead. Do this before enabling the config.php code below: without the hidden fields in the form, the honeypot check will reject every edit.
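As an illustration of what happens to these two fields on submission, here is a minimal standalone sketch. The function honeypot_tripped() is hypothetical and not part of the recipe; it only mirrors the idea that code1 must come back as 7264 and code2 must stay empty.

```php
<?php
// Hypothetical helper, not part of the recipe: returns true when the
// submitted fields suggest that a bot filled in or altered the honeypot.
function honeypot_tripped(array $request): bool {
    $code1 = (string)($request['code1'] ?? '');
    $code2 = (string)($request['code2'] ?? '');
    // code1 must come back unchanged; code2 must stay empty.
    return $code1 !== '7264' || $code2 !== '';
}

var_dump(honeypot_tripped(['code1' => '7264', 'code2' => '']));     // bool(false)
var_dump(honeypot_tripped(['code1' => '7264', 'code2' => 'spam'])); // bool(true)
var_dump(honeypot_tripped([]));                                     // bool(true)
```

The last call shows why the form must be edited first: a form that does not send code1 at all is treated like a tampered one.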
Place this near the beginning of your file local/config.php or local/farmconfig.php:
  ## if an edit form is posted
  if ($action == 'edit' && preg_grep('/^post/', array_keys(@$_POST)) ) {
    $tmp_csum = trim(@$_POST['csum']);
    $tmp_csum = preg_replace('/[$`]\\w+/', '', $tmp_csum); # allow $Vars and `Text

    ## honeypot fields
    if (@$_REQUEST['code1']!='7264' || @$_REQUEST['code2'] > ''){
      $WhyBlockedFmt[] = 'Invalid code entered';
    }
    ## edit summary doesn't look like language
    elseif ($tmp_csum && (
      preg_match("/^\\w*([a-z]+[A-Z]{2,})\\w*$/", $tmp_csum)
      || preg_match("/[bcdfghjklmnpqrstvwxz]{5,}/i", $tmp_csum)
    ) ) {
      $WhyBlockedFmt[] = 'Invalid "edit summary" entered, please select a different one.';
    }
    ## raw HTML anchors
    elseif(@$_POST['text']>'' && preg_match("/(<|[<])a +href=/i", MarkupEscape($_POST['text']))) {
      $WhyBlockedFmt[] = 'HTML code needs to be escaped with [=code=] or [@code@].';
    }
    ## if some of the above filters activated, block the post
    ## (!empty() is also safe when no filter has set $WhyBlockedFmt)
    if(!empty($WhyBlockedFmt)) {
      $EnablePost = 0;
      $IsBlocked = 1;
    }
  }
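To see which edit summaries the two patterns above would reject, the following minimal sketch wraps them in a standalone function. The name summary_looks_random() is hypothetical; the patterns are the ones from the code above, written with single quotes.

```php
<?php
// Hypothetical wrapper around the recipe's two edit-summary patterns.
function summary_looks_random(string $csum): bool {
    $s = trim($csum);
    // Drop $Variables and `Escaped words, as the recipe does.
    $s = preg_replace('/[$`]\w+/', '', $s);
    if ($s === '') return false;
    // A single token with lowercase letters followed by 2+ capitals...
    $mixedCase = preg_match('/^\w*([a-z]+[A-Z]{2,})\w*$/', $s) === 1;
    // ...or five or more listed consonants in a row anywhere in the summary.
    $consonants = preg_match('/[bcdfghjklmnpqrstvwxz]{5,}/i', $s) === 1;
    return $mixedCase || $consonants;
}

var_dump(summary_looks_random('hcPQeUmfVt'));            // bool(true)
var_dump(summary_looks_random('Fixed typo'));            // bool(false)
var_dump(summary_looks_random('`WikiWORD'));             // bool(false)
var_dump(summary_looks_random('Fix strengths section')); // bool(true)
```

The last call is an example of the false positives mentioned in the description: "strengths" contains five consonants in a row ("ngths"), so a real author would have to rephrase the summary or write `strengths.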
If you want to install the Post size filter, add to the same file the following:
  function PageTextSize($pagename, $page, $new) {
    global $EnablePost, $IsBlocked, $WhyBlockedFmt, $MessagesFmt;
    if (!$EnablePost) return;
    $L1 = strlen($new['text']);
    $L0 = strlen($page['text']);
    if(!$L0) return; # page is new or was empty
    if ( $L1/$L0 < .5 && $L0-$L1>200) { # more than half AND more than 200 characters removed
      $EnablePost = 0;
      $IsBlocked = 1;
      $WhyBlockedFmt[] = $MessagesFmt[] = 'You tried to remove a large part of the page content.';
    }
  }

  ## to enable it on all pages, remove the # before the next line
  # array_unshift($EditFunctions, 'PageTextSize');

  ## we enable it on selected pages only
  if(preg_match('/^PmWiki\\.(Questions|PmWikiUsers)$/', $pagename))
    array_unshift($EditFunctions, 'PageTextSize');
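The size test in PageTextSize() combines a ratio and an absolute threshold. A minimal sketch of the same arithmetic, using a hypothetical helper removes_too_much() that takes the old and new text lengths, may make the boundary cases easier to see:

```php
<?php
// Hypothetical helper mirroring the size test in PageTextSize().
function removes_too_much(int $oldLen, int $newLen): bool {
    if ($oldLen === 0) return false;   // page is new or was empty
    return $newLen / $oldLen < 0.5     // less than half of the text remains
        && $oldLen - $newLen > 200;    // and more than 200 characters are gone
}

var_dump(removes_too_much(1000, 400)); // bool(true):  40% left, 600 removed
var_dump(removes_too_much(300, 120));  // bool(false): 40% left, only 180 removed
var_dump(removes_too_much(1000, 600)); // bool(false): 60% left
```

The second condition keeps the filter from blocking edits to short pages, where halving the text removes only a few lines.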
If you want to install the Empty groups filter, add near the end of config.php:
  if ($action=='edit' && ! PageExists(
    preg_replace('/[\\/\\.].*$/', '.RecentChanges', $pagename)))
      $DefaultPasswords['edit'] = pmcrypt('PASSWORD');
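The preg_replace() call above derives the name of the group's RecentChanges page from the current page name. A minimal sketch, with a hypothetical wrapper function group_recent_changes(), shows what it produces for both separator styles:

```php
<?php
// Hypothetical wrapper around the recipe's expression: everything from
// the first "/" or "." onward is replaced by ".RecentChanges".
function group_recent_changes(string $pagename): string {
    return preg_replace('/[\\/\\.].*$/', '.RecentChanges', $pagename);
}

echo group_recent_changes('NewGroup.SpamPage'), "\n"; // NewGroup.RecentChanges
echo group_recent_changes('Main/HomePage'), "\n";     // Main.RecentChanges
```

If that page does not exist, the group is treated as empty and the edit password applies.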
To unlink recently deleted pages, add this to the bottom of local/config.php:
  # spambots abuse recently deleted pages Cookbook:SpamFilters
  if(preg_match('/RecentChanges/', $pagename)) {
    $LinkPageCreateFmt = "<a class='createlinktext' rel='nofollow' title='\$LinkAlt' href='#{\$FullName}'>\$LinkText</a>";
  }
Configuration
Usage
Notes
Change log / Release notes
- 20170619 - Added Backtick escape character to Summary filter.
- 20150412 - Added "Empty groups" filter
- 20121020 - first public release
See also
On pmwiki.org, we have also enabled UrlApprovals and Blocklist.
- PmWiki /
- Blocklist Blocking IP addresses, phrases, and expressions to counteract spam and vandalism.
- Security Resources for securing your PmWiki installation
- UrlApprovals Require approval of Url links
- Cookbook /
- OpenPass Set a global password which is openly displayed to reduce spam (Alpha)
- OpenPass-Talk Talk page for OpenPass.
- RecentChangesDeletion Allow authors to delete RecentChanges pages, thereby making it possible for authors to delete wiki groups.
- Security Security authentication and authorization methods and systems
- TrackChanges Ways to more easily detect and verify all recent edits
Contributors
- Recipe written and maintained by Petko (5ko [snail] 5ko [period] fr). The honeypot code was written by Pm.
Comments
See discussion at SpamFilters-Talk