EnableHTML

Summary: How to include HTML markup in wiki pages
Version:
Prerequisites:
Status:
Maintainer:
Categories: Forms, Markup

Question

Is it possible to include HTML markup in wiki pages?

Answer

By default (and by design), PmWiki does not support the use of HTML elements in the editable markup for wiki pages. There are a number of reasons for this described in the PmWikiPhilosophy and PmWiki.Audiences. Basically, Pm feels that enabling HTML markup within wiki pages in a collaborative environment has the effect of excluding some potential authors from being able to edit pages, as well as posing a number of display and security issues. Indeed, for complex markup sequences it's often better to design a CustomMarkup to provide the desired functionality, as this allows things to be easily customized and changed in the future without having to re-edit a lot of pages.

There's also a security issue involved with that. If visitors can add elements such as <script> and <meta>, or insert CSS styles for positioning or coloring text, this may have unwanted side-effects and even pose threats to browser and user security. See http://www.cert.org/advisories/CA-2000-02.html for additional information on the types of risks this may pose.

However, there are a number of administrative pages where only selected people have edit passwords. It's entirely reasonable to unlock as much power as can be trusted to them, and as much complexity as they are willing to handle.

So for those cases where this is warranted, here is how to do it:

Install enablehtml.php into your cookbook directory.

Edit your local/config.php and add the following line:

  include_once("$FarmD/cookbook/enablehtml.php");

This will give you a new function EnableHtml($tag) for your config.php that will make PmWiki allow through any HTML tags that are listed in $tag. Examples:

  
  EnableHtml('img');
  EnableHtml('b|i|u|sup|sub|a|iframe|small');
  

You can also allow HTML comment tags via

  
  EnableHtml('!');
  

Here's the code for it if for any reason the download link does not work. (Just put it in your cookbook/ directory as a text file of your own, naming the file enablehtml.php, and carry on with the instructions above):

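<?php if (!defined('PmWiki')) exit();
// Let through the HTML tags matched by the regular expression $tag.
// Note: the "e" pattern modifier used below was removed in PHP 7.
function EnableHtml($tag) {
  Markup(
    "html-$tag",   // one rule per call, so several calls add up
    '>{$var}',     // run after the {$var} rule
    // Matches &lt;tag ...&gt; and &lt;/tag&gt;; quoted attribute values may contain &gt;.
    '/&lt;(\/?('.$tag.')(?![a-z!])(([\'"]).*?\4|.*?)*?)&gt;/ie',
    'Keep(PSS(\'<$1>\'))');
}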
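
A minimal sketch of the same rule without the /e modifier, for PHP versions that no longer support it. It assumes a PmWiki release in which Markup() accepts the name of a callback function as its fourth argument and passes the match array to it; the callback name EnableHtmlKeep is made up for this example. Treat it as a starting point, not as the recipe's official code.

```php
<?php if (!defined('PmWiki')) exit();
// Sketch only: callback-based variant of EnableHtml().
// Assumes Markup() accepts a callback function name as its replacement.
function EnableHtml($tag) {
  Markup(
    "html-$tag",
    '>{$var}',
    // Same pattern as the original rule, minus the "e" modifier.
    '/&lt;(\/?('.$tag.')(?![a-z!])(([\'"]).*?\4|.*?)*?)&gt;/i',
    'EnableHtmlKeep');   // hypothetical callback name
}

// $m[1] holds everything between &lt; and &gt;.
// Keep() shields the rebuilt tag from later markup rules.
function EnableHtmlKeep($m) {
  return Keep('<' . $m[1] . '>');
}
```

With a callback the matched text arrives unescaped, so the PSS() call of the original is not needed here.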

HTH - it's the best I can do -- TeganDowling

As an option, here's some text combining info from two other pages. I like it because you can enable it sitewide, and anyone who is not an admin will have their html disabled when they save. Easy to use! Caveman

Markup('html', 'fulltext', '/\\(:html:\\)(.*?)\\(:htmlend:\\)/esi',
  "'<:block>'.Keep(str_replace(array('&gt;','&lt;','&amp;'), array('>','<','&'), PSS('$1')))");

array_unshift($EditFunctions, 'MaybeDisableHtml');
function MaybeDisableHtml($pagename,&$page,&$new)
{ global $ROSPatterns;  // without this, the lines below only set a local variable
  if (!CondAuth($pagename,"admin"))
  { $ROSPatterns["/\\(:html:\\)/i"] = "[:html:]";
    $ROSPatterns["/\\(:htmlend:\\)/i"] = "[:htmlend:]";
  }
}

Sorry, but this doesn't work! cg 2008-08-28

It works for me, but I use the following:

Markup('html', 'fulltext', '/\\(:html:\\)(.*?)\\(:htmlend:\\)/esi',
  "'<:block>'.Keep(str_replace(array('&gt;','&lt;','&amp;'), array('>','<','&'), PSS('$1')))");

if (!CondAuth($pagename,"admin"))
  { $ROSPatterns["/\\(:html:\\)/i"] = "[:html:]";
    $ROSPatterns["/\\(:htmlend:\\)/i"] = "[:htmlend:]";
  }

Which version you use depends on where you put the condition in your local/config.php or local/farmconfig.php file.

See this mailing list discussion for more details.

Matt L? August 31, 2008, at 09:27 AM

Details

This does not validate element attributes.

The style attribute is particularly dangerous, because it allows positioning an HTML element anywhere on the screen; a fraudulent visitor could (with some effort) write HTML that overlays all the navigational elements of the wiki. Later visitors would then see an unchanged page where every link leads to a place that the malicious visitor has determined, instead of triggering the functions that your wiki normally offers.

It also doesn't validate whether opening and closing tags are properly nested. If the page contains invalid HTML markup, it will simply be passed through unchanged.
This offers an inroad to another kind of abuse: if the <div> tag is allowed, a vandal could add a "sufficient" number of </div> tags and disturb the display of all elements that come after the wiki text (this will affect the footer, and the wikitext itself if the vandal edits the side bar).

(In case I haven't said this already, this means: Don't allow this on pages that aren't password-protected!)

The parameter of EnableHtml is actually a regular expression that describes the HTML tags that you wish to match. If you don't know regular expressions, here's a short list of things that you can do:

  • Say EnableHtml('b|i|u|sup|sub'); to match any of the b, i, u, sup and sub tags.
  • Say EnableHtml('script'); to allow editors to insert JavaScript on the page. Note that this is dangerous: many browsers have security issues with JavaScript, and visitors may be tempted to leave malware on your page (in the worst case, this may lead to legal liabilities for the site administrator if he cannot say who placed the malware on his site). In other words, this is a solution for the case where PmWiki isn't used as a collaboration platform, but as a publishing tool with a small group of known and trusted editors, or if you have a registration process and can find out who changed what.
  • Say EnableHtml('[a-z!]+'); to match all HTML tags (including comments). This opens the road for online HTML editors like FCKeditor. The security issues are the same as when enabling <script> tags.
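
To see what a given tag expression lets through, the recipe's pattern can be tried outside PmWiki. The sketch below rebuilds that pattern without the /e modifier and tests a few escaped tags against it. The helper names enable_html_pattern() and tag_allowed() are invented for this illustration and are not part of the recipe.

```php
<?php
// Build the pattern EnableHtml() uses for $tag (without the "e" modifier).
function enable_html_pattern($tag) {
  return '/&lt;(\/?(' . $tag . ')(?![a-z!])(([\'"]).*?\4|.*?)*?)&gt;/i';
}

// True if the escaped tag in $text would be passed through for $tag.
function tag_allowed($tag, $text) {
  return preg_match(enable_html_pattern($tag), $text) === 1;
}

$list = 'b|i|u|sup|sub';
var_dump(tag_allowed($list, '&lt;b&gt;'));      // a listed tag
var_dump(tag_allowed($list, '&lt;/sup&gt;'));   // closing tag of a listed tag
var_dump(tag_allowed($list, '&lt;big&gt;'));    // not listed, although it starts with "b"
var_dump(tag_allowed('[a-z!]+', '&lt;!-- note --&gt;'));  // an HTML comment
```

The third call is rejected because the pattern requires that no further letter follows the tag name, so 'b' does not accidentally enable every tag that begins with b.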

If EnableHtml enables an HTML tag <foo>, it will also automatically enable the corresponding ending tag </foo>. There is no need to write something like

  EnableHtml('b|/b');

(in fact that would not work, because an unescaped / ends the regular expression that EnableHtml builds; and even if you escaped it, this would get you a pass-through for <b>, </b>, and <//b>, which isn't particularly useful).

EnableHtml will copy through any attributes that it finds within a tag. So EnableHtml('form') will also allow through

  <form action="http://domain.tld/path/to/cgi?query=value" method="POST">

EnableHtml tries to be smart about attribute values. It will correctly handle cases like

  <input title="Tell me more about the < b > tag in HTML">

and not replace that < b > with anything else, or take the > in < b > as the end of the <input> tag.
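
This string handling can be tried outside PmWiki. The sketch below applies the recipe's pattern with preg_replace_callback() as a stand-in for PmWiki's markup engine (no Keep(), no other rules); pass_through() is an invented name. It assumes, as the pattern itself does, that the page text has < and > escaped as &lt; and &gt; while quotes are stored unchanged.

```php
<?php
// Stand-in for the EnableHtml rule: turn &lt;tag ...&gt; back into <tag ...>
// for tags matching $tag. Not PmWiki code; for experimenting only.
function pass_through($tag, $text) {
  $pattern = '/&lt;(\/?(' . $tag . ')(?![a-z!])(([\'"]).*?\4|.*?)*?)&gt;/i';
  return preg_replace_callback($pattern, function ($m) {
    return '<' . $m[1] . '>';
  }, $text);
}

$src = '&lt;input title="Tell me more about the &lt; b &gt; tag in HTML"&gt;';
echo pass_through('input|b', $src), "\n";
```

Only the outer pair of brackets is converted. The escaped < b > inside the title string stays escaped even though b is enabled, because the quoted string is consumed as one unit.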

If EnableHtml finds an HTML tag, only the {$some_variable} markup will be recognised between the angle brackets, nothing else. That is,

  <input title="Tell me ''more'' about you">

will not make PmWiki emit something like

  <input title="Tell me <i>more</i> about you">

EnableHtml does no rigorous validity checking: full checking is beyond the capabilities of a recipe, so it doesn't even try.
As a result, it will accept some forms that aren't valid HTML, such as <form method="Post"/> or </b/>.

To be precise, EnableHtml will accept any sequence of characters that

  • starts with the left angle bracket ("less-than character") < and an optional slash /,
  • continues with a token that matches any pattern ever defined with EnableHtml('...');,
  • continues with an arbitrary sequence of characters, and
  • finishes with an optional slash / and a right angle bracket ("greater-than character") > outside of strings.

See Also

Bugs and Comments

Be VERY careful when trying to use this with pages that already have hanging indents -- like PmWiki / BasicEditing. Hanging indents use -< as their indicator. If the next token looks like an html tag you want (eg. the phrase "A reverse arrow" will look like an anchor tag (<a>) ), then you may find pmwiki.php going into an infinite loop. --oPEO


It doesn't seem to work.

I am not the first one to notice. Here is another unanswered question:

(Jan 3, 2007) I placed the enablehtml.php file into my cookbook directory in the hopes of allowing me to make new pages using HTML. Then I put include_once("$FarmD/cookbook/enablehtml.php"); in my local/config.php. Then beneath the "include_once..." line I put EnableHtml([a-z!]+) so that it enabled all HTML tags but upon doing so it made my page unviewable. Any help? Thanks a lot, Happy New Year.

Like this user, I followed the directions to the letter. I tried many variations, nothing worked. Maybe it worked with previous versions, but not anymore?


I'd like to point out a bug: using block elements may result in invalid XHTML. In HTML it's illegal to have <p><div>foo</p><p>bar</div></p>, but if one were to write:

<div>
foo

bar
</div>

it will come out poorly when PMWiki auto adds in p-tags. :/ --Carl


To create a HTML-only site, you'd have to disable PmWiki's own markup. This isn't advisable for the pages that come with the installation of PmWiki, but it's perfectly possible for the other pages.

Here's a sketch of how to do that (warning: PHP ahead):

In config.php, you'd first check that the page isn't going to be served from wikilib.d (if it's from there, it's the unmodified version from the PmWiki installation page and will most likely contain PmWiki markup, and you don't want to break this). If the page exists in wiki.d, somebody edited or created it, so disabling PmWiki markup is OK; also, if the page is new (neither in wiki.d nor in wikilib.d), you can also disable PmWiki markup.
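
The check described above could look roughly like this. It is a sketch under one assumption: that pages are stored as files named Group.Page directly inside wiki.d and wikilib.d, which may not hold for every installation. The function name may_disable_markup() and the demo page names are invented for this example.

```php
<?php
// Decide whether PmWiki markup may be switched off for a page.
// Assumption: pages are files named "Group.Page" inside wiki.d (edited or
// created locally) and wikilib.d (shipped with PmWiki).
function may_disable_markup($fullname, $wikiDir = 'wiki.d', $libDir = 'wikilib.d') {
  if (file_exists("$wikiDir/$fullname")) return true;   // edited or created locally
  if (file_exists("$libDir/$fullname")) return false;   // untouched distribution page
  return true;                                          // new page
}

// Demo with throwaway directories instead of a real wiki.
$base = sys_get_temp_dir() . '/enablehtml-demo-' . uniqid();
mkdir("$base/wiki.d", 0777, true);
mkdir("$base/wikilib.d", 0777, true);
touch("$base/wiki.d/Main.HomePage");
touch("$base/wikilib.d/PmWiki.BasicEditing");

foreach (array('Main.HomePage', 'PmWiki.BasicEditing', 'Main.NewPage') as $p) {
  $off = may_disable_markup($p, "$base/wiki.d", "$base/wikilib.d");
  echo $p, ': ', ($off ? 'disable markup' : 'keep markup'), "\n";
}
```

A page that exists in both directories counts as edited, since the copy in wiki.d is checked first.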

Disabling the markup can be done in several ways, none of them fully satisfactory:

  1. Call DisableMarkup('rule1', 'rule2', ...) for each rule defined in PmWiki. (The rule names can be picked off the existing Markup(...) calls in PmWiki, or by setting $EnableDiag = 1 in config.php and calling any PmWiki page with action=ruleset.) Do not remove rules with a name that starts with an underscore, and don't remove the 'restore' rule (EnableHTML needs it).
    The downside here is that you'll have to review the list of rules with every PmWiki update, and that list is quite long.
  2. Get array_keys($MarkupRules);, then walk that array and call DisableMarkup(...); for each entry in the keys array that doesn't start with an underscore and isn't 'restore'. (Do not walk the $MarkupRules array directly: walking an array while deleting entries from it will most likely lead to nasty bugs.)
    This will break if PmWiki ever changes the way that markup rules are stored (not a very likely event, but anyway).
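
The filtering step common to both approaches can be sketched in plain PHP. The function name rules_to_disable() and the sample list are made up for illustration; how the real list of rule names is obtained depends on PmWiki internals and is not shown here.

```php
<?php
// Pick the rule names that would be handed to DisableMarkup():
// everything except names starting with "_" and the 'restore' rule.
function rules_to_disable(array $names) {
  $out = array();
  foreach ($names as $name) {
    if ($name === 'restore') continue;          // EnableHTML needs this rule
    if (substr($name, 0, 1) === '_') continue;  // underscore names are left alone
    $out[] = $name;
  }
  return $out;
}

// Made-up sample; a real list would come from ?action=ruleset.
$sample = array('_begin', "'''", "''", 'restore', '_end');
print_r(rules_to_disable($sample));
// In config.php, each remaining name would then be passed to DisableMarkup().
```

Because the function builds a new array instead of deleting entries while iterating, it avoids the walk-and-delete problem mentioned in item 2.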

Joachim Durchholz April 20, 2005, at 11:42 AM


I'd like to have not just tags but also attributes filtered. That way, EnableHTML could be used to allow harmless HTML like < b > without (at the same time) doing potentially dangerous stuff like < b style="some nasty CSS here" >.

It would be even better to filter attributes and CSS settings, so that harmless CSS styles could be done but dangerous ones (such as margin with negative values and similar stuff that may break the layout) would be filtered.

All of this is technically a bit difficult. Any serious effort at this should also consider the %...% markup that allows setting CSS styles directly: either it filters out dangerous CSS (in which case EnableHTML should reuse that code), or it doesn't (in which case a solution for both should be sought).

Joachim Durchholz April 20, 2005, at 11:42 AM


I'd like to add the code for a webring to my page; it has to be in HTML. How do I do that?

Use CustomMarkup. There's a specific example on that page. -- Susan

I'm having trouble using CustomMarkup. I want to be able to include a Statcounter.com javascript in my pages but when I try to use the Markup() code in stdmarkup.php (or config.php), it returns errors when I go to my wiki(approve links) (even using the example script) -Craig


You can enable some html tags without any attributes. Put the following into local/config.php

$AllowedHtmlTags = 'b|i';
Markup("html-tags",
  '>{$var}',
  '/&lt;('.$AllowedHtmlTags.')&gt;(.*?)&lt;\/('.$AllowedHtmlTags.')&gt;/',
  '<$1>$2</$3>');
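
What this rule does and does not let through can be tried outside PmWiki with plain preg_replace(), since its replacement is an ordinary string. The sketch below reuses the pattern and replacement from the rule above; the wrapper allow_plain_tags() is an invented name.

```php
<?php
// Same pattern and replacement as the rule above, applied directly.
$AllowedHtmlTags = 'b|i';
$pattern = '/&lt;(' . $AllowedHtmlTags . ')&gt;(.*?)&lt;\/(' . $AllowedHtmlTags . ')&gt;/';

function allow_plain_tags($pattern, $text) {
  return preg_replace($pattern, '<$1>$2</$3>', $text);
}

// A bare tag pair is converted.
echo allow_plain_tags($pattern, '&lt;b&gt;bold&lt;/b&gt;'), "\n";
// A tag with an attribute does not match and stays escaped.
echo allow_plain_tags($pattern, '&lt;b style="color:red"&gt;bold&lt;/b&gt;'), "\n";
```

Note that the pattern does not require the closing tag to be the same as the opening one, so an escaped <b>...</i> pair is converted as well. The match is also case-sensitive and does not span line breaks.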

Here's an alternative method.

If you place this in your configuration file

Markup(
  'html',
  'fulltext',
  '/\\(:html:\\)(.*?)\\(:htmlend:\\)/esi',
  "'<:block>'.Keep(str_replace(array('&gt;', '&lt;', '&amp;'),
  array('>', '<', '&'), PSS('$1')))");

you'll be able to use (:html:) and (:htmlend:) directives to insert HTML markup in your wiki pages. If you don't want your HTML source to be inside a <p>paragraph</p> then the (:html:) directive should be at the beginning of a line, immediately preceding your raw HTML source. The (:htmlend:) directive belongs at the end of the last line of your HTML source. For example, this wiki source

Line above.
(:html:)<h2>Hi!</h2>
Some text.(:htmlend:)
Line below.

results in this output

<p>Line above.
</p><h2>Hi!</h2>
Some text.
<p>Line below.
</p>

You can also insert HTML code inline within a paragraph. For example, this wiki source

Text...(:html:)<span class='foo'>Hi!</span>(:htmlend:)...text.

results in this output

<p>Text...<span class='foo'>Hi!</span>...text.
</p>
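
The text transformation behind the directive can be reproduced outside PmWiki. This is a sketch, not the rule itself: Keep() and the '<:block>' marker are PmWiki-specific and left out, preg_replace_callback() takes the place of the /e modifier, and expand_html_directive() is an invented name.

```php
<?php
// Stand-in for the (:html:) rule: decode the escaped text between the two
// directives and drop the directives themselves.
function expand_html_directive($text) {
  return preg_replace_callback(
    '/\\(:html:\\)(.*?)\\(:htmlend:\\)/si',
    function ($m) {
      // &amp; is decoded last so that "&amp;lt;" ends up as "&lt;", not "<".
      return str_replace(array('&gt;', '&lt;', '&amp;'), array('>', '<', '&'), $m[1]);
    },
    $text);
}

echo expand_html_directive(
  "Text...(:html:)&lt;span class='foo'&gt;Hi!&lt;/span&gt;(:htmlend:)...text."), "\n";
```

Text outside the two directives is left as it is, so escaped brackets elsewhere on the page stay escaped.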

Authors will be able to place any conceivable malicious code between (:html:) and (:htmlend:), so be sure to take appropriate precaution...

--Hagan (Thank you Hagan!!)

... and here's a good precaution to take: In your configuration file, put

array_unshift($EditFunctions, 'MaybeDisableEmbedhtml');
function MaybeDisableEmbedhtml($pagename,&$page,&$new)
{ global $ROSPatterns;  // without this, the lines below only set a local variable
  if (!CondAuth($pagename,"admin"))
  { $ROSPatterns["/\\(:html:\\)/i"] = "[:html:]";
    $ROSPatterns["/\\(:htmlend:\\)/i"] = "[:htmlend:]";
  }
}

How this works: if someone is editing a page who doesn't have admin privileges, then it will strip all (:html:) and (:htmlend:) tags from their text, replacing them with square-bracket versions that do nothing. --Lucian Wischik, 23 November 2006, based on suggestions by PM
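
The effect of the two replace-on-save patterns can be illustrated with plain preg_replace(). In this sketch the admin check is reduced to a boolean parameter, and neutralize_html_directives() is an invented name; in PmWiki the replacement is applied to the page text as it is saved.

```php
<?php
// Stand-in for replace-on-save: apply pattern => replacement pairs to the
// text being saved, unless the author is an admin.
function neutralize_html_directives($text, $isAdmin) {
  if ($isAdmin) return $text;   // admins keep their directives
  $patterns = array(
    "/\\(:html:\\)/i"    => "[:html:]",
    "/\\(:htmlend:\\)/i" => "[:htmlend:]",
  );
  return preg_replace(array_keys($patterns), array_values($patterns), $text);
}

echo neutralize_html_directives('(:html:)<b>x</b>(:htmlend:)', false), "\n";
echo neutralize_html_directives('(:html:)<b>x</b>(:htmlend:)', true), "\n";
```

The HTML between the directives is not removed. It simply stays inert, because the square-bracket forms are not recognised as markup.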

(The above solution also answers the following question:) "I'd like to restrict HTML tags to write-protected pages. How can I do that?"
The idea is: My Web site is public. I do not want to allow people adding HTML tags in normal pages. But, if the page is protected by a password (write-protect), I am nearly sure that the page will not be modified by a hacker. And if I want to include something in a lot of pages, I can use the site/group/page header or footer.

Jean-Dom, 27 September 2005.


I'm confused. How do you rig it so that only admins, or certain users, or certain groups, may have permission to insert HTML?

Same here. Can somebody post a decent instruction on how to implement the html and htmlend tags safely? I only see snippets of code above. Where does one put what?

Whoa! Trying to use EnableHtml made everything go blank... Using SimpleTab skin, maybe that can have something to do with it...

-Tryggve

Contributors

  • Pm, 15-Mar-2004, original code
  • Pm, 02-Dec-2004, updated for PmWiki 2
  • Joachim Durchholz 11-Apr-2005, wrapped it up in a downloadable recipe, made it incremental, and made it not match end-of-tag within string-valued tag attributes.
  • Joachim Durchholz, 30-Apr-2005, now also handles HTML comments.
  • Susan, 22-Dec-2005, specific example of WebRing code added (and fixed half the page showing in bold).

Sandbox

Use the space below to experiment with embedded HTML tags.

<b>bold</b> produces <b>bold</b>.

<form action='$ScriptUrl/$[Main/SearchWiki]' style='float: right;
margin-bottom: ; margin-top: -3px;'>
    		<input type='hidden' name='n' value='$[Main/SearchWiki]' />
    <input class='searchbox' type='text' name='q' value='search in
$WikiTitle' />
    <input class='searchbutton' type='submit' value='$[Go]' />

<a hef="http://www.pmwiki.org"
target="_self">www.pmwiki.org</a>

<form action='$ScriptUrl/Main/SearchWiki' style='float: right; margin-bottom: ; margin-top: -3px;'>

    		<input type='hidden' name='n' value='Main/SearchWiki' />
    <input class='searchbox' type='text' name='q' value='search in $WikiTitle' />
    <input class='searchbutton' type='submit' value='Go' />

<a hef="http://www.pmwiki.org" target="_self">www.pmwiki.org</a>

<SCRIPT LANGUAGE="JavaScript"> user = 'name'; site = 'domain.com'; document.write('<a href=\"mailto:' + user + '@' + site + '\">'); document.write(user + '@' + site + '</a>'); </SCRIPT>

Page last modified on November 04, 2009, at 08:05 PM