SaveThePage
Questions answered by this recipe
How can I save an entire web page to my wiki as a new article?
Description
Save an entire web page (the HTML) to a wiki page by using a bookmarklet to send the current page to your wiki for processing and editing.
Installation
Download savethepage.zip to a safe place and unzip it in your wiki root directory. Add the following to local/config.php:
include_once("$FarmD/cookbook/savethepage.php");
Configuration
The following variables can be set before the include statement above to configure the recipe:
$STP_PagePrefix: whatever you'd like to insert before the wiki text. Default is nothing.
$STP_PageSuffix: whatever you'd like to insert afterwards. Default is nothing.
$STP_NewPageNamePrefix: used for the page name if the page you are saving has no <title>.
$STP_PageFmt: the template for the page. Variables that will be inserted are:
  $summary: the page's meta description contents, if any
  $tags: the page's meta keyword contents, if any
  $stp_url: the URL of the source page
  $title: the contents of the <title> tag, if any
  $time: the time the page was grabbed
  $text: the page's HTML converted into PmWiki markup
The default $STP_PageFmt is:
"
(:linebreaks:)
Summary:\$summary
Tags:\$tags
Source:\$stp_url
Title: \$title
Saved:\$time
(:nolinebreaks:)
(:nolinkwikiwords:)
\$text
(:linkwikiwords:)
"
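As a rough illustration of the configuration described above, the fragment below sets a few of the variables before the include statement. It is only a sketch: the prefix, suffix, and page name values are made up for the example, and the shortened template is one possible layout that uses only the variables listed above. It is not the recipe's default.

```php
// Fragment for local/config.php (goes inside the existing <?php block).
// All values below are illustrative assumptions, not recipe defaults.

// Text inserted before and after the converted wiki text.
$STP_PagePrefix = "''Saved with SaveThePage''\n";
$STP_PageSuffix = "\n----\n";

// Used for the page name when the source page has no <title>.
$STP_NewPageNamePrefix = "Saved";

// A shortened template. The \$ escapes keep PHP from expanding the
// placeholders inside the double-quoted string, so the recipe can fill them in.
$STP_PageFmt = "
(:linebreaks:)
Source: \$stp_url
Saved: \$time
(:nolinebreaks:)
(:nolinkwikiwords:)
\$text
(:linkwikiwords:)
";

// The variables must be set before the recipe is included.
include_once("$FarmD/cookbook/savethepage.php");
```

The placeholders are escaped as \$ in the same way as in the default template, since an unescaped $text inside a double-quoted PHP string would be expanded as a PHP variable when config.php is loaded.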
Usage
This recipe relies on a bookmarklet that you can drag onto your bookmark toolbar (or copy and save elsewhere in your bookmark collection). To create it, put the following markup on a page in the group you want to save pages to:
(:savethepage:)
When you've dragged or otherwise saved the bookmark, you can delete the markup.
Notes
This recipe will not work for web pages or sites that require authorization to retrieve the page. It will also not work with pages where the content you want to capture is created via JavaScript.
HTML::WikiConverter
This is a Perl module, available on CPAN, that is used to make the conversion between HTML and PmWiki. I tried using the Cookbook:ConvertHTML recipe, but it was not complete enough and left much of the original HTML intact.
A patch is required to make the PmWiki conversion work correctly in some cases (see Bug 79778):
In HTML::WikiConverter::PmWiki, make the following change:
152c152
< return $node->attr('src') || '';
---
> return ' ' . ($node->attr('src') || '') . ' ' ;
(There may be other problems with the converter as well.)
Change log / Release notes
- 2012-11-23.2b -- Initial Release
- 2012-12-03 -- Fixed a problem where an old version of PHP (5.2.12) did not handle UTF-8 characters properly.
See also
- Cookbook:ConvertHTML (looked at and abandoned for converting HTML)
- Cookbook:AddLinkBookmarklet (inspiration for much of this)
Contributors
Comments
See discussion at SaveThePage-Talk