GoogleSitemaps-Talk
This space is for User-contributed commentary and notes. Please include your name and a date along with your comment.
Comments
Here is a new version that includes changefreq and a reasonably good priority rater/grader based on four aspects of a page. The grades can be changed (you will have to edit inside the functions, so be careful), and there is also a priority bypass where you can manually set the priority for the pages you want.
I am so used to thinking I'll have lots of work to create something I need that I started to think the same while using PmWiki, BUT that is not the truth. I did something like this for my Google sitemapindex and sitemap. The sitemap maps the pages of one group, while the sitemapindex maps the groups onto one page (RecentChanges). With the sitemapindex you get a list of groups, each with 'action=sitemap' attached to it, so you can fetch all the pages inside a group as a new feed.
To fetch the results for the sitemapindex without using a trail, do this:
http://your-site/?group=*&name=RecentChanges&action=sitemapindex
Oh joy !
Here is the snippet:
---8x---
## Examples taken from blogger sitemap structure
# you can configure it further with pmwiki feed features
# like : group, name, list, count ...

## Sitemapindex 0.9 settings for ?action=sitemapindex
SDVA($FeedFmt['sitemapindex']['feed'], array(
  '_header' => 'Content-type: text/xml; charset="$Charset"',
  '_start' => "<?xml version='1.0' encoding='UTF-8'?>\n".
              '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'."\n",
  '_end' => "\n</sitemapindex>\n",
));
SDVA($FeedFmt['sitemapindex']['item'], array(
  '_start' => "<sitemap>\n",
  '_end' => "</sitemap>\n",
  'loc' => ($EnablePathInfo == 1)
    ? '{$ScriptUrl}/?group={$Group}&action=sitemap'
    : '{$ScriptUrl}?group={$Group}&action=sitemap'
));

## Sitemap 0.9 settings for ?action=sitemap
SDVA($FeedFmt['sitemap']['feed'], array(
  '_header' => 'Content-type: text/xml; charset="$Charset"',
  '_start' => "<?xml version='1.0' encoding='UTF-8'?>\n".
              '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'."\n",
  '_end' => "</urlset>\n",
));
SDVA($FeedFmt['sitemap']['item'], array(
  '_start' => "<url>\n",
  '_end' => "</url>\n",
  'loc' => '{$PageUrl}',
  'lastmod' => '$ItemISOTime',
));
---8x---
Simple...
CarlosAB April 30, 2018, at 07:16 PM
- Other site maps exist, like the extension module for Firefox, which offers the opportunity to have a navigation bar… Shouldn't your module be called GoogleSiteMap.php, with action=googlesitemap?
- The encoding is UTF-8, while pmwiki.php uses ISO-8859-1. For "é" I get %e9, which is, I think, ISO and not UTF-8. You should probably transcode the characters (see the sketch after this list).
- The modification time is not reported yet. Have you tested with the full time format (for me, with +02:00)?
- For the priority, maybe we could increase it for the home page, and reduce it for the Recent Changes ones.
- Note that the priority and the change frequency are not mandatory. If the priority is always the same, I suggest not writing it in the file.
- Could the priority be set by a page text variable? That is, pages without the variable would have a default priority, but page authors could mark pages higher or lower by setting the PTV.
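For the transcoding point above, a minimal sketch, assuming the page text arrives as ISO-8859-1 and PHP's iconv extension is available (untested):
# hypothetical helper: convert ISO-8859-1 text to UTF-8 before it is
# written into the UTF-8 sitemap output
function SitemapToUtf8($text) {
  return iconv('ISO-8859-1', 'UTF-8', $text);
}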
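And a minimal sketch of the PTV idea in the last bullet, assuming a hypothetical SitemapPriority page text variable read with PmWiki's PageVar(); untested:
# pages may set (:SitemapPriority: 0.8:); anything absent or out of
# range falls back to the default of 0.5
function SitemapPagePriority($pagename) {
  $p = PageVar($pagename, '$:SitemapPriority');  # read the PTV, if any
  if (!is_numeric($p) || $p < 0 || $p > 1) return '0.5';
  return $p;
}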
Ben Stallings December 11, 2007, at 03:02 PM
- I'm using the clean URLs recipe and have encountered a problem.
?action=sitemap works as it is supposed to (pages are displayed as http://wiki.spounison.org/Main/Homepage); but in sitemap.xml.gz the page URLs are displayed as ?n=*** (e.g. http://wiki.spounison.org/pmwiki.php?n=Main.Homepage). That's my problem...
ArSoron March 05, 2008, at 03:14 AM
$ScriptUrl = 'http://wiki.spounison.org/';
Umang
- I had a problem: a 403 Forbidden error due to modifications since 2.1.beta8 (see ControllingWebRobots). $RobotActions has to be completed with action=sitemap to make pmwiki.php?n=Site.AllRecentChanges&action=sitemap accepted by Google. Ref. in robots.php: SDVA($RobotActions, array('browse' => 1, 'rss' => 1, 'dc' => 1));
Damien July 08, 2008, at 05:03 AM
In pmwiki/scripts/robots.php, change
from
SDVA($RobotActions, array('browse' => 1, 'rss' => 1, 'dc' => 1));
to
SDVA($RobotActions, array('browse' => 1, 'rss' => 1, 'dc' => 1, 'sitemap'=>1));
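Alternatively, a sketch of a less invasive route (assuming robots.php still sets these defaults with SDVA(), so a value defined earlier wins): add the flag in local/config.php instead of editing the core script:
# in local/config.php, before scripts/robots.php is loaded:
$RobotActions['sitemap'] = 1;  # allow robots to use ?action=sitemap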
Umang
Thank you so much for this. Trying to get an accurate sitemap for the PmWiki part of my site using Google's sitemap generator has been driving me up the freaking wall. Bing doesn't have a problem correctly indexing PmWiki with clean URLs, but no matter what I do, Google refuses to recognize the last modified date and is seemingly random about indexing the PmWiki pages or the old html pages that now redirect to the wiki.
Sitemaps are not required to be in the webroot directory. You can specify where a sitemap is located either with a sitemap index (if you have more than one sitemap) or when you submit the sitemap.
Don't forget to put a sitemap entry in your robots.txt file. It should be the very first entry and look like this:
User-agent: *
Sitemap: [=http://www.yoursite.com/sitemap.xml=]
Google only recently explained that the higher the number, the greater the priority. If you have a few key pages you'd like to prioritize (or you can even use Google's software to figure that out, just written to another file), you can manually tweak a few entries. I have few enough entries that I create the file as plain .xml instead of .xml.gz, so it's not that big a deal for me.
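For instance, a hand-tweaked entry in the generated sitemap.xml could look like this (the URL and priority value are only illustrative):
<url>
  <loc>[=http://www.yoursite.com/Main/HomePage=]</loc>
  <priority>0.9</priority>
</url>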
Again - thank you so much.
Jerod Poore Crazy Meds 25 September 2011
Older Comments
- Can the script be made to exclude password protected groups?
- For the frequency, I think you should write at least "hourly" for Recent Changes (Group or Main).
1] It's not clear how to generate the .gz sitemap. I have set $SitemapDelay=0, made a wiki edit, and still I don't see the file. The XML is shown in the browser correctly. I temporarily set the pmwiki directory to be all-writable, with no success. (ref http://www.myurl.com/?action=sitemap). DaveG
2] Same issues as DaveG here; ?action=sitemap returns working XML, but I'm struggling to find out how to generate the .xml.gz file. Gilrim
Here's my hack: adding a script on a Linux or OS X system as a (daily? hourly?) cronjob. Say I make a bash script called "makesitemap" for each wiki on my system and put it in the webroot for the site.
#!/bin/bash
curl -o sitemap.xml http://www.myurl.org/index.php?action=sitemap
rm sitemap.xml.gz
gzip sitemap.xml
chmod 644 sitemap.xml.gz
I had to remove the old sitemap first, or the gzip command asks for overwrite confirmation.
Now I just need a cronjob to run it. Most advanced cPanel type webhosts give you a user crontab. No, this won't work for everyone, but people worried about Google sitemaps are already getting a bit advanced :) XES
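For example, a hypothetical user crontab entry (the path and schedule are assumptions), running the script nightly from the webroot:
# regenerate the sitemap every night at 03:15
15 3 * * * cd /home/user/public_html && ./makesitemap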
Okay ...I can't run bash on my server, so I figured there had to be a way of doing the same thing as above with PHP ...so I gleaned the net and came up with the following by hacking other people's code ...because I am not a programmer by any means... ARNOLD
<?php
$url = "http://www.myurl.org/index.php?action=sitemap";
$file = "sitemap.xml";

$ch = curl_init($url);
$fp = fopen($file, "w") or
  die("Unable to open $file for writing.\n");

curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

if (!curl_exec($ch)) {
  print("Unable to fetch $url.\n");
}

curl_close($ch);
fclose($fp);

function compress($srcName, $dstName)
{
  $fp = fopen($srcName, "r");
  $data = fread($fp, filesize($srcName));
  fclose($fp);

  $zp = gzopen($dstName, "w9");
  gzwrite($zp, $data);
  gzclose($zp);
}

// Compress the fetched sitemap
compress("sitemap.xml", "sitemap.xml.gz");
I simply added this to my .htaccess because I've disallowed *action in robots.txt:
RewriteRule ^sitemap\.xml$ ?action=sitemap [L]
Umang
Talk page for the GoogleSitemaps recipe (users?).