Your site's sitemap.xml lists all of the published, public pages on your site. It is available at [site.url]/sitemap.xml. For example, the sitemap of this site can be found at https://docs.gossinteractive.com/sitemap.xml
Our sitemaps conform to the standard sitemap protocol expected by search engines like Google (see Google's support page at https://support.google.com/webmasters/answer/183668). They are not styled and appear as an XML document.
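Because the sitemap is plain XML, it can be read with any XML parser. The following is a minimal sketch in Python, not part of the platform. The function names are hypothetical, and the sample document assumes the `urlset` wrapper and namespace defined by the sitemap protocol. The parser matches tags by local name, so it works with or without that namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the "{namespace}" prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_sitemap(xml_text):
    """Return one dict per <url> entry, keyed by child tag name."""
    root = ET.fromstring(xml_text)
    entries = []
    for element in root.iter():
        if local_name(element.tag) != "url":
            continue
        entry = {}
        for child in element:
            entry[local_name(child.tag)] = (child.text or "").strip()
        entries.append(entry)
    return entries


# Sample input: the wrapper element is an assumption based on the protocol.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://docs.gossinteractive.com/article/5773/Resources</loc>
    <lastmod>2019-01-29</lastmod>
  </url>
</urlset>"""

entries = parse_sitemap(SAMPLE)
print(entries)
```

To check a live site, the text passed to `parse_sitemap` would be the response body from [site.url]/sitemap.xml.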
Which Articles?
The sitemap is built from the platform's article search collection. It includes all of the live non-secured articles on your site that have been indexed by the site search. Articles beneath "hidden" articles are included, as are articles that are otherwise hidden from the main content structure and navigation.
To reiterate, the sitemap includes a link to every article unless the article is in a secured area, is excluded from the search with a -10 search boost (see Article Search Optimisation), or has display properties set that would return a "404 page not found" result.
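The inclusion rules above can be summarised as a single check. This is an illustrative sketch only: the `Article` class and all of its field names are hypothetical and do not reflect how the platform stores articles.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # Every field name here is hypothetical, chosen for illustration only.
    live: bool = True
    secured: bool = False
    hidden: bool = False       # hidden from navigation
    search_boost: int = 0
    returns_404: bool = False  # display properties produce a 404 result


def appears_in_sitemap(article):
    """Apply the inclusion rules described above to one article."""
    if not article.live or article.secured:
        return False
    if article.search_boost == -10:
        # A -10 boost removes the article from the search collection.
        return False
    if article.returns_404:
        return False
    # "hidden" is deliberately not checked: hidden articles are included.
    return True
```

For example, `appears_in_sitemap(Article(hidden=True))` is true, while `appears_in_sitemap(Article(search_boost=-10))` is false.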
Excluding Articles
Articles can also be excluded by relating metadata to them. The standard metadata group (found in the goss/sitemap group of the metadata library) includes an "exclude" property with two values.
- "article" excludes the article it is related to
- "article_and_descendants" excludes the article it is related to and all articles beneath it in the content tree. Note that because the sitemap is built from the platform's search collection, this value won't work if the article is itself excluded from your search (for example, it's switched off).
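The difference between the two values can be sketched as a walk over the content tree. The data structures below (a dict of child ids and a dict of exclude values) are assumptions made for this example, and the sketch does not model the caveat about articles that are missing from the search collection.

```python
def excluded_ids(children, exclude_values):
    """Return the ids of all articles removed from the sitemap.

    children: dict mapping an article id to a list of its child ids.
    exclude_values: dict mapping an article id to "article" or
        "article_and_descendants".
    """
    excluded = set()
    for article_id, value in exclude_values.items():
        excluded.add(article_id)
        if value == "article_and_descendants":
            # Walk everything beneath this article in the content tree.
            stack = list(children.get(article_id, []))
            while stack:
                current = stack.pop()
                excluded.add(current)
                stack.extend(children.get(current, []))
    return excluded


# Hypothetical tree: 1 is the root, 2 and 3 are its children, 4 is under 2.
tree = {1: [2, 3], 2: [4], 3: []}
```

With this tree, relating "article" to article 2 excludes only article 2, while relating "article_and_descendants" to it excludes articles 2 and 4.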
Robots Metadata
If you are excluding articles from your sitemap, you might also want to exclude them from search crawlers using "noindex" metadata. See "Adding Additional Properties" in Metadata Properties, Unfurling and Structured Data Markup for how to do that.
XML Tags
The standard sitemap includes the compulsory <loc> tag, together with <lastmod>, for each article:
<url>
<loc>https://docs.gossinteractive.com/article/5773/Resources</loc>
<lastmod>2019-01-29</lastmod>
</url>
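The optional <priority> and <changefreq> tags are added when the corresponding sitemap metadata is related to an article. For example, relating a priority of 0.5 and a change frequency of weekly to an article would change the XML above to: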
<url>
<loc>https://docs.gossinteractive.com/article/5773/Resources</loc>
<lastmod>2019-01-29</lastmod>
<priority>0.5</priority>
<changefreq>weekly</changefreq>
</url>
Whether search engines and crawlers respect these values is outside of our control.
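The shape of a sitemap entry, with and without the optional tags, can be reproduced with a few lines of Python. This is a sketch for illustration, not the platform's generator, and `build_url_element` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def build_url_element(loc, lastmod, priority=None, changefreq=None):
    """Build one <url> entry; optional tags are added only when given."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    ET.SubElement(url, "lastmod").text = lastmod
    if priority is not None:
        ET.SubElement(url, "priority").text = str(priority)
    if changefreq is not None:
        ET.SubElement(url, "changefreq").text = changefreq
    return url


element = build_url_element(
    "https://docs.gossinteractive.com/article/5773/Resources",
    "2019-01-29",
    priority=0.5,
    changefreq="weekly",
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

Calling the helper without `priority` and `changefreq` produces an entry containing only <loc> and <lastmod>.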
How It Works
The sitemap.xml is generated using a SOLR query. The root article of the current subsite is taken as the starting point.
The three metadata properties in the sitemap metadata group have search fields set as:
- SITEMAPXMLPRIORITY
- SITEMAPXMLCHANGEFREQ
- SITEMAPXMLEXCLUDE
When metadata values from these properties are related to an article, the search index entry of that article holds those values in its dynamic "OBJECT_SF_*" fields. It is these search fields that identify the metadata values as "special" and part of the sitemap configuration.
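As a rough illustration of how those search fields could be read back from an index entry, the sketch below pulls the three sitemap settings out of a dictionary standing in for a search document. The exact field key format is an assumption: the code simply joins the "OBJECT_SF_" prefix to each search field name, which may not match the real index, and the sample document is invented.

```python
# Search field names taken from the sitemap metadata group.
SITEMAP_FIELDS = {
    "priority": "SITEMAPXMLPRIORITY",
    "changefreq": "SITEMAPXMLCHANGEFREQ",
    "exclude": "SITEMAPXMLEXCLUDE",
}

# Assumption: the dynamic field key is this prefix plus the search field name.
FIELD_PREFIX = "OBJECT_SF_"


def sitemap_settings(index_doc):
    """Collect any sitemap metadata values held on a search index entry."""
    settings = {}
    for name, search_field in SITEMAP_FIELDS.items():
        value = index_doc.get(FIELD_PREFIX + search_field)
        if value is not None:
            settings[name] = value
    return settings


# Invented index entry for illustration; real entries hold many more fields.
doc = {
    "TITLE": "Resources",
    "OBJECT_SF_SITEMAPXMLPRIORITY": "0.5",
    "OBJECT_SF_SITEMAPXMLCHANGEFREQ": "weekly",
}
print(sitemap_settings(doc))
```

An entry with none of these fields yields an empty result, which corresponds to an article that appears in the sitemap with only the standard tags.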
The
Example Metadata
The standard sitemap metadata should already be present in your iCM metadata library (found in the goss/sitemap group). If it isn't, you can download and import it (ZIP) [1KB].