XML Sitemap: what it is and what it is for

An XML (Extensible Markup Language) Sitemap is a text file that lists the URLs of a website. It may include extra information (metadata) about each URL: when it was last updated, how important it is, and whether other versions of the URL exist for other languages.

All of this is done to help search engines crawl a website more efficiently by allowing any changes to be passed on to them directly, including when a new page is added or an old one is removed.

A very important thing to remember is that there is no guarantee that an XML Sitemap will cause pages to be crawled and indexed by search engines.

This means that a sitemap increases the chances of indexing, particularly if the navigation or the internal linking strategy does not connect all the pages, but it is not certain that search engines will follow the hints written in it.

Conversely, a website may have no sitemap in its root directory at all and rely exclusively on the crawler to discover its pages.

The Topic Of This Post

  • 1 What is an XML Sitemap and why it matters
  • 2 Why you need an XML sitemap
  • 3 XML Sitemap Example
  • 4 How to create a sitemap.xml
  • 5 Sitemaps in Google Search Console
  • 6 How to find errors in a sitemap
  • 7 XML Sitemap Size Limits
  • 8 What are video sitemaps
  • 9 What are HTML Sitemaps
  • 10 Conclusion

What is an XML Sitemap and why it matters

First, let’s look at what XML and sitemaps are, and why they matter when you use SEO to get a website indexed by search engines.

XML is a markup language that defines a syntax for encoding documents that both humans and computers can read.

You can think of an XML file as a kind of text file, but with a data structure that is added via tags.

A sitemap allows search engines like Google to find, crawl, and index the pages of a website, provided you keep the following rules in mind:

  1. Ideally, an XML Sitemap should be placed in the website’s root directory.
  2. Only the canonical version of each page URL should be included, so listed pages should not redirect or return an error status.
  3. The maximum length of a URL is 2,048 characters.
  4. It may seem possible to make search engines think that a page is updated frequently by declaring its changefreq tag as daily, but doing so is not recommended.
  5. If the frequency and priority tags don’t reflect reality, search engine crawlers are likely to ignore them.
  6. All URLs in the sitemap must come from the same host.
  7. You need multiple XML Sitemaps if a site has more than 50,000 URLs or if a single Sitemap exceeds 50MB uncompressed (the limit was 10MB in earlier versions of the protocol). In that case, split the URLs across several sitemaps.
  8. You can reduce the bandwidth requirement by compressing the sitemap file with gzip, but the file must still not exceed 50MB once it is unzipped.
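
The last two rules can be checked with a few lines of code. The following is a minimal Python sketch, not a definitive tool: the function name check_and_compress is hypothetical, the URL count is a crude count of opening url tags, and the default size limit assumes the 50MB uncompressed limit of the current sitemaps.org protocol (pass a different max_bytes if another limit applies to you).

```python
import gzip

MAX_URLS = 50_000               # URL limit per sitemap file
MAX_BYTES = 50 * 1024 * 1024    # assumed uncompressed size limit (50MB)


def check_and_compress(xml_bytes, max_urls=MAX_URLS, max_bytes=MAX_BYTES):
    """Check an uncompressed sitemap against both limits, then gzip it.

    The limits apply to the uncompressed file, so they are checked
    before compression. Raises ValueError if a limit is exceeded.
    """
    url_count = xml_bytes.count(b"<url>")  # crude count of <url> entries
    if url_count > max_urls:
        raise ValueError(f"{url_count} URLs exceeds the limit of {max_urls}")
    if len(xml_bytes) > max_bytes:
        raise ValueError(f"{len(xml_bytes)} bytes exceeds the limit of {max_bytes}")
    return gzip.compress(xml_bytes)
```

The returned bytes can be written to a file such as sitemap.xml.gz; decompressing them gives back the original sitemap unchanged.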

Why do you need an XML sitemap?

The main reason you should create and publish your XML sitemap in the main root of your site is indexing. While search engines can still technically find pages without a sitemap, adding a sitemap makes it much easier for them to crawl.

You may have orphaned pages (pages that have been left out of your internal linking) or pages that are hard to find. Your sitemap is especially important when you add pages or launch a whole new site that doesn’t have many, if any, links yet.

Sitemaps also help search engines crawl your pages more intelligently by considering the tags, adjusting their crawl rate accordingly, and allowing search spiders to be more proactive in visiting your pages.

Increasing a page’s priority level can make it more likely that the page will be crawled and indexed more frequently, and before other, less important parts of the site, although search engines treat the value only as a hint.

If you have a geo-targeted international site, or a site that has the same page translated into multiple languages, you can use your XML sitemap to your advantage: putting hreflang tags in your sitemap tells crawlers that you have multiple versions of the same page.

Search engines can use this information to ensure they are presenting the right version to users based on language and/or location.
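
To make the hreflang idea concrete, here is a hedged Python sketch that writes one url entry per language version, each listing all the alternates as xhtml:link elements. The function name and the example URLs are hypothetical; the two namespaces are the sitemap protocol namespace and the XHTML namespace used for alternate links.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(versions):
    """Build a sitemap for one page that exists in several languages.

    versions maps a language code (e.g. "en") to the URL of that version.
    Every <url> entry lists all versions, including itself.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url in versions.values():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for code, href in versions.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                rel="alternate",
                hreflang=code,
                href=href,
            )
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical English and Italian versions of the same page.
    print(build_hreflang_sitemap({
        "en": "https://example.com/en/page/",
        "it": "https://example.com/it/pagina/",
    }))
```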

XML Sitemap example

Let’s now take a look at a real XML sitemap example to parse.

Our site is built using WordPress, and the sitemap.xml is generated using the Rankmath.com plugin.

Within this XML file, you find the main index file which hosts all the sitemaps for the pages and posts.

There are many WordPress plugins for creating sitemaps, and you may often see a few other columns of data in the sitemap as well:

  • lastmod is the date on which the page was last modified.
  • changefreq represents how often the page is expected to be updated with new content. This value gives search engines an estimate of how often they should crawl the page (although it doesn’t mean the estimate will be followed).
  • urlset xmlns encapsulates the URL entries of the file and references the version of the sitemap protocol in use.
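
As an illustration of the tags described above, the following Python sketch builds a small sitemap containing loc, lastmod, and changefreq for each page. It is not the output of any particular plugin: the function name build_sitemap, the URLs, and the dates are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap from (url, last_modified, changefreq) tuples.

    last_modified is a datetime.date; it is written in YYYY-MM-DD form.
    """
    ET.register_namespace("", SITEMAP_NS)
    # The urlset element wraps all entries and carries the xmlns attribute.
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, last_modified, changefreq in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = last_modified.isoformat()
        ET.SubElement(url, f"{{{SITEMAP_NS}}}changefreq").text = changefreq
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        ("https://example.com/", date(2024, 1, 15), "weekly"),
        ("https://example.com/blog/", date(2024, 2, 1), "daily"),
    ]))
```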

Depending on your website structure, you may also find separate sitemaps for other sections of the site, such as products, services, and so on.

How to create a sitemap.xml

Sitemaps are normally part of the realm of technical SEO, but you don’t need to be an expert to start creating one.

If you are using a popular CMS (Content Management System) such as WordPress, many plugins create and manage the XML sitemap for you.

Yoast SEO is certainly one of the most used, especially by those who are not SEO savvy, as there is very little configuration and not much to do in terms of managing the sitemap, other than keeping an eye on it from time to time to clear any problems or update the plugin version.

If you are on a custom-built site, you can also use both free and paid online tools that scan the entire site and allow you to export a sitemap.xml file to upload to your domain.

Tools to generate XML sitemaps

  • Screaming Frog SEO Spider and Sitemap generator
  • Marion phpSitemapsNG
  • XML-Sitemaps
  • Perl Sitemap Generator
  • Simple Sitemaps
  • Free Sitemap Generator

CMS plugin to generate XML Sitemaps

  • XML Sitemap – Drupal
  • XML Sitemap – OS Commerce
  • XML Sitemap – WordPress
  • XML Sitemap – Joomla

Sitemap schemes and validation tools

The XML Schemas (XSD) for Sitemap 0.9 and the supported Sitemap Extensions define the elements and attributes that need to be included in your XML Sitemaps. The schemas (depending on the sitemaps, the sitemap index files, and the different file types supported by the sitemaps) are as follows:

  • For sitemaps
  • For Sitemap Index files
  • For videos
  • For images
  • For mobile devices
  • For news

Once you’ve created your sitemaps with all the right elements and attributes, validate them with one of the following tools:

  • XML sitemap
  • XChecker
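
Before reaching for a full validator, a few basic checks can be scripted. The sketch below is only a rough pre-check written in Python, not schema (XSD) validation: it confirms that the file is well-formed XML, that the root element is urlset in the sitemap namespace, and that every url entry has a non-empty loc. The function name basic_sitemap_problems is hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def basic_sitemap_problems(xml_bytes):
    """Return a list of human-readable problems found in a sitemap.

    An empty list means these basic checks passed; it does not prove
    that the file is valid against the official schema.
    """
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append("root element is not <urlset> in the sitemap namespace")
    for number, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<url> entry {number} has no <loc>")
    return problems
```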

Sitemap in Google Search Console

The Sitemap section within the Google Search Console allows you to track all your sitemaps from one place, providing a summary of all the maps that have been submitted through the account.

This includes a snapshot of data, including the type of sitemap, the dates it was most recently processed, any issues that were identified, and the number of pages submitted/indexed per individual sitemap and as a whole.

To test your sitemap before submitting it to the Google Search Console, click the red Add / Test Sitemap button on the right, then enter the URL of the sitemap you would like to test, verify its validity, and then submit it to Google for crawling.

You can submit your sitemap using Google Search Console and check how many of your submitted pages have been indexed by Google.

NOTE: This feature is also used to submit your sitemap to Google. Another method you should use to tell search engines about your sitemap is to add the sitemap URL inside your robots.txt file with the following wording:

Sitemap: http://example.com/sitemap.xml
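
To confirm that a robots.txt file really declares your sitemap, you can read it the way a crawler would. This is a small sketch using Python’s standard urllib.robotparser module (its site_maps() method needs Python 3.8 or later); the function name sitemaps_from_robots and the sample file content are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Disallow: /private/\n"
        "Sitemap: http://example.com/sitemap.xml\n"
    )
    print(sitemaps_from_robots(sample))
```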

How to find errors in a sitemap

As we have already said, it may not be possible to get an entire site indexed, even with a perfect sitemap.

It is possible, however, to discover your site’s indexing issues while having a flawless sitemap.

To do this, analyze any errors in the sitemap section in both the Google Search Console and Bing Webmaster Tools, checking which pages are indexed against the URLs you submitted and if there is a large difference in this ratio or a sudden increase or decrease in these numbers.

These two reports can reveal other problems on your site, such as problems with the robots.txt file, duplicate content, and so on.

Many tools (such as Screaming Frog) can import and scan all the pages referenced in your sitemaps, allowing you to easily spot any problems or unnecessary redirects.
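
The kind of scan these tools perform can be approximated in a few lines. The following Python sketch, which is not how any particular tool works, reads every loc in a sitemap and reports the URLs that do not answer with status 200, so that redirects (3xx) and errors (4xx, 5xx) stand out. The names http_status and urls_needing_attention are hypothetical, and the status lookup is passed in as a parameter so it can be swapped or faked.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Do not follow redirects, so a 3xx answer is reported as such."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def http_status(url):
    """Return the HTTP status code of a HEAD request to url."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code


def urls_needing_attention(sitemap_xml, get_status=http_status):
    """Map each sitemap URL that does not return 200 to its status code."""
    root = ET.fromstring(sitemap_xml)
    report = {}
    for loc in root.iter(f"{{{SITEMAP_NS}}}loc"):
        url = (loc.text or "").strip()
        status = get_status(url)
        if status != 200:
            report[url] = status
    return report
```

Every URL that appears in the report is a candidate for removal from the sitemap or for replacement with its canonical, non-redirecting address.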

Sitemap content overview

It is also possible to test or resubmit sitemaps to search engines: click on the sitemap in question in Google Search Console or Bing Webmaster Tools, then select the ‘Resubmit Sitemap’ or ‘Test Sitemap’ button.

XML Sitemap Size Limits

XML sitemaps are limited in size, both in the number of URLs you can include and in the size of the file.

A sitemap can have at most 50,000 URL entries and a maximum uncompressed size of 50MB, and image entries are limited to 1,000 images per page.

If you have a very large site that has many pages, images, and/or videos, you will need to create multiple sitemaps.

In this case, you will also need to create a file that lists all of your sitemaps, known as the Sitemap Index File.
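
As a rough sketch of how this might be automated in Python, the code below splits a long list of URLs into groups of at most 50,000 and builds a Sitemap Index File that points to one sitemap per group. The function names and the sitemap file URLs are hypothetical, and this is not the file generated by our own plugin.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive groups of at most size items."""
    return [urls[start:start + size] for start in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build a Sitemap Index File listing the given sitemap file URLs."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # 120,000 hypothetical page URLs need three sitemap files.
    pages = [f"https://example.com/page-{n}" for n in range(120_000)]
    groups = chunk_urls(pages)
    files = [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(groups) + 1)]
    print(build_sitemap_index(files))
```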

Here is the example of our sitemap index file which contains 12 different sitemaps:

What are video sitemaps?

Similar to a standard sitemap, which lists the content on your website and allows crawlers from Google and other search engines to discover it, a video sitemap contains metadata about all the videos posted on your website.

This helps Google display your videos in search results where applicable. Video sitemaps include data about a video’s length, title, and a brief description of the video’s content.

Video content appears more and more often in search results to answer queries.

How to create a video sitemap

You can manually tag your videos and create a sitemap of the videos using a sitemap generator such as xml-sitemaps.com.

Furthermore, you can also use a plugin to create sitemaps depending on your CMS.

If you’re using WordPress, both Yoast and Jetpack are great automated ways to enable video sitemaps on your website.

There are several things you can tell search engines about your page’s video assets in sitemaps:

  • video:player_loc: the URL pointing to the video player. If your video is embedded on your page, for example from YouTube or Vimeo, you can use this tag. Normally you can find this URL in the video embed code.
  • video:duration: the length of the video in seconds, between 0 and 28800 (8 hours). This isn’t technically required, but Google recommends it.
  • video:expiration_date: include this information only if the video will not be available after a certain date. If you use it, put the date in the format YYYY-MM-DD, optionally followed by the time in the format Thh:mm:ss+TZD.
  • video:rating: the rating of the video. Only values between 0.0 and 5.0 are valid.
  • video:view_count: the number of times the video has been viewed.
  • video:publication_date: the date the video was first published, not the date you put it on your site.
  • video:family_friendly: if no, your video will only appear in search results when the user disables SafeSearch. Otherwise, put yes.
  • video:tag: a very brief description of the key concepts related to your video. Create a separate element for each tag you use, up to 32 tags.
  • video:category: the broad topic your video covers, such as SEO, Digital Marketing, or Advertising.
  • video:restriction: a list of countries in which the video cannot be played, or a list of only the countries in which users can access the video, depending on whether you have set the relationship attribute to allow or deny. The list is delimited by spaces and uses ISO 3166 country codes. If you don’t use this tag, your video is assumed to be available everywhere.
  • video:gallery_loc: the URL where you can find the collection your video appears in, if there is one. Each video can have only one gallery_loc tag. If your gallery has a title, you can add the title attribute.
  • video:price: the price to download the video. The currency attribute is required and uses the ISO 4217 currency code. Add the optional type attribute to specify whether the download is to be bought or rented, and the resolution attribute to specify whether the video is in HD or SD. You can use the tag multiple times, once for each currency you accept.
  • video:requires_subscription: allowed values are yes and no, to indicate whether or not a subscription is required to view the video.
  • video:uploader: if your video is embedded from another video site, put the hostname here. This URL must be on the same domain as the loc tag.
  • video:platform: the platforms (web, mobile, and TV) from which the video may or may not be accessible. The relationship attribute defines whether the list is inclusive or exclusive. You can only have one platform tag per video.
  • video:live: whether the video is a live stream or not. Only yes or no is valid.
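
To show how such entries fit together, here is a hedged Python sketch that builds a single url entry with a nested video block. It assumes Google’s video sitemap namespace and uses only a handful of tags (thumbnail_loc, title, description, player_loc, duration); the function name, the URLs, and the example values are hypothetical. The duration is given in seconds, since 28800 seconds is 8 hours.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"


def video_url_entry(page_url, title, description, thumbnail_url,
                    player_url, duration_seconds):
    """Build a <url> element describing one video embedded on page_url."""
    # Range as given in the list above: 0 to 28800 seconds (8 hours).
    if not 0 <= duration_seconds <= 28800:
        raise ValueError("duration must be between 0 and 28800 seconds")
    url = ET.Element(f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
    fields = (
        ("thumbnail_loc", thumbnail_url),
        ("title", title),
        ("description", description),
        ("player_loc", player_url),
        ("duration", str(duration_seconds)),
    )
    for tag, value in fields:
        ET.SubElement(video, f"{{{VIDEO_NS}}}{tag}").text = value
    return url


if __name__ == "__main__":
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("video", VIDEO_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    urlset.append(video_url_entry(
        "https://example.com/videos/intro/",
        "Introduction to sitemaps",
        "A short overview of XML sitemaps.",
        "https://example.com/thumbs/intro.jpg",
        "https://example.com/player?video=intro",
        90,
    ))
    print(ET.tostring(urlset, encoding="unicode"))
```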

What are HTML Sitemaps

Another type of sitemap is an HTML sitemap.

The purpose of the HTML sitemap is to allow users to easily navigate and find the pages on your website.

Like an XML sitemap, the HTML sitemap lists all the pages on your website that you want to be indexed in Google.

While these are built for users and not bots, HTML sitemaps are also very useful for Google bots to crawl and understand your website pages.
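
Because an HTML sitemap is simply a page of links, it can be generated from the same list of pages you feed into your XML sitemap. The Python sketch below is one possible approach, with a hypothetical function name and hypothetical page titles and URLs; it escapes both the titles and the URLs so that special characters do not break the markup.

```python
from html import escape


def html_sitemap(pages):
    """Build an HTML list of links from (title, url) pairs."""
    items = "\n".join(
        f'  <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in pages
    )
    return f"<ul>\n{items}\n</ul>"


if __name__ == "__main__":
    print(html_sitemap([
        ("Home", "https://example.com/"),
        ("Products & Services", "https://example.com/products/"),
    ]))
```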

Conclusion

When done right, XML sitemaps help search engines find, crawl, and index websites quickly. Make sure you have correctly formatted, compressed, and submitted your XML sitemap to search engines to get the maximum benefit:

You no longer need to rely only on links for your site’s pages to be crawled, because search engines will see new or updated pages faster.

Bots can crawl pages more intelligently thanks to the meta-information available in sitemaps.
