XML Sitemap for Large Sites
In my previous two articles (How to Generate XML Sitemap for Your Website and Submit XML Sitemap to Google, Yahoo and MSN), I explained how you can generate XML sitemaps and submit them to search engines.
What if your website has a large number of pages? How do you create and submit XML sitemaps for large websites (e.g., e-commerce, publishing, price comparison sites)?
An XML sitemap file can contain up to 50,000 URLs and should be smaller than 10 MB. If your site exceeds either limit, you need to break your sitemap into several smaller XML sitemaps and then list all of them in a sitemap index file. Please note that an XML sitemap index file cannot contain more than 1,000 XML sitemaps.
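To make the arithmetic concrete, here is a minimal Python sketch that estimates how many sitemap files a site needs and whether they fit in a single index file. The limits are the ones quoted above. The function names and the page counts are made up for illustration, and the sketch only counts URLs; it does not check the file-size limit.

```python
import math

# Limits as quoted in this article
MAX_URLS_PER_SITEMAP = 50_000
MAX_SITEMAPS_PER_INDEX = 1_000


def sitemaps_needed(total_urls):
    """Number of sitemap files required for the given URL count."""
    return math.ceil(total_urls / MAX_URLS_PER_SITEMAP)


def fits_in_one_index(total_urls):
    """True if all required sitemaps can be listed in one index file."""
    return sitemaps_needed(total_urls) <= MAX_SITEMAPS_PER_INDEX


# Hypothetical site with 120,000 pages
print(sitemaps_needed(120_000))       # 3
print(fits_in_one_index(120_000))     # True
# Hypothetical site with 60 million pages: 1,200 sitemaps are too many
print(fits_in_one_index(60_000_000))  # False
```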
The XML sitemap index file has a format similar to that of a regular XML sitemap file. It uses the following XML tags:
sitemapindex: the parent tag that surrounds the file and contains the individual sitemaps
sitemap: the individual XML Sitemap
loc: the location of the sitemap
lastmod: the last modified date of the sitemap
Sample XML Sitemap Index
Here is an example of what an XML sitemap index file looks like:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.yoursite.com/sitemap.xml</loc>
    <lastmod>2008-12-06T13:51:06+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.yoursite.com/sitemap2.xml</loc>
    <lastmod>2008-12-06T13:51:06+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.yoursite.com/sitemap3.xml</loc>
    <lastmod>2008-12-06T13:51:06+00:00</lastmod>
  </sitemap>
</sitemapindex>
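An index file like this one can also be produced programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name build_sitemap_index is hypothetical, and the URLs and dates are simply the sample values from the example above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries):
    """Return the index file as UTF-8 bytes.

    entries: iterable of (loc, lastmod) string pairs.
    """
    # Setting xmlns as a plain attribute writes the default namespace as-is
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        ET.SubElement(sitemap, "lastmod").text = lastmod
    # xml_declaration requires Python 3.8 or later
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


# Sample values taken from the example index file
lastmod = "2008-12-06T13:51:06+00:00"
index_xml = build_sitemap_index([
    ("http://www.yoursite.com/sitemap.xml", lastmod),
    ("http://www.yoursite.com/sitemap2.xml", lastmod),
    ("http://www.yoursite.com/sitemap3.xml", lastmod),
])
print(index_xml.decode("utf-8"))
```

Building the file through an XML library rather than by string concatenation means special characters in URLs, such as &, are escaped automatically.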
You can write a script to automate XML sitemap generation and updates. The script can append new pages to the XML sitemap as they are created. However, you need to set up a rule to make sure that no sitemap exceeds its maximum size. Once a sitemap has reached its quota, a new one should be created and added to the XML sitemap index file.
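One possible shape for such a script is sketched below in Python. The RollingSitemap class, its method names, and the sitemap_index.xml filename are assumptions made for illustration. The sitemap.xml, sitemap2.xml, sitemap3.xml naming follows the sample above. The sketch enforces only the URL-count quota, not the file-size limit, and it omits the optional lastmod tag.

```python
import os
import tempfile
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'


class RollingSitemap:
    """Hypothetical helper: starts a new sitemap when the current one is full."""

    def __init__(self, directory, base_url, max_urls=50_000):
        self.directory = directory
        self.base_url = base_url.rstrip("/")
        self.max_urls = max_urls
        self.files = [[]]  # one list of URLs per sitemap file

    def add(self, url):
        # Roll over to a new sitemap once the current one has reached its quota
        if len(self.files[-1]) >= self.max_urls:
            self.files.append([])
        self.files[-1].append(url)

    def filename(self, i):
        return "sitemap.xml" if i == 0 else f"sitemap{i + 1}.xml"

    def write(self):
        # Write each sitemap file, then the index that lists them all
        for i, urls in enumerate(self.files):
            body = "".join(
                f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls
            )
            self._save(
                self.filename(i),
                f'<urlset xmlns="{SITEMAP_NS}">\n{body}</urlset>\n',
            )
        entries = "".join(
            f"  <sitemap><loc>{self.base_url}/{self.filename(i)}</loc></sitemap>\n"
            for i in range(len(self.files))
        )
        self._save(
            "sitemap_index.xml",
            f'<sitemapindex xmlns="{SITEMAP_NS}">\n{entries}</sitemapindex>\n',
        )

    def _save(self, name, xml_body):
        path = os.path.join(self.directory, name)
        with open(path, "w", encoding="utf-8") as f:
            f.write(XML_HEADER + xml_body)


# Demo with a tiny quota of 2 URLs so the rollover is easy to see
out_dir = tempfile.mkdtemp()
sitemaps = RollingSitemap(out_dir, "http://www.yoursite.com", max_urls=2)
for n in range(5):
    sitemaps.add(f"http://www.yoursite.com/page{n}")
sitemaps.write()
print(sorted(os.listdir(out_dir)))
```

With a quota of 2, the five demo URLs are spread over three sitemap files plus the index. In a real deployment the quota would stay at the default of 50,000, and the script would need to persist its state between runs.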
Nima Asrar Haghighi, SEO Expert