What Is an XML Sitemap?
An XML sitemap (sitemap.xml) is a file that lists your site's pages in XML format. It communicates your site structure and page locations to search engine crawlers, enabling efficient crawling and indexing.
For large sites or new sites, crawlers may not discover all pages through internal links alone. A sitemap bridges this gap by explicitly listing every page you want indexed.
Basic sitemap.xml structure:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-04-25</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2026-04-20</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/blog/article-1</loc>
    <lastmod>2026-04-18</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.6</priority>
  </url>
</urlset>
Why Are Sitemaps Important?
Google uses sitemaps as an important signal to discover pages. New pages and pages with few inbound links are often discovered through sitemaps rather than crawling alone.
Maximum URLs per sitemap file: 50,000
Maximum file size (uncompressed): 50 MB
Average indexing rate increase after submission (large sites): +26%
How to Write a Sitemap (XML Structure)
Sitemaps use standard XML format with the following elements. Only three tags are required; the rest are optional.
| Tag | Description | Required |
|---|---|---|
| <urlset> | Root element of the sitemap. Declares the sitemap protocol via the xmlns attribute. | Yes |
| <url> | Wraps each individual URL entry. Multiple <url> elements are placed inside <urlset>. | Yes |
| <loc> | The full URL of the page. Must be an absolute URL including https://. | Yes |
| <lastmod> | Last modification date of the page, in W3C Datetime format: either a date (YYYY-MM-DD) or a full timestamp with time zone (for example 2026-04-25T09:30:00+00:00). | No |
| <changefreq> | Hint about update frequency. Options: always, hourly, daily, weekly, monthly, yearly, never. Google ignores this value; other crawlers may treat it as a hint. | No |
| <priority> | Priority relative to other URLs on the same site, from 0.0 to 1.0 (default 0.5). Google ignores this value. | No |
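The tags in the table can be emitted from a small helper. What follows is a minimal sketch, not a definitive implementation: the `SitemapEntry` shape and the function names `escapeXml`, `toW3CDate`, and `buildSitemap` are assumptions made for illustration. It writes only `<loc>` and `<lastmod>`, and it entity-escapes URLs, which the sitemap protocol requires for characters such as `&`.

```typescript
// Hypothetical entry shape for this sketch; only `loc` is required by the protocol.
interface SitemapEntry {
  loc: string;     // absolute URL
  lastmod?: Date;  // real last-modified time, if known
}

// Escape the characters that must be entity-encoded inside XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Format a Date as YYYY-MM-DD in UTC, one valid W3C Datetime form.
function toW3CDate(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Build a complete <urlset> document from the given entries.
function buildSitemap(entries: SitemapEntry[]): string {
  const urls = entries.map((entry) => {
    const lines = [`    <loc>${escapeXml(entry.loc)}</loc>`];
    if (entry.lastmod) {
      lines.push(`    <lastmod>${toW3CDate(entry.lastmod)}</lastmod>`);
    }
    return `  <url>\n${lines.join("\n")}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
  ].concat(urls, ["</urlset>"]).join("\n");
}

const exampleSitemap = buildSitemap([
  { loc: "https://example.com/", lastmod: new Date("2026-04-25T00:00:00Z") },
  { loc: "https://example.com/search?q=a&page=2" },
]);
console.log(exampleSitemap);
```

The second entry shows why escaping matters: the `&` in the query string is written as `&amp;`, so the file stays well-formed XML.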
For sites that exceed 50,000 URLs (or 50 MB) in a single file, split the URLs across multiple sitemaps and list them in a sitemap index file.
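Splitting by URL count can be sketched as follows. This is an illustration under stated assumptions: `chunkUrls` and `buildSitemapIndex` are hypothetical helper names, the `sitemap-N.xml` naming scheme is invented for the example, and the 50 MB size limit is not checked here.

```typescript
const MAX_URLS_PER_SITEMAP = 50000;

// Split a URL list into groups that each fit within the per-file URL limit.
function chunkUrls(urls: string[], size: number = MAX_URLS_PER_SITEMAP): string[][] {
  const chunks: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    chunks.push(urls.slice(i, i + size));
  }
  return chunks;
}

// Build a sitemap index with one <sitemap> entry per chunk.
// The sitemap-N.xml file names are an assumption of this sketch.
function buildSitemapIndex(baseUrl: string, chunkCount: number): string {
  const items: string[] = [];
  for (let n = 1; n <= chunkCount; n++) {
    items.push(`  <sitemap>\n    <loc>${baseUrl}/sitemap-${n}.xml</loc>\n  </sitemap>`);
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
  ].concat(items, ["</sitemapindex>"]).join("\n");
}

// 120,001 synthetic URLs need three files: 50,000 + 50,000 + 20,001.
const allUrls: string[] = [];
for (let i = 0; i < 120001; i++) {
  allUrls.push(`https://example.com/item/${i}`);
}
const urlChunks = chunkUrls(allUrls);
const indexXml = buildSitemapIndex("https://example.com", urlChunks.length);
console.log(urlChunks.map((chunk) => chunk.length)); // [ 50000, 50000, 20001 ]
```

Each chunk would then be written out as its own `<urlset>` file under the name the index refers to.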
Sitemap index example:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2026-04-25</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-blog.xml</loc>
    <lastmod>2026-04-23</lastmod>
  </sitemap>
</sitemapindex>
Best Practices for Sitemap Setup
Place at root directory (/sitemap.xml)
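Host your sitemap at the domain root (https://example.com/sitemap.xml). This is the conventional location, and under the sitemap protocol a file at the root can list URLs from anywhere on the host, whereas a sitemap in a subdirectory is by default limited to URLs under that directory. It also keeps robots.txt references simple.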
Add sitemap URL to robots.txt
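Add the line 'Sitemap: https://example.com/sitemap.xml' to your robots.txt. The line can appear anywhere in the file; the end is a common choice. Any crawler that reads robots.txt, including non-Google crawlers such as Bingbot, can then discover your sitemap automatically.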
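To confirm what a crawler would find, the Sitemap lines can be pulled out of a robots.txt body. The following is a minimal sketch: `extractSitemapUrls` is a hypothetical helper, the robots.txt content is passed in as a string (fetching it is left out), and matching the field name case-insensitively is a lenient choice made for this example.

```typescript
// Return every URL declared on a "Sitemap:" line in a robots.txt body.
function extractSitemapUrls(robotsTxt: string): string[] {
  const found: string[] = [];
  const lines = robotsTxt.split(/\r?\n/);
  for (const rawLine of lines) {
    // Drop trailing comments, then surrounding whitespace.
    const line = rawLine.replace(/#.*$/, "").trim();
    const match = /^sitemap\s*:\s*(\S+)/i.exec(line);
    if (match) {
      found.push(match[1]);
    }
  }
  return found;
}

const sampleRobots = [
  "User-agent: *",
  "Disallow: /admin/",
  "",
  "Sitemap: https://example.com/sitemap.xml",
  "sitemap: https://example.com/sitemap-blog.xml  # lowercase field name",
].join("\n");

const declaredSitemaps = extractSitemapUrls(sampleRobots);
console.log(declaredSitemaps);
```

For the sample above, both sitemap URLs are returned in file order, and the `Disallow` line is ignored.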
Submit via Google Search Console
Submit your sitemap URL through the 'Sitemaps' section in Search Console. You can monitor submission status, errors, and indexed URL counts from the dashboard.
Exclude noindex and non-canonical URLs
Remove pages with noindex meta tags or canonical tags pointing elsewhere. Conflicting signals between your sitemap and on-page directives reduce crawl efficiency.
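This filtering step can be sketched as a function over page records. The `PageRecord` shape and `selectSitemapUrls` are assumptions for illustration; in practice the `noindex` and `canonicalUrl` fields would come from your CMS or from a crawl of the site.

```typescript
// Hypothetical page record; the fields would come from a CMS or a crawl.
interface PageRecord {
  url: string;
  noindex: boolean;       // page carries a noindex directive
  canonicalUrl?: string;  // value of rel="canonical", if the page has one
}

// Keep only pages that are indexable and are their own canonical URL.
function selectSitemapUrls(pages: PageRecord[]): string[] {
  return pages
    .filter((page) => !page.noindex)
    .filter((page) => page.canonicalUrl === undefined || page.canonicalUrl === page.url)
    .map((page) => page.url);
}

const crawledPages: PageRecord[] = [
  { url: "https://example.com/shoes", noindex: false, canonicalUrl: "https://example.com/shoes" },
  { url: "https://example.com/shoes?sort=price", noindex: false, canonicalUrl: "https://example.com/shoes" },
  { url: "https://example.com/cart", noindex: true },
];

const sitemapUrls = selectSitemapUrls(crawledPages);
console.log(sitemapUrls); // [ "https://example.com/shoes" ]
```

The sorted variant is dropped because its canonical points elsewhere, and the cart page is dropped because it is marked noindex.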
Set accurate lastmod dates
Use the actual last modification date for each page. Setting the same date for all pages, or changing dates without real updates, can cause Google to stop trusting your lastmod values.
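One way to keep lastmod honest is to change it only when the page content changes. The sketch below assumes you store a content fingerprint (for example a hash computed elsewhere) next to each URL; `StoredPage`, `fingerprint`, and `nextLastmod` are hypothetical names introduced for this example.

```typescript
// Hypothetical stored state for a page from the previous sitemap build.
interface StoredPage {
  url: string;
  fingerprint: string;  // content hash computed elsewhere (assumption)
  lastmod: string;      // YYYY-MM-DD
}

// Return the lastmod to publish: keep the stored date unless the content changed.
function nextLastmod(
  previous: StoredPage | undefined,
  currentFingerprint: string,
  today: Date
): string {
  if (previous !== undefined && previous.fingerprint === currentFingerprint) {
    return previous.lastmod;
  }
  // New page or changed content: stamp with today's date (UTC).
  return today.toISOString().slice(0, 10);
}

const buildDay = new Date("2026-04-25T12:00:00Z");
const storedAbout: StoredPage = {
  url: "https://example.com/about",
  fingerprint: "abc123",
  lastmod: "2026-03-01",
};

console.log(nextLastmod(storedAbout, "abc123", buildDay)); // 2026-03-01 (unchanged)
console.log(nextLastmod(storedAbout, "def456", buildDay)); // 2026-04-25 (content changed)
console.log(nextLastmod(undefined, "abc123", buildDay));   // 2026-04-25 (new page)
```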
Common Mistakes to Avoid
Including non-existent URLs (404 pages)
URLs that return 404 waste crawl requests and are reported as errors in Search Console. Regularly validate the URLs in your sitemap and remove deleted pages.
Including noindex pages
A sitemap signals 'please index this,' while noindex says 'don't index.' This contradiction triggers a 'Submitted URL marked noindex' error in Search Console.
Setting the same lastmod date for all pages
Using identical lastmod values across all URLs can cause Google to distrust your lastmod data and ignore it entirely. Only update dates when content actually changes.
Not updating the sitemap
New pages that aren't reflected in your sitemap may not be discovered by crawlers. Use CMS integrations or automated generation to keep your sitemap current.
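Several of the mistakes above can be caught with a simple audit pass. This is a sketch under stated assumptions: `AuditedUrl` and `auditSitemap` are hypothetical, and the `httpStatus` and `noindex` values are expected to come from a crawl done elsewhere. It does not detect pages that are missing from the sitemap.

```typescript
// Hypothetical audit record; httpStatus and noindex come from a separate crawl.
interface AuditedUrl {
  loc: string;
  httpStatus: number;
  noindex: boolean;
  lastmod?: string;
}

// Report 404s, noindex pages, and a lastmod value shared by every entry.
function auditSitemap(entries: AuditedUrl[]): string[] {
  const problems: string[] = [];
  for (const entry of entries) {
    if (entry.httpStatus === 404) {
      problems.push(`404 in sitemap: ${entry.loc}`);
    }
    if (entry.noindex) {
      problems.push(`noindex in sitemap: ${entry.loc}`);
    }
  }
  // An identical lastmod on every entry suggests the dates are not real.
  const dates = entries
    .map((entry) => entry.lastmod)
    .filter((d): d is string => d !== undefined);
  if (dates.length > 1 && dates.every((d) => d === dates[0])) {
    problems.push(`all ${dates.length} lastmod values are ${dates[0]}`);
  }
  return problems;
}

const auditReport = auditSitemap([
  { loc: "https://example.com/", httpStatus: 200, noindex: false, lastmod: "2026-04-25" },
  { loc: "https://example.com/old", httpStatus: 404, noindex: false, lastmod: "2026-04-25" },
  { loc: "https://example.com/thanks", httpStatus: 200, noindex: true, lastmod: "2026-04-25" },
]);
console.log(auditReport);
```

For this sample the report contains three findings: the 404 URL, the noindex URL, and the shared lastmod date.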
Setup by CMS / Framework
WordPress (Yoast SEO / Rank Math)
WordPress SEO plugins such as Yoast SEO and Rank Math generate sitemaps automatically. Once the plugin is active, the sitemap index is available at /sitemap_index.xml. Sitemaps are split by content type (posts, pages, categories) and updated when new content is published.
Next.js (app/sitemap.ts)
In Next.js (App Router), create an app/sitemap.ts file that exports a function returning the URL list. Next.js serves the result at /sitemap.xml and, by default, generates it at build time. Use the MetadataRoute.Sitemap type for type-safe URL list generation.
// app/sitemap.ts
import type { MetadataRoute } from "next"

export default function sitemap(): MetadataRoute.Sitemap {
  return [
    {
      url: "https://example.com",
      // Use each page's real modification date; new Date() would stamp
      // every URL with the build time.
      lastModified: new Date("2026-04-25"),
      changeFrequency: "daily",
      priority: 1,
    },
    {
      url: "https://example.com/about",
      lastModified: new Date("2026-04-20"),
      changeFrequency: "monthly",
      priority: 0.8,
    },
  ]
}
Manual Creation
For static sites or small sites without a CMS, create the XML file directly in a text editor and upload it to your server root. This is the simplest approach for sites with few URLs. After creation, validate with a W3C validator or our checker tool to catch syntax errors.