XML Sitemap Counter & Analyzer

Because opening 47 child sitemaps in browser tabs is not a workflow. This is the script you'd write — except you don't have to.

Paste the URL here — it follows every child, counts the URLs, breaks them down by directory, flags duplicates, checks lastmod coverage, and gives you a summary you can copy straight into the audit doc.


Works with both <urlset> sitemaps and <sitemapindex> files. Index files are followed automatically.

How to Use This Tool

Paste a URL, get a count. Here's the full workflow.

01

Paste the sitemap URL

The full URL: /sitemap.xml, /sitemap_index.xml, whatever your CMS generates. If you don't know where it is, check robots.txt. Most sites declare it there with a Sitemap: directive, often at the bottom of the file.

02

Hit Count URLs

The tool fetches the XML. If it's a sitemap index, it follows every child <loc> automatically — five at a time, in parallel. You'll see a progress bar tracking which child it's on.

03

Read the output

You get the full picture in one view: total and unique URL counts, duplicate detection, lastmod coverage with staleness flags, and a dominant directory callout if one section dominates the sitemap. Switch between the directory breakdown and per-sitemap split using the tabs. Copy the report or export all URLs as a plain list.
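
The "five at a time, in parallel" fetching from step 02 can be sketched in a few lines of Python. This is not the tool's own code, just a minimal illustration of the pattern: fetch_children, http_fetch, and the 30-second timeout are all assumptions made for the example.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional


def http_fetch(url: str) -> str:
    """Hypothetical fetcher: download one sitemap and return it as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def fetch_children(
    child_urls: List[str],
    fetch: Callable[[str], str] = http_fetch,
    max_parallel: int = 5,
) -> Dict[str, Optional[str]]:
    """Fetch every child sitemap, running at most max_parallel requests at once."""

    def safe_fetch(url: str) -> Optional[str]:
        # One unreachable child should not abort the whole run.
        try:
            return fetch(url)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map returns results in input order, so they line up with child_urls.
        bodies = list(pool.map(safe_fetch, child_urls))
    return dict(zip(child_urls, bodies))
```

A child that fails to load comes back as None, so you can report it instead of silently undercounting.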

Firewalled or staging sitemaps?

If URL mode can't reach your sitemap (staging environment, IP restrictions, localhost), switch to Paste XML and drop in the raw XML directly. Paste XML works for <urlset> sitemaps. If you paste a <sitemapindex>, the child sitemaps won't be fetched, so for index files use URL mode or paste each child sitemap individually.

When You'd Actually Reach for This

This isn't a tool you run every day. It's the tool you reach for when someone needs a number and you don't want to write a script to get it.

Technical audits

How many URLs is this site submitting via sitemaps? If the number is wildly different from the number of indexable pages, something is wrong — noindexed pages in the sitemap, redirects, parameter URLs that shouldn't be there.

Pre-migration baseline

Get a count before the migration. Run it again after. If the post-migration number is dramatically different, something got dropped or duplicated. The per-directory breakdown makes it obvious which section lost URLs.
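
If you save the per-directory counts from both runs, the comparison is a small dictionary diff. A rough sketch, assuming you've copied the counts into two dicts by hand; directory_deltas and every number below are made up for illustration.

```python
from typing import Dict


def directory_deltas(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return after-minus-before URL counts for every directory seen in either run."""
    directories = sorted(set(before) | set(after))
    return {d: after.get(d, 0) - before.get(d, 0) for d in directories}


# Invented example counts, not real data.
before = {"/blog/": 1200, "/products/": 8400, "/help/": 310}
after = {"/blog/": 1200, "/products/": 6150, "/support/": 310}

for directory, delta in directory_deltas(before, after).items():
    if delta:
        print(f"{directory:<12} {delta:+d}")
```

A directory that vanishes on one side and a new one that appears with the same count on the other usually means a rename, not a loss.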

Sitemap bloat

200,000 URLs in the sitemap but only 40,000 get traffic. Ecommerce sites are prone to this — product variants, filtered category pages, out-of-stock items accumulating. The directory breakdown shows where the bloat is concentrated.

Competitor analysis

Most sitemaps are public. Paste a competitor's sitemap URL and see how many URLs they're submitting, how their content is structured across directories, and how recently their sitemaps were updated.

XML Sitemaps and What This Tool Analyzes

An XML sitemap is a structured file that lists URLs you want search engines to discover. It's not a crawl instruction — it's a hint. This tool counts every <loc> element, detects duplicates across child sitemaps, breaks URLs down by top-level directory, checks <lastmod> coverage and staleness, and flags when a single directory dominates. The number of URLs in your sitemap is not the same as what Google has indexed — that's a different question, and you'd need Search Console for it.
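
For a sense of what that analysis involves, here is a minimal Python sketch that takes a flat list of <loc> values and produces the same kind of summary. It is not the tool's implementation: the function names, the grouping rule (first path segment), and the 50% dominance threshold are assumptions picked for the example.

```python
from collections import Counter
from typing import Dict, Iterable
from urllib.parse import urlparse


def top_level_directory(url: str) -> str:
    """Return the first path segment as "/segment/", or "/" for root-level URLs."""
    path = urlparse(url).path
    segments = [part for part in path.split("/") if part]
    if not segments:
        return "/"
    # "/about" is a page at the root, while "/blog/" and "/blog/post" live in /blog/
    if len(segments) == 1 and not path.endswith("/"):
        return "/"
    return f"/{segments[0]}/"


def summarize(urls: Iterable[str], dominance_threshold: float = 0.5) -> Dict[str, object]:
    """Count total and unique URLs, list duplicates, and group by directory."""
    url_list = list(urls)
    seen = Counter(url_list)
    # Group unique URLs only, so duplicates don't inflate a directory.
    by_directory = Counter(top_level_directory(u) for u in seen)
    dominant = None
    if by_directory:
        directory, count = by_directory.most_common(1)[0]
        if count / len(seen) > dominance_threshold:
            dominant = directory
    return {
        "total": len(url_list),
        "unique": len(seen),
        "duplicates": sorted(u for u, n in seen.items() if n > 1),
        "by_directory": dict(by_directory),
        "dominant_directory": dominant,
    }
```

Duplicates here are exact string matches. Whether to treat trailing-slash or case variants as the same URL is a separate decision this sketch doesn't make.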

Single sitemaps

A single sitemap uses <urlset> as its root element and lists URLs directly. Each URL sits inside a <url> block with a required <loc> and optional <lastmod>, <changefreq>, and <priority>. The tool counts every <loc> here.
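
If you want to pull those entries out yourself, Python's standard library is enough. A minimal sketch, assuming the sitemap uses the standard sitemaps.org namespace; extract_url_entries is a name invented for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Namespace declared by standard sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_url_entries(xml_text: str) -> List[Dict[str, Optional[str]]]:
    """Return one {"loc", "lastmod"} dict per <url> block in a <urlset>."""
    root = ET.fromstring(xml_text)
    entries = []
    for url_el in root.findall("sm:url", SITEMAP_NS):
        loc = url_el.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url_el.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        if loc:  # <loc> is required, so skip blocks without one
            entries.append({"loc": loc, "lastmod": lastmod.strip() if lastmod else None})
    return entries
```

The namespace mapping matters: without it, findall("url") matches nothing in a namespaced sitemap and you get a count of zero.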

Sitemap indexes

A sitemap index uses <sitemapindex> as its root and lists child sitemaps instead of individual URLs. The tool detects this, fetches every child automatically, then counts the <loc> elements across all of them. One child per content type is the typical pattern on large sites.
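
Telling the two file types apart comes down to the root element's name. A short sketch of that check, with classify_sitemap and local_name as illustrative names and no claim that the tool does it this way:

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def local_name(tag: str) -> str:
    """Strip the "{namespace}" prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def classify_sitemap(xml_text: str) -> Tuple[str, List[str]]:
    """Return ("urlset" or "sitemapindex", list of <loc> values)."""
    root = ET.fromstring(xml_text)
    kind = local_name(root.tag)
    if kind not in ("urlset", "sitemapindex"):
        raise ValueError(f"not a sitemap: root element is <{kind}>")
    # <urlset> wraps each <loc> in <url>; <sitemapindex> wraps it in <sitemap>.
    parent = "url" if kind == "urlset" else "sitemap"
    locs = []
    for entry in root:
        if local_name(entry.tag) != parent:
            continue
        for child in entry:
            if local_name(child.tag) == "loc" and child.text:
                locs.append(child.text.strip())
    return kind, locs
```

When the result is "sitemapindex", the returned locations are child sitemap URLs to fetch next, not pages to count.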

Limits

Each individual sitemap can contain a maximum of 50,000 URLs and must not exceed 50 MB uncompressed. If you hit either limit, split across multiple files and reference them from a sitemap index. An index file has its own cap under the sitemaps.org protocol: up to 50,000 child sitemaps, and the same 50 MB uncompressed size limit.
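
Splitting on the 50,000-URL limit is mechanical. A minimal sketch, assuming you already have the full URL list in memory; chunk_urls, build_index, and the example file names are invented here, and the 50 MB size check is left out.

```python
from typing import Iterator, List, Sequence
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls: Sequence[str], size: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` URLs, one per child sitemap."""
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


def build_index(child_sitemap_urls: Sequence[str]) -> str:
    """Build a <sitemapindex> document that references each child sitemap."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for child in child_sitemap_urls:
        # escape() turns & into &amp; so query strings stay valid XML
        lines.append(f"  <sitemap><loc>{escape(child)}</loc></sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines)
```

With 120,001 URLs, chunk_urls yields three lists, and build_index then needs three child locations such as a hypothetical /sitemap-1.xml through /sitemap-3.xml.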

Lastmod

The <lastmod> tag tells search engines when a URL was last meaningfully changed. Google uses it, but only if it's consistently accurate. If your CMS bumps <lastmod> on every page when you change a footer, the dates stop matching real content changes and Google is likely to stop trusting them.

Sitemap count ≠ indexed count.

Google decides what to index, not your sitemap. If you're submitting 50,000 URLs but only 12,000 are indexed, the gap is worth investigating, but this tool only gives you the submission side of that equation. For the indexed side, check Search Console's Page indexing report (formerly Index Coverage).

Frequently Asked Questions

How many URLs should be in my sitemap?
As many as you have indexable pages, and no more. Every URL in your sitemap should return a 200 status, have a self-referencing canonical, and not be blocked by robots.txt or noindex. If your sitemap count is much higher than your indexed page count in Search Console, that gap is worth investigating: use Search Console's Page indexing report (formerly Index Coverage) to find out what's being excluded and why.
Where do I find my sitemap URL?
Check yourdomain.com/robots.txt — most sites declare their sitemap there with a Sitemap: directive. If it's not there, try /sitemap.xml, /sitemap_index.xml, or check Google Search Console under Sitemaps. WordPress sites usually generate theirs at /wp-sitemap.xml or via a plugin like Yoast at /sitemap_index.xml.
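
If you're checking a lot of sites, the robots.txt lookup is easy to script. A minimal sketch that works on robots.txt text you've already downloaded; sitemap_urls_from_robots is a made-up helper name.

```python
from typing import List


def sitemap_urls_from_robots(robots_txt: str) -> List[str]:
    """Return every URL declared with a Sitemap: directive, in file order."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, separator, value = line.partition(":")
        # The directive name is case-insensitive; the URL keeps its own colons.
        if separator and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls
```

Python 3.8+ also ships urllib.robotparser.RobotFileParser.site_maps(), which does the fetch and the parse for you if you'd rather not roll your own.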
Does this tool fetch sitemaps from my server?
Yes — when you use the URL mode, the tool fetches your sitemap through a lightweight proxy. The proxy reads the XML and passes it to your browser for parsing. No data is stored. If you'd rather not have anything fetched, use the Paste XML mode and paste the raw XML directly.
Why is my URL count different from what Google shows as indexed?
This tool counts URLs in your sitemap — what you're submitting. Google's index count is what they've chosen to index. The two numbers are almost never the same. Google may not index every URL you submit (thin content, duplicates, low quality), and it may index pages you didn't include in your sitemap (discovered through links). The gap between the two is one of the most useful things to investigate in a technical audit.
Can I count URLs in a gzipped sitemap?
Not directly. The tool handles plain XML. If your sitemaps are served as .xml.gz, try pasting the URL first: some servers deliver the file with gzip applied as a transfer encoding, in which case it arrives already decompressed. If that doesn't work, download the file, decompress it, and use Paste XML mode.
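
If you go the download route, decompressing takes two lines of Python. A small sketch, assuming the sitemap is UTF-8; decode_sitemap_bytes is a name invented for the example.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def decode_sitemap_bytes(raw: bytes) -> str:
    """Return sitemap XML as text, decompressing first if the bytes are gzipped."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")
```

Checking the magic bytes instead of the file extension means the same function handles a .xml.gz download and a plain .xml file.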
What does the "older than 1 year" lastmod warning mean?
It means URLs in your sitemap have a <lastmod> date more than 12 months old. That's not inherently a problem — some pages genuinely don't change. But if a large percentage of your sitemap has stale lastmod dates, Google may deprioritise recrawling those URLs. If the content has been updated but the lastmod hasn't, your CMS isn't updating the timestamps correctly.
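
To see how a coverage and staleness check like this can work, here is a minimal Python sketch. The function names, the 365-day cutoff, and the choice to treat dates without a timezone as UTC are assumptions for the example, not a description of the tool's internals.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, Optional


def parse_lastmod(value: str) -> Optional[datetime]:
    """Parse values like 2024-05-01 or 2024-05-01T12:00:00Z; return None if unparseable."""
    try:
        parsed = datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: treat naive dates as UTC
    return parsed


def lastmod_report(
    lastmods: Iterable[Optional[str]],
    now: Optional[datetime] = None,
    stale_after_days: int = 365,
) -> Dict[str, float]:
    """Count how many URLs carry a usable lastmod and how many of those are stale."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=stale_after_days)
    total = with_lastmod = stale = 0
    for raw in lastmods:
        total += 1
        parsed = parse_lastmod(raw) if raw else None
        if parsed is None:
            continue  # missing or malformed lastmod counts against coverage
        with_lastmod += 1
        if parsed < cutoff:
            stale += 1
    coverage = with_lastmod / total if total else 0.0
    return {"total": total, "with_lastmod": with_lastmod, "stale": stale, "coverage": coverage}
```

Pass one entry per URL, using None where the <lastmod> tag is absent, and compare stale against with_lastmod to get the stale percentage.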