Hreflang in 2025: The Mistakes That Keep Coming Back

Hreflang is one of those signals that looks straightforward in documentation and turns into a minefield in practice. After auditing implementations across dozens of enterprise sites — travel, e-commerce, media, SaaS — the same errors keep appearing. This is a frank look at what breaks and why.

Let me say upfront what hreflang is not: it is not a ranking signal, it does not help you rank better in a given country, and it does not replace a proper localisation strategy. What it does is help Google serve the correct language or regional version of your content to the right user. Get it wrong and Google ignores it. Get it catastrophically wrong and it creates indexation problems you won't notice until weeks later.

The return tag problem

Hreflang requires reciprocal annotations. Every page you list as an alternate must link back to the page that lists it. If your English page (en-GB) points to your French page (fr-FR), the French page must point back to the English page. If it doesn't, Google can ignore the unreciprocated annotation, and that part of the cluster is left unresolved.

In theory, this is simple. In practice, it breaks constantly, and for a specific reason: hreflang is usually implemented by a CMS, a template, or a feed. When a new market launches — a new country, a new language, a new regional variant — the return tags often don't get added immediately. They sit in a backlog. Meanwhile, the existing cluster is partially broken and Google is quietly making its own decisions about which version to serve.

Check this first: for any language or region you're targeting, crawl both directions. Confirm page A references page B and page B references page A. Missing return tags are the single most common failure mode I find in audits.
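The two-direction check can be sketched in a few lines of Python. This is a minimal illustration, assuming your crawler already gives you a mapping of page URL to its hreflang annotations; the `pages` structure, the function name, and the example.com URLs are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical crawl output: page URL -> {hreflang value: alternate URL}.
Annotations = Dict[str, Dict[str, str]]


def find_missing_return_tags(pages: Annotations) -> List[Tuple[str, str]]:
    """Return (source, target) pairs where source lists target as an
    alternate but target does not list source back."""
    missing = []
    for source, alternates in pages.items():
        for target in alternates.values():
            if target == source:
                continue  # self-reference, nothing to reciprocate
            # A target that was never crawled counts as missing too.
            back_links = pages.get(target, {}).values()
            if source not in back_links:
                missing.append((source, target))
    return sorted(missing)


pages = {
    "https://example.com/en-gb/": {
        "en-GB": "https://example.com/en-gb/",
        "fr-FR": "https://example.com/fr-fr/",
    },
    "https://example.com/fr-fr/": {
        "fr-FR": "https://example.com/fr-fr/",
        # no en-GB entry: the return tag is missing
    },
}
print(find_missing_return_tags(pages))
# -> [('https://example.com/en-gb/', 'https://example.com/fr-fr/')]
```

The sketch compares URLs as exact strings. A real audit would normalise trailing slashes, protocol, and host casing first, otherwise it reports false positives.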

Using the wrong language codes

Hreflang values use an ISO 639-1 language code, optionally followed by an ISO 3166-1 alpha-2 country code. The format is language or language-COUNTRY; a country code on its own is not valid. Sounds simple. Here's where it goes wrong:

Using language codes only when you need region specificity. en covers all English. If you have separate content for the UK and the US, you need en-GB and en-US — not two pages both tagged en.
Using inconsistent case. By convention, language codes are lowercase (en) and country codes are uppercase (GB). Google accepts both en-gb and en-GB, but mixed case coming from different template engines creates inconsistencies that are hard to audit.
Invented region codes. I've audited sites using en-EU, en-INTL, and en-GLOBAL. None of these are valid. Google ignores them silently.
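The three failure modes above lend themselves to an automated check. What follows is a rough sketch, not a complete validator: the function name is hypothetical, the two allowlists are small illustrative subsets rather than the full ISO lists, and script subtags such as zh-Hant are deliberately left out.

```python
import re

# Shape check only: two-letter language, optional two-letter region.
# Script subtags (e.g. zh-Hant) are deliberately out of scope for this sketch.
HREFLANG_SHAPE = re.compile(r"^([A-Za-z]{2})(?:-([A-Za-z]{2}))?$")

# Illustrative subsets, NOT the full ISO lists; load the real ones in practice.
KNOWN_LANGUAGES = {"en", "fr", "de", "es"}
KNOWN_REGIONS = {"GB", "US", "FR", "DE", "ES"}


def check_hreflang(value: str) -> str:
    """Classify an hreflang value as 'ok', 'inconsistent-case' or 'invalid'."""
    if value.lower() == "x-default":
        return "ok"
    match = HREFLANG_SHAPE.match(value)
    if not match:
        return "invalid"
    language, region = match.group(1), match.group(2)
    if language.lower() not in KNOWN_LANGUAGES:
        return "invalid"
    if region is not None and region.upper() not in KNOWN_REGIONS:
        return "invalid"
    # Conventional form: lowercase language, uppercase region.
    conventional = language.lower() + (f"-{region.upper()}" if region else "")
    return "ok" if value == conventional else "inconsistent-case"


for value in ["en-GB", "en-gb", "en", "en-EU", "en-INTL", "en-GLOBAL"]:
    print(value, check_hreflang(value))
```

Reporting inconsistent case separately from invalid values keeps the audit output honest: the first is a hygiene issue, the second is an annotation Google will not use.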

x-default: misunderstood and misapplied

x-default does not mean "the English version" or "whichever page is left over". It marks the fallback URL for users whose language or region matches none of your other annotations, and it is best suited to pages that handle language selection: home pages with language pickers, regional redirect pages, or international landing pages that route users based on browser settings or IP.

Common misuse patterns:

Pointing x-default at the English version when there's no language selector on that page
Declaring every page in the set as x-default, rather than pointing x-default at the single gateway page
Omitting x-default entirely on international landing pages where it's actually needed

The result of misuse is rarely catastrophic — but it means Google has less signal about your internationalisation intent, and the international gateway page may not get treated as such.
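To make the intended shape concrete, here is a small sketch that renders one cluster's annotations with a single x-default entry. It assumes https://example.com/ is a gateway page with a language selector; the function name and all URLs are placeholders.

```python
from html import escape
from typing import Dict, Optional


def render_hreflang_links(alternates: Dict[str, str],
                          x_default: Optional[str] = None) -> str:
    """Build the <link> block for one cluster. The same block goes on every
    page in the cluster; x-default appears once and points at one URL."""
    entries = sorted(alternates.items())
    if x_default is not None:
        entries.append(("x-default", x_default))
    return "\n".join(
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in entries
    )


print(render_hreflang_links(
    {
        "fr-FR": "https://example.com/fr-fr/",
        "en-GB": "https://example.com/en-gb/",
    },
    x_default="https://example.com/",  # assumed language-selector page
))
```

Generating the block from one source of truth per cluster, rather than per page, is what keeps the x-default entry from drifting between templates.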

Canonical conflicts

The interaction between hreflang and canonicals is where things get genuinely complicated. A page's canonical must point to itself for its hreflang annotations to be trusted. If page A canonicals to page B, Google treats page B as the version to index and generally ignores the hreflang annotations on page A.

This becomes a real problem when:

Paginated pages canonical to the first page in the series instead of to themselves
Tracking parameters create canonical chains
A CMS automatically adds cross-domain canonicals for syndicated content

Before diagnosing a hreflang problem, always check the canonical chain. A broken canonical is often masquerading as a hreflang problem.
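A first-pass canonical check is easy to automate. The sketch below assumes you have already extracted each page's canonical and hreflang annotations; the `Page` structure and function name are hypothetical, and URLs are compared as exact strings with no normalisation.

```python
from typing import Dict, List, NamedTuple, Optional


class Page(NamedTuple):
    canonical: Optional[str]   # value of rel="canonical", None if absent
    hreflang: Dict[str, str]   # hreflang value -> alternate URL


def find_canonical_conflicts(pages: Dict[str, Page]) -> List[str]:
    """Return URLs that carry hreflang annotations but canonical elsewhere."""
    conflicts = []
    for url, page in pages.items():
        if not page.hreflang:
            continue  # nothing to invalidate
        # A missing canonical is not flagged here; only a non-self one is.
        if page.canonical is not None and page.canonical != url:
            conflicts.append(url)
    return sorted(conflicts)


pages = {
    "https://example.com/en-gb/shoes?page=2": Page(
        canonical="https://example.com/en-gb/shoes",
        hreflang={"fr-FR": "https://example.com/fr-fr/chaussures?page=2"},
    ),
    "https://example.com/en-gb/shoes": Page(
        canonical="https://example.com/en-gb/shoes",
        hreflang={"fr-FR": "https://example.com/fr-fr/chaussures"},
    ),
}
print(find_canonical_conflicts(pages))
# -> ['https://example.com/en-gb/shoes?page=2']
```

Running this before any return-tag analysis filters out pages whose annotations would not be used anyway.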

Implementation method: HTML vs sitemap vs HTTP headers

All three implementation methods are valid according to the spec. In practice:

HTML <link> tags in <head>: most common, fine for sites under a few thousand URLs. Gets unwieldy as the language matrix grows.
XML sitemap: better for large sites. Centralised, easier to maintain, doesn't bloat individual page <head> elements. Requires the sitemap to be crawlable and submitted.
HTTP headers: for PDFs, images, and non-HTML resources. Don't use this for HTML pages when you have other options.
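For the sitemap route, the annotations live in xhtml:link elements inside each url entry. Here is a minimal generator using only the Python standard library; the function name and the shape of the `clusters` input are assumptions, and the URLs are placeholders.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Emit the sitemap namespace as default and xhtml with its usual prefix.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(clusters: List[Dict[str, str]]) -> str:
    """Each cluster maps hreflang value -> URL. Every distinct URL gets its
    own <url> entry listing all members of its cluster, itself included."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for cluster in clusters:
        # dict.fromkeys de-duplicates while keeping order, e.g. when
        # x-default reuses a URL that is already in the cluster.
        for url in dict.fromkeys(cluster.values()):
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
            for code, href in cluster.items():
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": code, "href": href},
                )
    return ET.tostring(urlset, encoding="unicode")


print(build_hreflang_sitemap([
    {
        "en-GB": "https://example.com/en-gb/",
        "fr-FR": "https://example.com/fr-fr/",
    },
]))
```

Because every entry is generated from the same cluster dictionary, return tags are complete by construction, which is the main practical advantage of the sitemap method on large sites.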

Choose one method and apply it consistently across the entire site. Mixing HTML tags and sitemap annotations for different sections of the same site creates ambiguity. I've seen Google appear to partially process both — which is worse than picking one and getting it right.

Before any hreflang implementation, audit your canonical setup. A broken canonical chain will nullify your hreflang work regardless of how clean the annotations are.

Scale makes everything harder

All of these problems are manageable on a 10-language, 5-market site. On a site with 50 markets, 15 languages, and 200,000 pages — the kind of site I regularly work on — they become systemic. Template errors affect millions of annotations simultaneously. A single CMS misconfiguration can silently break an entire language cluster for weeks.

At that scale, manual auditing isn't viable. You need automated monitoring: crawl-based checks for return tag completeness, canonical conflict detection, and language code validation — run on a schedule, not a one-off basis.
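The checks described above all start from the same raw material: the canonical and hreflang links on each crawled page. As one possible starting point, this sketch pulls them out of fetched HTML with the standard-library parser. The class and function names are hypothetical, and fetching, scheduling, and sitemap or header annotations are out of scope.

```python
from html.parser import HTMLParser
from typing import Dict, Optional


class LinkTagParser(HTMLParser):
    """Collects rel="canonical" and rel="alternate" hreflang <link> tags.
    It does not check that the tags sit inside <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None
        self.hreflang: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        href = attr.get("href", "")
        if "canonical" in rel and href:
            self.canonical = href
        elif "alternate" in rel and attr.get("hreflang") and href:
            self.hreflang[attr["hreflang"]] = href


def extract_annotations(html: str) -> LinkTagParser:
    """Parse one page's HTML and return the populated parser."""
    parser = LinkTagParser()
    parser.feed(html)
    parser.close()
    return parser


sample = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/en-gb/">'
    '<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/" />'
    '<link rel="alternate" hreflang="fr-FR" href="https://example.com/fr-fr/">'
    '</head><body></body></html>'
)
result = extract_annotations(sample)
print(result.canonical)
print(result.hreflang)
```

Storing the output per URL on every scheduled crawl also gives you a history to diff, which is how a template regression shows up within a day instead of weeks later.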

The good news is that hreflang errors are fixable. The bad news is that they don't announce themselves. The only way to know your implementation is healthy is to check it regularly.


Mags Sikora

Freelance SEO Consultant, SEO Director

Senior SEO Strategist with 18+ years leading search programmes for enterprise and global digital businesses. Director of SEO at Intrepid Digital.