Hreflang is one of those signals that looks straightforward in documentation and turns into a minefield in practice. After auditing implementations across dozens of enterprise sites — travel, e-commerce, media, SaaS — the same errors keep appearing. This is a frank look at what breaks and why.
Let me say upfront what hreflang is not: it is not a ranking signal, it does not help you rank better in a given country, and it does not replace a proper localisation strategy. What it does is help Google serve the correct language or regional version of your content to the right user. Get it wrong and Google ignores it. Get it catastrophically wrong and it creates indexation problems you won't notice until weeks later.
The return tag problem
Hreflang requires reciprocal annotations. Every page you declare as an alternate must link back to the page that declared it. If your English page (en-gb) points to your French page (fr-fr), the French page must point back to the English page. If it doesn't, Google can ignore the unreciprocated annotations, and in a badly broken cluster it may end up disregarding all of them.
In theory, this is simple. In practice, it breaks constantly, and for a specific reason: hreflang is usually implemented by a CMS, a template, or a feed. When a new market launches — a new country, a new language, a new regional variant — the return tags often don't get added immediately. They sit in a backlog. Meanwhile, the existing cluster is partially broken and Google is quietly making its own decisions about which version to serve.
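A return tag check is easy to sketch once the annotations are in one place. The following is a minimal illustration, not a production auditor: it assumes you have already extracted every page's hreflang annotations into a dictionary, and the function name, data shape, and example.com URLs are all hypothetical.

```python
def find_missing_return_tags(annotations):
    """Return sorted (source, target) pairs where target does not link back to source.

    `annotations` maps each page URL to a dict of {hreflang value: alternate URL}.
    This shape is an assumption made for the sketch.
    """
    missing = []
    for source, alternates in annotations.items():
        for target in alternates.values():
            if target == source:
                continue  # a self-reference needs no return tag
            # A target with no recorded annotations counts as missing its return tag.
            back_links = annotations.get(target, {}).values()
            if source not in back_links:
                missing.append((source, target))
    return sorted(missing)


EN = "https://example.com/en-gb/"
FR = "https://example.com/fr-fr/"
DE = "https://example.com/de-de/"

# Hypothetical cluster: the German market launched and the English page was
# updated, but the French template never received the new annotation.
annotations = {
    EN: {"en-GB": EN, "fr-FR": FR, "de-DE": DE},
    FR: {"en-GB": EN, "fr-FR": FR},
    DE: {"en-GB": EN, "fr-FR": FR, "de-DE": DE},
}

missing_pairs = find_missing_return_tags(annotations)
```

Here `missing_pairs` holds a single pair: the German page declares the French page as an alternate, and the French page never points back. That is the backlog scenario described above.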
Using the wrong language codes
Hreflang supports ISO 639-1 language codes and ISO 3166-1 alpha-2 country codes. The format is language or language-COUNTRY. Sounds simple. Here's where it goes wrong:
- Using language codes only when you need region specificity. en covers all English. If you have separate content for the UK and the US, you need en-GB and en-US, not two pages both tagged en.
- Using the wrong case. Language codes are lowercase (en), country codes are uppercase (GB). en-gb and en-GB are both valid in practice, but mixed case from different template engines creates inconsistencies that are hard to audit.
- Invented region codes. I've audited sites using en-EU, en-INTL, and en-GLOBAL. None of these are valid. Google ignores them silently.
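The checks in that list can be sketched as a small validator. Treat this as an illustration under assumptions: the code sets below are tiny samples I made up for the example, and a real audit would load the full ISO 639-1 and ISO 3166-1 alpha-2 lists. The pattern also deliberately ignores script subtags such as zh-Hant, so extend it if your site uses them.

```python
import re

# language or language-COUNTRY, matched case-insensitively so that case
# problems can be reported separately from format problems.
HREFLANG_PATTERN = re.compile(r"^([A-Za-z]{2})(?:-([A-Za-z]{2}))?$")

# Illustrative subsets only (assumption): not the full ISO lists.
SAMPLE_LANGUAGES = {"en", "fr", "de", "es"}
SAMPLE_REGIONS = {"GB", "US", "FR", "DE", "ES"}


def check_hreflang(value, languages=SAMPLE_LANGUAGES, regions=SAMPLE_REGIONS):
    """Return a list of problems found in one hreflang value (empty if none)."""
    if value.lower() == "x-default":
        return []
    match = HREFLANG_PATTERN.match(value)
    if not match:
        return ["not in language or language-COUNTRY format"]
    language, region = match.groups()
    problems = []
    if language.lower() not in languages:
        problems.append(f"unknown language code: {language}")
    if region and region.upper() not in regions:
        problems.append(f"unknown region code: {region}")
    # Valid but inconsistent casing is reported so templates can be aligned.
    if language != language.lower() or (region and region != region.upper()):
        problems.append("non-canonical case")
    return problems
```

With these sample sets, en-GB passes cleanly, en-gb is flagged only for case, en-EU is flagged for an unknown region, and en-INTL fails the format check outright.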
x-default: misunderstood and misapplied
x-default does not mean "default for all other countries". It is a signal for pages that handle language selection — home pages with language pickers, regional redirect pages, or international landing pages that route users based on browser settings or IP.
Common misuse patterns:
- Pointing x-default at the English version when there's no language selector on that page
- Adding x-default to every page in the set rather than just the gateway page
- Omitting x-default entirely on international landing pages where it's actually needed
The result of misuse is rarely catastrophic — but it means Google has less signal about your internationalisation intent, and the international gateway page may not get treated as such.
Canonical conflicts
The interaction between hreflang and canonicals is where things get genuinely complicated. A page's canonical must point to itself for hreflang to be trusted. If page A canonicals to page B, any hreflang annotations on page A are ignored — Google defers to the canonical version.
This becomes a real problem when:
- Paginated pages canonicalise to the first page in the series instead of to themselves
- Tracking parameters create canonical chains
- A CMS automatically adds cross-domain canonicals for syndicated content
Before diagnosing a hreflang problem, always check the canonical chain. A broken canonical is often masquerading as a hreflang problem.
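As a rough sketch of that canonical check: flag every page that carries hreflang annotations but whose canonical points somewhere other than itself. The data shape, function names, and URLs below are assumptions for illustration, and the normalisation is deliberately light (scheme and host case, fragment removal) so that tracking parameters still count as a different URL.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise(url):
    """Lowercase scheme and host and drop the fragment; keep path and query."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def find_canonical_conflicts(pages):
    """Return (url, canonical) pairs for annotated pages that canonicalise elsewhere.

    `pages` maps URL -> {"canonical": str or None, "hreflang": dict}.
    This shape is hypothetical, chosen for the sketch.
    """
    conflicts = []
    for url, data in pages.items():
        canonical = data.get("canonical")
        if not data.get("hreflang") or canonical is None:
            continue  # nothing to conflict with
        if normalise(canonical) != normalise(url):
            conflicts.append((url, canonical))
    return conflicts


pages = {
    # Self-canonical with different host casing: not a conflict.
    "https://example.com/fr-fr/": {
        "canonical": "https://Example.com/fr-fr/",
        "hreflang": {"fr-FR": "https://example.com/fr-fr/"},
    },
    # Tracking parameter variant canonicalising to the clean URL: a conflict.
    "https://example.com/fr-fr/?utm_source=mail": {
        "canonical": "https://example.com/fr-fr/",
        "hreflang": {"fr-FR": "https://example.com/fr-fr/"},
    },
}

conflicts = find_canonical_conflicts(pages)
```

Whether a flagged page is a real problem depends on context. A parameterised URL canonicalising to the clean version is usually intended, but its hreflang annotations are wasted, and anything in the cluster that points at the parameterised URL is pointing at a non-canonical page.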
Implementation method: HTML vs sitemap vs HTTP headers
All three implementation methods are supported according to Google's documentation. In practice:
- HTML <link> tags in <head>: most common, fine for sites under a few thousand URLs. Gets unwieldy as the language matrix grows.
- XML sitemap: better for large sites. Centralised, easier to maintain, doesn't bloat individual page <head> elements. Requires the sitemap to be crawlable and submitted.
- HTTP headers: for PDFs, images, and non-HTML resources. Don't use this for HTML pages when you have other options.
Choose one method and apply it consistently across the entire site. Mixing HTML tags and sitemap annotations for different sections of the same site creates ambiguity. I've seen Google appear to partially process both — which is worse than picking one and getting it right.
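For the sitemap route, generating the annotations from a single source of truth is what keeps clusters reciprocal. A minimal sketch using Python's standard library follows; the cluster data and function name are hypothetical, and it assumes each cluster is a dictionary of hreflang value to URL. Every URL in a cluster gets its own entry listing all alternates, itself included.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(clusters):
    """Serialise clusters ({hreflang: url} dicts) as a sitemap with xhtml:link alternates."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for cluster in clusters:
        # dict.fromkeys de-duplicates URLs that appear under two hreflang
        # values, for example a page that is both en and x-default.
        for page_url in dict.fromkeys(cluster.values()):
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = page_url
            for hreflang, href in cluster.items():
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": hreflang, "href": href},
                )
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical two-page cluster.
clusters = [
    {
        "en-GB": "https://example.com/en-gb/pricing",
        "fr-FR": "https://example.com/fr-fr/tarifs",
    }
]

sitemap_xml = build_hreflang_sitemap(clusters)
```

Because every entry is generated from the same cluster definition, a page cannot list an alternate that fails to list it back. That is the main practical argument for the sitemap method on large sites.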
Scale makes everything harder
All of these problems are manageable on a 10-language, 5-market site. On a site with 50 markets, 15 languages, and 200,000 pages — the kind of site I regularly work on — they become systemic. Template errors affect millions of annotations simultaneously. A single CMS misconfiguration can silently break an entire language cluster for weeks.
At that scale, manual auditing isn't viable. You need automated monitoring: crawl-based checks for return tag completeness, canonical conflict detection, and language code validation — run on a schedule, not a one-off basis.
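Any crawl-based check starts with pulling the canonical and hreflang annotations out of each fetched page. Here is a minimal sketch of that extraction step using only the standard library. The class and function names are my own, fetching and scheduling are left out, and it reads link elements anywhere in the document rather than restricting itself to the head.

```python
from html.parser import HTMLParser


class LinkAnnotationExtractor(HTMLParser):
    """Collect rel=canonical and rel=alternate hreflang links from HTML."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.hreflang = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if not href:
            return
        if "canonical" in rel_tokens:
            self.canonical = href
        elif "alternate" in rel_tokens and attributes.get("hreflang"):
            # Keep the value exactly as written so case issues stay visible.
            self.hreflang[attributes["hreflang"]] = href


def extract_annotations(html):
    """Return (canonical URL or None, {hreflang value: href}) for one HTML document."""
    parser = LinkAnnotationExtractor()
    parser.feed(html)
    return parser.canonical, parser.hreflang


# Hypothetical page source for illustration.
sample_html = """
<html><head>
<link rel="canonical" href="https://example.com/en-gb/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/" />
<link rel="alternate" hreflang="fr-fr" href="https://example.com/fr-fr/" />
</head><body></body></html>
"""

canonical, hreflang = extract_annotations(sample_html)
```

Run on a schedule across a crawl, output like this gives you the raw material for return tag, canonical, and language code checks.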
The good news is that hreflang errors are fixable. The bad news is that they don't announce themselves. The only way to know your implementation is healthy is to check it regularly.