How to overcome common errors when implementing the Hreflang tag

What makes SEO so exciting for me is that there’s always something new to learn, implement or test. Working with a huge variety of clients means we often come across something we have less experience with, but this is how we learn and develop new skills.

Recently, with some of our larger international clients, we’ve been focusing on implementing or fixing hreflang tags. These are very useful for multinational sites. Our Account Manager Tom Armenante recently wrote a simple guide to hreflang tags and how to implement them, which is a great place to start.

Hreflang tags tell Google how the different language and regional versions of your content relate to one another. They work a little like a canonical tag, pointing Google to the translated or localised page that should appear for searchers in a specific language or location.

However, since there are three ways to implement them (HTML link elements, HTTP headers and XML sitemaps), a lot of things can go wrong. This article aims to identify the error messages and issues you could run into when implementation doesn’t go to plan, because when it does work, it really can work wonders.

The basics you need to get right are the country and language codes; one of the first rules you must remember is that you always need a language code. So for example, you can specify both language and country:

[Image: example tag specifying both language and country, e.g. hreflang="en-gb"]

These results will be served to users in Great Britain who are searching in English. You can also specify language:

[Image: example tag specifying language only, e.g. hreflang="en"]

These results will be returned in any country where the query has been made in English. You can even have a generic English language result and a country-specific page for the same language:

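[Image: example tags for a country-specific page and a generic English page, e.g. hreflang="en-gb" and hreflang="en"]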

This will return one page for English searchers in Great Britain, and a different “Global Page” for any English query that is done outside of Great Britain. You could even elaborate on this more extensively:

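[Image: example tags for Australian, British and global English pages, e.g. hreflang="en-au", hreflang="en-gb" and hreflang="en"]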

This would return the Australian page for English-based queries in Australia, the British page for English-based queries in Great Britain, and the global page for English-based queries from outside Australia and Great Britain. Of course, this will then go on to work in every other language you apply it to.

What won’t work, though, is simply using a country code. Where you can only use one code, it must be a language code, so the example below will not work:

[Image: example tag using only a country code, which will not work]

In cases where the language and country code are the same, the hreflang tag will denote the language, not the country.

[Image: example tag where the code denotes the language, not the country, e.g. hreflang="es"]

The above example will work, but it means that this is the alternative page for any Spanish-based queries worldwide. It is not denoting that this is the alternative for country-specific Spanish searchers.
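The rules above can be sketched as a small check. This is only an illustration: the function name check_hreflang is our own, and the two code lists are tiny hand-picked subsets rather than the full ISO language and country lists, so treat it as a starting point and not a complete validator. It only handles the two forms discussed above (language, or language plus country).

```python
# Tiny illustrative subsets (assumption): NOT the full ISO 639-1 / ISO 3166-1 lists.
LANGUAGES = {"en", "es", "el", "fr", "de"}
COUNTRIES = {"gb", "au", "es", "gr", "fr", "de", "us"}


def check_hreflang(value):
    """Return (is_valid, reason) for a hreflang value such as 'en' or 'en-gb'."""
    parts = value.strip().lower().split("-")
    if len(parts) > 2:
        return False, "expected 'language' or 'language-country'"

    # The first part is always read as a language, never as a country.
    language = parts[0]
    if language not in LANGUAGES:
        return False, f"'{language}' is not a known language code"

    if len(parts) == 1:
        return True, f"language '{language}' in any country"

    country = parts[1]
    if country not in COUNTRIES:
        return False, f"'{country}' is not a known country code"
    return True, f"language '{language}' targeted at country '{country}'"


for value in ["en-gb", "en", "es", "gb", "gb-en", "gr"]:
    print(value, check_hreflang(value))
```

Note that "es" passes because it is read as the Spanish language, while "gb" and "gr" fail because they only exist as country codes.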

Another thing to look out for is whether the country or language code you are using actually exists. The language code for Greek, for example, is el rather than gr, so if you have an alternative page in Greek aimed at Greece, the correct hreflang tag is:

<link rel="alternate" href="http://www.example.com/" hreflang="el-gr" />

We would always recommend that you check that you have the basics right and one way to do that is by using this tool.

Once your hreflang tags are live in your pages’ link elements, there are a few things you can do to check that they are correct. Firstly, if you want to examine this on a page level, the flang tool can tell you whether or not the links you are pointing to are live. You can also use tools like the w3.org validator to check that the HTML is all above board.

Another method of checking HTML link element hreflang implementation is by using crawlers. DeepCrawl is a new tool that we have recently started using; it gathers a lot of information and is also useful for extracting information that other crawlers don’t. When setting up a new crawl you have a variety of options, but for this we need to use the custom extraction regex.

[Image: DeepCrawl custom extraction settings]

Filling it in as in the example above will get you a list of the URLs being pointed at as well as the equivalent language and country codes. This list will be in a format such as:

[Image: extracted list of URLs with their language and country codes]

This will allow you to perform two checks. Firstly, you can ensure that all of the URLs you are pointing to are live. You can then manually check that each URL matches up with the country it is supposed to target.

Depending on your URL structure, you can even use the replace function in Excel to match every single link up perfectly in order to check that not only are the URLs live and country codes correct, but that the links you are pointing to are the right alternative pages too.
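If you don’t have a crawler to hand, a similar list can be pulled out of a single page’s HTML with a short script. The sketch below uses Python’s built-in HTML parser rather than a regex; the class name, function name and sample page are our own inventions for illustration, not anything DeepCrawl produces.

```python
from html.parser import HTMLParser


class HreflangCollector(HTMLParser):
    """Collect (hreflang, href) pairs from <link rel="alternate"> elements."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here by default.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if rel == "alternate" and attributes.get("hreflang"):
            self.alternates.append((attributes["hreflang"], attributes.get("href")))


def extract_hreflang(html):
    """Return every (hreflang, href) pair found in the given HTML source."""
    collector = HreflangCollector()
    collector.feed(html)
    return collector.alternates


# Hypothetical page source, for illustration only.
SAMPLE_HTML = """
<html><head>
<link rel="alternate" href="http://www.example.com/" hreflang="en" />
<link rel="alternate" href="http://www.example.com/uk/" hreflang="en-gb" />
<link rel="stylesheet" href="/style.css" />
</head><body></body></html>
"""

for code, url in extract_hreflang(SAMPLE_HTML):
    print(code, url)
```

The output is one code and URL per line, which can be pasted into Excel for the same checks described above.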

When it comes to an XML sitemap, however, you can’t use crawlers. Pasting the sitemap into Excel and removing all of the unnecessary XML markup will work just as well as the method described above for DeepCrawl. However, since the sitemap can be more complicated, there are a variety of other things that can go wrong.

Firstly, check your XML is correct. Either paste your sitemap’s URL, or copy and paste the sitemap itself, into the W3Schools XML validator. This will tell you specifically where and what is wrong with your code. The next place to look for issues is Webmaster Tools. Here, Google flags a variety of problems with your sitemap and explains some of them.

The main issue that we see is a namespace not being declared. In Google Webmaster Tools the error would look something like this:

[Image: Webmaster Tools error message, “Incorrect namespace”]

This message shows that you haven’t declared the namespace you are going to be using, and therefore Google won’t understand the elements that aren’t part of the standard sitemap XML, such as hreflang tags.

A good analogy is asking Google to read a different language without giving it a dictionary. The namespace is the dictionary it needs, and there are different dictionaries depending on what you are trying to read. When it comes to hreflang tags, the correct namespace declaration is the following:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">

Google gives the same example here. Also, bear in mind that if you are including anything else in your sitemap such as video or mobile versions of the website, you will have to declare those namespaces separately.
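You can see the effect of a missing namespace declaration locally before Google ever reads the sitemap. The sketch below feeds the same hypothetical one-entry sitemap to Python’s built-in XML parser twice, once with the xhtml namespace declared and once without. The function name and sample URLs are assumptions for illustration, and the parser’s error wording is not the same as the message Webmaster Tools shows.

```python
import xml.etree.ElementTree as ET

URLSET_WITH_NS = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" '
    'xmlns:xhtml="http://www.w3.org/1999/xhtml">'
)
URLSET_WITHOUT_NS = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'

# Hypothetical single-entry sitemap body, for illustration only.
BODY = (
    "<url><loc>http://www.example.com/</loc>"
    '<xhtml:link rel="alternate" hreflang="en-gb" href="http://www.example.com/uk/" />'
    "</url></urlset>"
)


def parses_cleanly(sitemap_xml):
    """Return True if the XML is well formed, including its namespace prefixes."""
    try:
        ET.fromstring(sitemap_xml)
        return True
    except ET.ParseError as error:
        # An undeclared xhtml: prefix is reported here as a parse error.
        print("XML error:", error)
        return False


print(parses_cleanly(URLSET_WITH_NS + BODY))
print(parses_cleanly(URLSET_WITHOUT_NS + BODY))
```

The first sitemap parses and the second does not, because the parser has no "dictionary" for the xhtml: prefix.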

Another issue you might come across is an invalid attribute error in Webmaster Tools; it looks something like the example below:

[Image: Webmaster Tools error message, “Invalid attribute value”]

An attribute value is what goes in between the quotation marks, after the attribute itself. So if your attribute is hreflang, a valid attribute value could be either “en” or “en-gb”. This sort of error would be returned if the attribute value was something like “gb” or “gb-en”. The same error could also be returned if the attribute value of rel was misspelled, so that it wasn’t “alternate”, or if the attribute value of href was not a valid URL.
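As a rough local check for the rel and href problems, the sketch below walks the xhtml:link elements in a sitemap and reports any whose rel is not “alternate” or whose href is not an absolute http(s) URL. The function name and the sample sitemap (with two deliberate mistakes) are hypothetical, and this is our own approximation rather than the validation Google itself runs.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

XHTML_LINK = "{http://www.w3.org/1999/xhtml}link"

# Hypothetical sitemap with two deliberate mistakes, for illustration only.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://www.example.com/</loc>
    <xhtml:link rel="alternate" hreflang="en-gb" href="http://www.example.com/uk/" />
    <xhtml:link rel="alternative" hreflang="en" href="http://www.example.com/" />
    <xhtml:link rel="alternate" hreflang="en-au" href="www.example.com/au/" />
  </url>
</urlset>"""


def attribute_problems(sitemap_xml):
    """Return a list of human-readable problems with rel and href values."""
    problems = []
    for link in ET.fromstring(sitemap_xml).iter(XHTML_LINK):
        rel = link.get("rel")
        href = link.get("href") or ""
        if rel != "alternate":
            problems.append(f"rel is '{rel}', expected 'alternate'")
        parsed = urlparse(href)
        # Require a scheme and a host, i.e. a fully qualified URL.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"href '{href}' is not an absolute URL")
    return problems


for problem in attribute_problems(SAMPLE_SITEMAP):
    print(problem)
```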

Another issue is duplicating tags. This is highlighted with the below error message:

[Image: Webmaster Tools error message, “Invalid XML”]

This error arises when a tag that can’t be duplicated is present twice within a <url> declaration. The example Google gives above shows how two <loc> tags within the same <url> declaration tell Google that the content is available through two separate URLs in two separate locations.

You could also have forgotten a simple tag:

[Image: Webmaster Tools error message, “Missing tag”]

The XML validators mentioned before should pick this up and tell you exactly what line it is on.
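Both the duplicated and the missing <loc> cases can also be caught with a few lines of script. This is a minimal sketch, assuming a standard sitemap namespace and using a made-up sample in which the second entry has two <loc> tags and the third has none; the function name is our own.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Hypothetical sitemap: entry 2 has two <loc> tags, entry 3 has none.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
  </url>
  <url>
    <loc>http://www.example.com/uk/</loc>
    <loc>http://www.example.com/au/</loc>
  </url>
  <url>
  </url>
</urlset>"""


def find_loc_problems(sitemap_xml):
    """Return (entry_number, loc_values) for <url> entries without exactly one <loc>."""
    root = ET.fromstring(sitemap_xml)
    problems = []
    for number, url in enumerate(root.findall(f"{SITEMAP_NS}url"), start=1):
        locs = [loc.text for loc in url.findall(f"{SITEMAP_NS}loc")]
        if len(locs) != 1:
            problems.append((number, locs))
    return problems


for number, locs in find_loc_problems(SAMPLE_SITEMAP):
    print(f"<url> entry {number} has {len(locs)} <loc> tags: {locs}")
```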

There are a few other simple errors you could make, such as pointing Webmaster Tools to the wrong URL or blocking your own sitemap from being accessed through your robots.txt file. For a full list of messages Google Webmaster Tools flags up, you can find it here, but a lot of these do not give hreflang-specific examples.

Hopefully, that covers the most common errors – but if there’s anything else at all you’re having trouble with then please get in touch or leave a comment below and we’ll be happy to help out.
