A Beginner's Introduction to the Canonical Tag
The canonical tag has been supported by major search engines since early 2009, but despite the benefits it offers to both site operators and users, many site operators have yet to implement it. This reluctance can often be attributed to a lack of knowledge regarding the tag, its purpose, and its implementation.
In this article we will examine all three, along with a few common mistakes made in implementing the tag.
What is the Canonical Tag?
The canonical tag is all about duplicate content and preferred content. "Canonical" is a rather unusual word, but etymologically speaking, it is appropriate. It is derived from canon, which originally referred to biblical or secular rules and laws, a standard for judgement. Later it was used to refer to the works of a writer that had been accepted as authentic.
This last meaning is the one that carries over to the Internet, SEO and search engines: the canonical tag helps search engines identify which page is the original when the same content is available at more than one URL.
There are many legitimate reasons for duplicate content, particularly when it comes to system generated URLs. These include:
- Multiple URLs – particularly on eCommerce sites where URLs are created through filter options for price, colour, rating, etc.
- Session ID URLs – automatically generated by your system. The same applies to tracking URLs, breadcrumb links, printer-friendly versions, and permalinks in certain CMSs.
- HTTP, HTTPS & WWW – search engines see http://www.mydomain.com, http://mydomain.com and https://www.mydomain.com as distinct pages, and will crawl (and possibly index) them as such.
- Case – users tend to treat upper and lower case as interchangeable, and the domain part of a URL is indeed case-insensitive. The path is not necessarily, so search engines may treat /Page.html and /page.html as two separate URLs. If your website mixes case in filenames and folder structure, you need to use the canonical tag.
- Mobile URL – when using a special URL (typically m.mydomain.com) for the mobile version of your website.
- Country URL – when using multiple country specific URLs, the content largely remains the same, with only a few minor differences. This does not apply if the language is different, in which case you want the search engines to return separate results.
In most of these instances the content isn’t actually duplicated; these are multiple URLs serving up the same content. True duplication is when the actual content appears on multiple unique URLs (www.mydomain.com versus www.someotherdomain.co), often as a result of content syndication.
The canonical tag should be used in all these instances to tell search engines which is the original content, and which URL should be crawled, indexed and returned on SERPs.
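To make the idea of "many URLs, one preferred URL" concrete, here is a minimal sketch that maps several of the variants listed above onto a single URL. The preferred scheme and host, the list of ignored parameters, and the decision to lowercase paths are all assumptions chosen for illustration; your own site's policy may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical policy for this sketch: prefer https + www, lowercase paths,
# and drop parameters that do not change the content of the page.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.mydomain.com"
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign"}


def preferred_url(url: str) -> str:
    """Map a URL variant onto the single URL chosen as canonical."""
    parts = urlsplit(url)
    # Keep only the query parameters that are not in the ignore list.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    path = parts.path.lower() or "/"
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, urlencode(query), ""))


variants = [
    "http://mydomain.com/Shoes/?sessionid=abc123",
    "http://www.mydomain.com/shoes/?utm_source=newsletter",
    "https://www.mydomain.com/shoes/",
]
canonical_urls = {preferred_url(u) for u in variants}
```

All three variants collapse to one entry in `canonical_urls`, which is the URL you would then declare in the canonical tag on each of them.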
Use of the canonical tag is not mandatory, and Matt Cutts has previously stated that duplicate content rarely results in a penalty unless it is spam or is being used to manipulate rank. Google mostly filters out the duplicates to avoid cluttering a SERP with the same result from different URLs. Where it does not, the penalty you may suffer is a lower overall rank for all of the duplicated content.
How to Correctly Use the Canonical Tag
The first thing to do when using the canonical tag is to decide which URL is the preferred one, and then add the following markup to the <head> section of the preferred URL and all its variants:
<link rel="canonical" href="http://www.yourdomain.com/your-preferred-url/" />
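NOTE: both relative and absolute paths are acceptable, but an absolute URL is the safer choice, because a mistyped relative path can resolve to a URL you did not intend.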
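If your pages are generated by your own code rather than a CMS, the tag can be built programmatically. The following is a minimal sketch, not a definitive implementation: the function name `canonical_link` is hypothetical, and it resolves whatever path you pass against the page's own URL so that the emitted href is always absolute.

```python
from html import escape
from urllib.parse import urljoin


def canonical_link(page_url: str, preferred: str) -> str:
    """Return a <link rel="canonical"> element with an absolute href."""
    # urljoin resolves a relative `preferred` against the page's own URL
    # and leaves an already-absolute URL unchanged.
    absolute = urljoin(page_url, preferred)
    return f'<link rel="canonical" href="{escape(absolute, quote=True)}" />'


tag = canonical_link(
    "http://www.yourdomain.com/shop/?colour=red&size=9",
    "/your-preferred-url/",
)
```

Here `tag` holds the same element shown above, ready to be written into the <head> of the filtered page.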
The good news is that most CMS have this built in, or at the very least have plugins available that automate most of the process. If you are not using a CMS, Google (and other search engines) have made it a little easier to manage certain processes, including:
- Selective use of 301 redirects.
- Using Google Webmaster Tools to specify how to handle specific URL parameters.
- Adding an HTTP Link header with rel="canonical" using PHP and/or .htaccess, which also works for files that have no <head>, such as PDFs.
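The HTTP header option in the list above amounts to sending a Link header that carries the same information as the tag. The article mentions PHP and .htaccess; the sketch below only shows the shape of the header value, in Python, with a hypothetical helper name and an example PDF URL that is an assumption.

```python
def canonical_link_header(preferred_url: str) -> tuple[str, str]:
    """Build the HTTP header equivalent of <link rel="canonical">."""
    # Angle brackets delimit the target URL in a Link header value.
    return ("Link", f'<{preferred_url}>; rel="canonical"')


header = canonical_link_header("http://www.yourdomain.com/downloads/white-paper.pdf")
```

The resulting name and value pair would be added to the response for each duplicate version of the file.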
If your site serves the same content over both http and https, with or without the www prefix, you may want to consider setting a preferred domain via Webmaster Tools, or using a 301 redirect.
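A 301 redirect to the preferred domain can be sketched as a small decision function. This is an illustration under stated assumptions: the preferred scheme and host are made up, and `respond` is a hypothetical stand-in for whatever your web server or framework actually uses to send a response.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

PREFERRED_SCHEME = "https"           # assumption for this sketch
PREFERRED_HOST = "www.mydomain.com"  # assumption for this sketch


def redirect_target(requested_url: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(requested_url)
    if parts.scheme == PREFERRED_SCHEME and parts.netloc.lower() == PREFERRED_HOST:
        return None
    # Keep the path and query, swap in the preferred scheme and host.
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path or "/", parts.query, "")
    )


def respond(requested_url: str) -> tuple[int, dict[str, str]]:
    """Return a (status code, headers) pair for the requested URL."""
    target = redirect_target(requested_url)
    if target is None:
        return 200, {}
    return 301, {"Location": target}
```

For example, `respond("http://mydomain.com/about?x=1")` yields a 301 with a Location header pointing at the https and www version of the same path, while a request that already uses the preferred scheme and host is served normally.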
Common Mistakes in Implementing the Canonical Tag
Setting the Home Page as the Preferred URL
There are times when your home page will be the preferred URL, but not many. If the canonical links on all your pages point to your home page, you risk having none of your pages, aside from the home page itself, crawled and indexed by the search engines.
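One way to spot this mistake is to compare each page's URL with the canonical URL it declares. The sketch below assumes you already have that mapping from a crawl of your own site; the `crawl` dictionary and the function names are hypothetical.

```python
from urllib.parse import urlsplit


def is_home(url: str) -> bool:
    """Treat a URL with an empty or "/" path and no query as the home page."""
    parts = urlsplit(url)
    return parts.path in ("", "/") and not parts.query


def pages_collapsed_onto_home(canonicals: dict[str, str]) -> list[str]:
    """List pages that are not the home page but declare it as canonical."""
    return [
        page
        for page, target in canonicals.items()
        if is_home(target) and not is_home(page)
    ]


# Hypothetical crawl result: page URL -> canonical URL found on that page.
crawl = {
    "http://www.mydomain.com/": "http://www.mydomain.com/",
    "http://www.mydomain.com/shoes/": "http://www.mydomain.com/",
    "http://www.mydomain.com/shoes/?colour=red": "http://www.mydomain.com/shoes/",
}
flagged = pages_collapsed_onto_home(crawl)
```

Only the /shoes/ page is flagged here, since it is a distinct page that names the home page as its canonical URL.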
Using Multiple Canonical Links
Each page must have only one canonical link specified in the <head>; if there are several, they are all likely to be ignored. This can happen without you even being aware of it, due to a faulty implementation of an SEO plugin or an improperly edited theme/template.
Placing rel=canonical in the <body>
As with multiple canonical links, if your canonical link appears anywhere but the <head> it is simply ignored. Ideally it should appear as early as possible in your <head> section to avoid any parsing issues.
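Both of the last two mistakes can be checked automatically. The following is a rough sketch using Python's standard `html.parser`; the class and function names are hypothetical, and it only tracks explicit <head> and </head> tags, so it will not behave exactly like a search engine's parser on malformed markup.

```python
from html.parser import HTMLParser


class CanonicalAudit(HTMLParser):
    """Record every rel="canonical" link and whether it sits inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_head_hrefs = []
        self.outside_head_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attr = dict(attrs)
            rels = (attr.get("rel") or "").lower().split()
            if "canonical" in rels:
                bucket = self.in_head_hrefs if self.in_head else self.outside_head_hrefs
                bucket.append(attr.get("href"))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit(html: str) -> list[str]:
    """Return a list of canonical-tag problems found in the page source."""
    parser = CanonicalAudit()
    parser.feed(html)
    problems = []
    if len(parser.in_head_hrefs) > 1:
        problems.append("multiple canonical links in <head>")
    if parser.outside_head_hrefs:
        problems.append("canonical link outside <head>")
    if not parser.in_head_hrefs:
        problems.append("no canonical link in <head>")
    return problems


page = """<html><head>
<link rel="canonical" href="http://www.yourdomain.com/a/" />
<link rel="canonical" href="http://www.yourdomain.com/b/" />
</head><body>
<link rel="canonical" href="http://www.yourdomain.com/c/" />
</body></html>"""
problems = audit(page)
```

For the sample page, `problems` reports both the duplicate links in the <head> and the stray link in the <body>.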
Using Canonical Links on Paginated Results
SEO can become a little complicated when it comes to splitting content over multiple pages, although it is necessary to do this at times.
When doing this it is advisable to use the rel=prev and rel=next tags instead of rel=canonical, so that each page in the series can be indexed rather than being folded into the first page. Alternatively, if you have also consolidated the content on a single “View All” page, you can point rel=canonical at that page instead.
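As an illustration of the rel=prev and rel=next approach, this sketch produces the link elements for one page in a series. The URL scheme (page 1 at the bare URL, later pages at ?page=N) is an assumption made for the example, as is the function name.

```python
from html import escape


def pagination_links(base_url: str, page: int, last_page: int) -> list[str]:
    """Return rel=prev / rel=next link elements for one page of a series."""

    def url_for(n: int) -> str:
        # Hypothetical URL scheme: page 1 is the bare URL, later pages use ?page=N.
        return base_url if n == 1 else f"{base_url}?page={n}"

    links = []
    if page > 1:
        href = escape(url_for(page - 1), quote=True)
        links.append(f'<link rel="prev" href="{href}" />')
    if page < last_page:
        href = escape(url_for(page + 1), quote=True)
        links.append(f'<link rel="next" href="{href}" />')
    return links


middle = pagination_links("http://www.yourdomain.com/guide/", 2, 3)
```

The first page of a series gets only a rel=next link and the last page only a rel=prev link, while a page in the middle, as in `middle` here, gets both.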
Using Canonical Links in Featured Articles
If your site includes a regularly updated “featured article” or “featured product” page, you should avoid pointing that page's rel=canonical at the item currently being featured. Doing so, or implementing the tag incorrectly in some other way, can result in the page being ignored and not showing up in search results.
Using Canonical Links Instead of 301 Redirects
Although on the surface a canonical link works much like a 301 redirect, the two differ in how they affect your traffic metrics. Both tell search engines to treat multiple pages (or URLs) as a single page, but a 301 redirect sends all traffic to one specific URL, whereas a canonical tag leaves visitors on the URL they requested.
If your site structure has changed, then a 301 redirect is the preferred option, since it will also correct bookmarks. If your site has duplicate content, but you need to measure traffic to each URL, use a canonical link for the benefit of the search engines.
Although a penalty for duplicate content is unlikely, having search engines index all of your duplicate content affects the relevancy of their results, and could also affect how your pages rank, ultimately affecting your site traffic and revenue. However, canonical tags are not a magic bullet that will automatically improve your site's visibility. If anything, implementing them incorrectly could harm your overall rank.
If you haven’t introduced canonical tags to your site yet, you must first carefully consider the actual need (do you have duplicate content, or multiple URLs pointing to the same content?), before drawing up a strategy on how to implement, and on what pages. If you have already introduced canonical tags, it may help to revisit the implementation to ensure it was all done correctly, and that you aren’t hiding any pages that should be indexed.