You may have read a few SEO articles that mention canonicalization. But do you know what it is and why you should care?
If not, then read on. It’s an important concept in SEO.
In fact, if you don’t canonicalize your web pages, you’ll likely limit your online reach.
The Definition
Canonicalization is an optimization technique that tells search engines which URL is the preferred (canonical) version of a page, so they don’t register the same content more than once.
Remember: duplicate content is a big no-no. If Google thinks you’re copying somebody else’s content, your website’s rank can suffer.
In cases that Google considers deliberately deceptive, your site might even disappear from the search results.
But it’s not just about plagiarizing. You also don’t want to duplicate your own content.
Why? Because search engines get confused about which page to index. Search bots don’t want to index multiple URLs that have the same content, since that wastes crawl time and fills the search results with repeats.
So when you implement canonicals on your website, you’re making it easier for search engines to eliminate duplicate content and index the “right” pages.
That’s a win for both you and them.
Duplicate Content on My Own Site?
After reading the previous section, you might be thinking to yourself: “I don’t have duplicate content on my site. I don’t need to read any more about canonicalization.”
Are you sure about that?
In fact, you might have duplicate content and not even know it.
For example, if your website works with both the www prefix and without the www prefix, then you could have two URLs with the same content:
- https://mysite.com/my-content
- https://www.mysite.com/my-content
To search engines, those are two different URLs with the same content. That means you have duplicate content on your site.
Or maybe you’re using UTM parameters to track how people found your site. Take a look at these two URLs, for example:
- https://mysite.com/my-content
- https://mysite.com/my-content?utm_source=facebook
Once again, two URLs point to the same content.
If you’re running an ecommerce site, it’s really easy to have two or more URLs pointing to identical product detail or category pages. That’s because ecommerce engines make frequent use of request parameters. Consider these two URLs that reference the same content:
- https://myecommercesite.com/white-t-shirt
- https://myecommercesite.com/white-t-shirt?size=large
In all of the cases mentioned above, the best thing to do is to use canonicals to point to the “correct” web page.
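To make the idea concrete, here is a minimal Python sketch that maps each of the duplicate URLs above to one "correct" URL. It assumes the non-www host is the preferred one and that the size parameter does not change the page content. The function name canonical_url and the NON_CONTENT_PARAMS set are made up for illustration. Your own site may need different rules.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters that do not change the page content.
NON_CONTENT_PARAMS = {"size"}


def canonical_url(url: str) -> str:
    """Return an assumed canonical form of `url`.

    Assumptions: the non-www host is preferred, and tracking parameters
    (anything named "utm" or starting with "utm_") can be dropped.
    """
    parts = urlsplit(url)

    # Drop the www prefix so both hostnames map to the same URL.
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # Keep only the query parameters that actually change the content.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not (key == "utm" or key.startswith("utm_") or key in NON_CONTENT_PARAMS)
    ]

    # The fragment is dropped as well, since it never reaches the server.
    return urlunsplit((parts.scheme, host, parts.path, urlencode(kept), ""))
```

Calling canonical_url("https://www.mysite.com/my-content") gives back the non-www address, and the URLs with tracking or size parameters lose their query strings.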
How Does It Work?
There are a few different ways to implement canonical URLs.
First, Google Search Console used to let you specify a preferred domain. That was pretty easy to do and it covered the entire site.
The downside to that approach was that it only solved domain-specific issues, like the www vs. non-www URLs I covered in the previous section.
Also, it only worked for Google, and Google retired the setting in 2019.
Another approach is to send rel=canonical in an HTTP Link header. That involves adding a header to the HTTP response, which also works for files that aren’t HTML, such as PDFs.
That’s a more involved answer to the problem, though. Setting response headers requires access to your web server or application configuration.
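For the curious, the header itself is short. The sketch below builds it in Python and attaches it to a response in a tiny WSGI app. The app, its name pdf_app, and the placeholder body are hypothetical. Only the Link header format is the part that matters.

```python
def canonical_link_header(canonical_url: str) -> tuple[str, str]:
    """Build the (name, value) pair for a rel=canonical Link header."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def pdf_app(environ, start_response):
    """Tiny WSGI app (hypothetical) that serves a file with a canonical header."""
    headers = [
        ("Content-Type", "application/pdf"),
        # Assumed canonical URL, taken from the examples in this article.
        canonical_link_header("https://mysite.com/my-content"),
    ]
    start_response("200 OK", headers)
    return [b"%PDF-1.4 placeholder body"]
```

The response then carries a line like Link: <https://mysite.com/my-content>; rel="canonical" alongside its other headers.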
Yet another option is to use 301 redirects. That’s a “forward” that sends visitors from one URL to another.
So if somebody visits https://www.mysite.com/my-content, that person will get redirected to https://mysite.com/my-content.
However, it’s best to go that route only if you’re moving a web page from one address to another. It’s not ideal for canonicalization.
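If you’re wondering what a redirect rule boils down to, here is a rough Python sketch of the decision a server makes. It assumes mysite.com is the preferred host. PREFERRED_HOST and redirect_for are names invented for this example, and a real site would normally configure this in the web server rather than in application code.

```python
from typing import Optional

# Assumption: the non-www host is the one you want visitors to land on.
PREFERRED_HOST = "mysite.com"


def redirect_for(host: str, path: str) -> Optional[tuple[int, str]]:
    """Return (301, target URL) if the request should be redirected, else None."""
    if host.lower() == "www." + PREFERRED_HOST:
        # 301 means "moved permanently": send the visitor to the preferred host.
        return (301, f"https://{PREFERRED_HOST}{path}")
    # Already on the preferred host, so serve the page normally.
    return None
```

A request for www.mysite.com/my-content comes back as a 301 pointing at https://mysite.com/my-content, while a request to mysite.com itself is left alone.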
The most common approach for canonicalization involves adding the rel=canonical tag to the <head> section of the HTML document.
What Is a rel=canonical Tag?
It looks funny, doesn’t it? That’s because it’s the easiest way to abbreviate the tag when writing about it.
Simply put, the rel=canonical tag is an HTML element that tells search engines about the original source of content. You can think of it as a citation.
When you add a rel=canonical tag, you’re telling search bots, “This isn’t the original content. The original content is located at this other URL.”
As a result, the search engines won’t treat that content as duplicate. You’ll avoid the ranking problems that duplicate content causes.
By the way, here’s what the tag looks like:
<link rel="canonical" href="https://mysite.com/my-content"/>
As you can see, it’s a <link> element with the rel attribute set to “canonical.” The href attribute is set to the URL of the original page.
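If your pages come from templates or scripts, you can generate the tag instead of typing it by hand. Here is a small Python sketch. The function name canonical_link_tag is made up for this example. The one detail worth noticing is that the URL gets HTML-escaped, so an ampersand in a query string doesn’t break the markup.

```python
from html import escape


def canonical_link_tag(url: str) -> str:
    """Return a rel=canonical <link> element pointing at `url`."""
    # Escape &, <, > and quotes so the URL is safe inside the href attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}"/>'
```

The result belongs inside the page’s <head> element, next to the other <link> elements.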
How Search Bots Treat the rel=canonical Tag
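When search bots see a rel=canonical tag that points to a different URL, they usually don’t index the page. (Google treats the tag as a strong hint rather than a command.)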
Isn’t that bad? No.
That’s because search bots instead index the page identified by the href attribute. In the example above, search bots would index https://mysite.com/my-content.
So your correct page gets indexed. The page with the duplicate content doesn’t get indexed.
That is exactly what you want.
It also happens to be what the search engines want.
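Here is a deliberately simplified Python model of that behavior. It assumes a bot that honors every rel=canonical tag it finds, which real search engines don’t promise to do. The function name group_by_canonical and the input format are invented for this sketch.

```python
def group_by_canonical(declared: dict[str, str]) -> dict[str, list[str]]:
    """Group crawled URLs under the canonical URL each one declares.

    `declared` maps a crawled URL to the href of its rel=canonical tag.
    The keys of the result are the URLs that get indexed. The values are
    the duplicate URLs that get folded into them.
    """
    groups: dict[str, list[str]] = {}
    for url, canonical in declared.items():
        duplicates = groups.setdefault(canonical, [])
        if url != canonical:
            # This page points elsewhere, so it is not indexed on its own.
            duplicates.append(url)
    return {canonical: sorted(dupes) for canonical, dupes in groups.items()}
```

Feed it the www, non-www, and UTM versions of a page that all declare https://mysite.com/my-content as their canonical, and you get one indexed URL with the other two listed as duplicates.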
Now What?
Now that you know about canonicalization and how it works, should you use it on your website?
In a word: yes.
As we’ve seen, it’s best to implement canonicalization with the rel=canonical tag.
Keep in mind: if you have thousands of pages on your website, then you’ll need to update thousands of pages.
But that might be easier than you think.
Adding Canonicalization With Yoast SEO
Chances are pretty good that you’re using WordPress as your content management system (CMS) of choice. If that’s the case, then it’s a snap to add canonicalization to your site.
Just download the Yoast SEO plugin and install it on your site.
And that’s pretty much it.
Really. That’s all you need to do.
Why? Because Yoast is smart enough to add canonicals to all your web pages.
If you want proof, just visit your website and right-click on the page. Then, select “View Source” from the context menu that appears.
Use Ctrl+F to search for text. Search for “<link”.
You should see something like this:
<link rel="canonical" href="https://mysite.com/my-content"/>
Note: you may have to pass a few other <link> elements before you get to the rel=canonical.
When you see the rel=canonical on your web page, that’s proof that Yoast added it for you.
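If you’d rather not hunt through the page source by hand, a short script can do the same check. The Python sketch below uses only the standard library and returns the first rel=canonical URL it finds in a chunk of HTML. The class and function names are made up for this example, and fetching the page is left out on purpose.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Skip everything that is not a <link>, or stop once we have a match.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html_text: str) -> Optional[str]:
    """Return the canonical URL declared in `html_text`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical
```

Paste in the HTML you copied from the page source and find_canonical returns the canonical URL, or None if the tag is missing.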
In fact, Yoast will add it to all your pages. You don’t have to do anything else.
However, Yoast does give you the ability to customize the canonical. To do that, just edit your page or post and scroll down to below the content.
In the Yoast SEO section, click on the Settings icon. It’s shaped like a gear.
Scroll down to the bottom of the tab and you’ll see a field where you can enter the canonical URL. Put the correct URL in there and click the Update button in the left-hand sidebar.
Yoast will use the URL you entered as the canonical instead of the default URL.
Wrapping It Up
It’s important to understand canonicalization if you’re serious about optimizing your website for search. Otherwise, you could limit your site’s visibility in the search results.
Once you do understand it, it’s best to use canonicalization on all your pages to prevent duplicate content issues. Fortunately, tools like Yoast SEO make it easy to do that.