Redirects using HTTP 301 headers
This is Part 1 of 2. Jump to Part 2 to get code examples of 301 redirection.
Online, cool links don't change. However, pages move all the time on the web, and although there are many ways to redirect a web page, only one of them is search engine friendly. This tutorial explains the background in detail and then tells you how redirection should be done: using 301 redirects.
At the end, you should have learned enough about this side of how the internet works to understand what you are doing and why you are doing it. Yes, it will get technical, but there is no point in learning only a little bit of this important lesson.
Canonical domains and duplicate content
What are canonical domains? They are more commonly known as subdomains, and are defined as part of the domain name system (DNS). For example, in ftp.domain.com, the 'ftp' part is the subdomain. The DNS entry for domain.com shows that it resolves to a certain IP address: that is, domain.com is hosted on the computer with that IP address.
Canonical domains are usually used to provide aliases to a domain name. To take our example again, ftp.domain.com would typically also resolve to the same IP address as domain.com (after all, you upload a website to where it is hosted). Other examples include mail.domain.com, ssh.domain.com, and, most importantly for us, www.domain.com.
However, these are not the only uses. Google, for example, uses subdomains for each of their services: print.google.com, gmail.google.com (which forwards somewhere else), and others. Thus subdomains can be completely different websites.
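To see the DNS side of this for yourself, here is a minimal Python sketch that compares the IP addresses two host names resolve to. The function names (resolve, share_an_address) are my own, made up for illustration, and the sketch only looks at IPv4 addresses.

```python
import socket


def resolve(hostname):
    """Return the set of IPv4 addresses a host name resolves to."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def share_an_address(host_a, host_b, resolver=resolve):
    """True if both host names resolve to at least one common IP address.

    The resolver parameter is an assumption of this sketch: it lets you
    swap in a fake lookup table instead of querying the real DNS.
    """
    return bool(resolver(host_a) & resolver(host_b))
```

Calling share_an_address("domain.com", "www.domain.com") with your own domain tells you whether the www alias points at the same machine. Keep in mind that a shared address only hints at an alias: many unrelated websites can sit on one IP address, and one website can sit on several.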
Here is the problem: to humans and the DNS, http://www.domain.com and http://domain.com point to the same website, and are one and the same. However, to search engines, because subdomains are different entities, http://www.domain.com and http://domain.com are two different websites with the same content. This means that both can be penalized for duplicate content and rank lower.
Sabotage: it gets worse!
Because of this problem, a competitor of yours can find places to link to your website, but use both www and non-www links. This makes sure that search engines find both copies and duly knock you down in their rankings. This is called '301 sabotage', and a bit more information about it can be found at this threadwatch.org posting. Claus Schmidt wrote an excellent article about the dangers of improper redirection.
Does your website need protection?
To test if your website needs protection, do the following. Go to http://www.yourwebsite.com/ and watch the address bar: did you end up at www.yourwebsite.com or at yourwebsite.com? Note that, and now type in http://yourwebsite.com: where did you end up this time? If in both tries you ended up at the same place (either www.yourwebsite.com or yourwebsite.com), then your website is fine. It doesn't matter where you end up, as long as both are the same.
To take a real-world example, both www.ekstreme.com and ekstreme.com end up at ekstreme.com (no www).
In the test above, if you do not reach the same final destination, then your website is open to abuse, whether intentional or not. If so, you need to do a 301 redirect.
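If you would rather script the test above than type URLs by hand, here is a minimal Python sketch of it. The function names (final_url, normalized_host, needs_redirect, check_site) are my own, made up for illustration, and the sketch assumes the site answers on plain http:// and that comparing host names is enough.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def final_url(url, timeout=10):
    """Follow redirects like a browser would and return the URL we end up at."""
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def normalized_host(url):
    """Return the lower-cased host name of a URL, without any port."""
    return (urlsplit(url).hostname or "").lower()


def needs_redirect(final_www, final_bare):
    """True if the two final destinations are on different host names."""
    return normalized_host(final_www) != normalized_host(final_bare)


def check_site(domain):
    """Run the manual test for a bare domain such as 'yourwebsite.com'."""
    www = final_url(f"http://www.{domain}/")
    bare = final_url(f"http://{domain}/")
    return needs_redirect(www, bare)
```

Calling check_site("yourwebsite.com") returns True when the www and non-www addresses end up on different host names, which is the case where you need a 301 redirect. Note that urlopen follows any kind of redirect, so this only tells you where you end up, not whether the redirect was a 301.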
301 redirection quick tutorial
The correct way to forward your domain name in this context is to use what is known as the HTTP 301 redirection header. The what? Let me explain...
HTTP defines a set of headers that allow the web to function. HTTP headers are usually not seen by website visitors, but are part of the communication between the web server and the browser. They are called headers because they are exchanged before the web server sends the HTML page (or any other file) to the browser; that is, they appear at the head, the beginning, of the HTTP response, ahead of the document itself.
A subset of the standard HTTP status codes deals with redirection; these are the 3xx codes (because they are numbered 300, 301, 302, and so on). A 301 response, often loosely called the 301 header, means that the web page has moved permanently, and it is normally accompanied by another header, Location, giving the new location of the web page.
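To make this concrete, here is roughly what the start of a 301 response looks like on the wire, with a few lines of Python that pick it apart. The sample response and the function name parse_response_head are my own illustration, not the output of any particular server.

```python
# A hand-written sample of what a server might send for a 301 redirect.
RAW_RESPONSE = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: http://domain.com/\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_response_head(raw):
    """Split a raw HTTP response into (status_code, headers dict)."""
    # Everything before the first blank line is the head of the response.
    head, _, _body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The status line looks like: HTTP/1.1 301 Moved Permanently
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lower-cased.
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


status, headers = parse_response_head(RAW_RESPONSE)
print(status, headers.get("location"))  # prints: 301 http://domain.com/
```

The first line carries the 301 status, and the Location line tells the browser or search engine where the page now lives.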
To everyone using the website, including search engines, this means that the old web page or domain name should no longer be used; the new location should be used instead. Thus when a search engine comes across a 301 redirect, it indexes the destination page in its place; that is, the search engine now knows that www.domain.com and domain.com are the same thing.
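As a toy illustration of that idea, here is a Python sketch of an index that files every discovered URL under its 301 destination. This is only my own simplified model (the names canonical_url, redirects and index are made up); it is not how any real search engine is implemented.

```python
def canonical_url(url, permanent_redirects, max_hops=10):
    """Follow known 301 mappings until reaching a URL that does not redirect."""
    seen = set()
    # Stop on redirect loops or suspiciously long chains.
    while url in permanent_redirects and url not in seen and len(seen) < max_hops:
        seen.add(url)
        url = permanent_redirects[url]
    return url


# Hypothetical data: the www address 301-redirects to the non-www address.
redirects = {"http://www.domain.com/": "http://domain.com/"}

index = {}
for discovered in ["http://www.domain.com/", "http://domain.com/"]:
    index.setdefault(canonical_url(discovered, redirects), []).append(discovered)

print(index)
```

Both discovered links end up under the single key http://domain.com/, so there is one entry in the index instead of two copies of the same content.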
Further reading
So what do the search engines say about this subject? Here are some pages I've found:
- What to do when your site moves from MSN search.
- Changing domain names from Google.
- Why does my site have two different listings in Google: http://site.com and http://www.site.com? from the webmaster troubleshooting section at the Google help center.
- Inside Google Sitemaps: www vs non-www versions of a site. Tips specifically for Google Sitemaps users.
- Matt Cutts' Bacon polenta blog post. Funny name for a post from the famous Google engineer on this subject.
- Read all about the HTTP protocol (RFC 2616), in particular section 10.3, which talks about redirection.
