Cleaning Up After Google’s URL Mess

POSTED BY socialdude on Dec 17 under rankings


Recently Google announced the release of their own URL shortening service, Goo.gl. If we put aside the data gathering aspects for a moment, this product is the third in a series of products that make the web a more difficult place to crawl. In this post we’ll follow a Goo.gl URL through its course and look at the trail of junk each piece leaves in its wake.

Running the Goo.gl URL through the SeoConsultants header checker gives the path below. It passes through three different 301 redirects.

http://goo.gl/fb/gNHq

#1 Server Response: http://goo.gl/fb/gNHq
HTTP Status Code: HTTP/1.0 301 Moved Permanently
Content-Type: text/html; charset=UTF-8
Location: http://feeds.feedburner.com/~r/Wolf-howl/~3/D-BSa1LCo88/?utm_source=feedburner&utm_medium=twitter
Expires: Tue, 15 Dec 2009 02:18:06 GMT
Date: Tue, 15 Dec 2009 02:18:06 GMT
Cache-Control: private, max-age=86400
X-Content-Type-Options: nosniff
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Server: GFE/2.0
Redirect Target: http://feeds.feedburner.com/~r/Wolf-howl/~3/D-BSa1LCo88/?utm_source=feedburner&utm_medium=twitter

#2 Server Response: http://feeds.feedburner.com/~r/Wolf-howl/~3/D-BSa1LCo88/?utm_source=feedburner&utm_medium=twitter
HTTP Status Code: HTTP/1.0 301 Moved Permanently
Location: http://www.wolf-howl.com/google/google-personalized-search-news/?utm_source=feedburner&utm_medium=twitter&utm_campaign=Feed%3A+Wolf-howl+%28Graywolfs+SEO+Blog%29
Content-Type: text/html; charset=UTF-8
Date: Tue, 15 Dec 2009 02:18:06 GMT
Expires: Tue, 15 Dec 2009 02:18:06 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
X-XSS-Protection: 0
Server: GFE/2.0
Redirect Target: http://www.wolf-howl.com/google/google-personalized-search-news/?utm_source=feedburner&utm_medium=twitter&utm_campaign=Feed%3A+Wolf-howl+%28Graywolfs+SEO+Blog%29

#3 Server Response: http://www.wolf-howl.com/google/google-personalized-search-news/?utm_source=feedburner&utm_medium=twitter&utm_campaign=Feed%3A+Wolf-howl+%28Graywolfs+SEO+Blog%29
HTTP Status Code: HTTP/1.1 200 OK
Date: Tue, 15 Dec 2009 02:18:06 GMT
Server: Apache/2.2
X-Powered-By: PHP/5.2.0-8+etch15aaa+tigertech1
Vary: Cookie
X-Pingback: http://www.wolf-howl.com/xmlrpc.php
WP-Super-Cache: WP-Cache
Connection: close
Content-Type: text/html; charset=UTF-8
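If you want to reproduce a trace like the one above without a web-based header checker, a minimal Python sketch follows. This is not how the SeoConsultants tool works internally; the function names `head` and `trace_redirects` are made up for illustration. It assumes the servers answer HEAD requests the same way they answer GET, which is not always true.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def head(url):
    """Send one HEAD request and return (status, Location header).

    The request is NOT followed automatically, so each hop stays visible.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=head, max_hops=10):
    """Return a list of (url, status) pairs, one per hop.

    Stops at the first non-redirect response or after max_hops requests.
    `fetch` is injectable so the logic can be tested without a network.
    """
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be a relative URL
    return hops


if __name__ == "__main__":
    # Hypothetical usage: substitute any shortened URL you want to inspect.
    for hop_url, hop_status in trace_redirects("http://goo.gl/fb/gNHq"):
        print(hop_status, hop_url)
```

Every extra line this prints is one more request a crawler has to make before it reaches the real content.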

I can tell you this is not a setup I would ever personally recommend. I can also tell you from firsthand experience that I have sat on panels with Google employees where we all but held hands, sang Kumbaya, and preached about the value of “pretty” URLs and the importance of keeping things easy for search engine crawlers to understand. This is why it’s so mind-boggling that a search engine is behind the current state of things.

Back to the case at hand: the Goo.gl URL 301 redirects to a FeedBurner feed proxy URL. This is where things start to get screwy. (Irony: FeedBurner is a Google company.) Since Google already knows the final URL, there’s no need for the intermediate hop. The two services should share the data and pass you straight to the destination without taking the long road, if you know what I mean.

http://feeds.feedburner.com/~r/Wolf-howl/~3/D-BSa1LCo88/?utm_source=feedburner&utm_medium=twitter

On to step number two. If you are using FeedBurner and Google Analytics, Google appends a bunch of tracking parameters, making your URL a big fat ugly mess like this:

http://www.wolf-howl.com/google/google-personalized-search-news/?utm_source=feedburner&utm_medium=twitter&utm_campaign=Feed%3A+Wolf-howl+%28Graywolfs+SEO+Blog%29

IMHO this is where the train completely jumps off the tracks. There’s no need to make things that complicated to crawl or to create duplicate content issues. Again, both are Google services and should exchange the info behind the scenes. And NO, rel=canonical is not a solution. Band-aid solutions are not the answer to bad architecture.

The last hop comes from a cleanup plugin I have from Joost de Valk that redirects you to a clean URL with all of the query-string garbage removed. But I wouldn’t need this step if the first two steps hadn’t created problems in the first place.
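The cleanup described above boils down to dropping the tracking parameters from the query string. Here is a rough Python sketch of that idea. It is not the plugin's actual code (the plugin runs inside WordPress), and the function name `strip_tracking_params` and the choice to match on the `utm_` prefix are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking_params(url, prefix="utm_"):
    """Return url with every query parameter whose name starts with prefix removed.

    Other parameters are kept in their original order. Note that urlencode
    re-encodes the remaining values, so their escaping may be normalized.
    """
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(prefix)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


messy = (
    "http://www.wolf-howl.com/google/google-personalized-search-news/"
    "?utm_source=feedburner&utm_medium=twitter"
    "&utm_campaign=Feed%3A+Wolf-howl+%28Graywolfs+SEO+Blog%29"
)
print(strip_tracking_params(messy))
# → http://www.wolf-howl.com/google/google-personalized-search-news/
```

A site would compare the cleaned URL to the requested one and issue a 301 only when they differ, which is exactly the extra hop this post is complaining about.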

So how bout it, Google? Instead of releasing 38 new products in 70 days, put more resources into fixing the messes some of your existing ones are creating.

Warning: Conspiracy Theory Ahead


Now some people might make the case that because Google has all of this data, they can eventually sort it out in the back of the house, so it’s not an issue at all. However, by creating all of this unnecessary complexity at the URL level, Google is intentionally making the web a more difficult place for less sophisticated crawlers, spiders, and search engines to deal with. Basically they are laying down land mines to slow or trip up the competition. But that’s not something we would ever see come from the home of lava lamps and bean bag chairs, right? Right …


This post originally came from Michael Gray, who is an SEO Consultant.
