It’s not hard to spam search engines. Seriously, it isn’t, especially when you work the long tail of search. That is what SEO automation is all about. Doing so, however, means falling foul of search engine guidelines, especially Google’s.
One of the most traditional methods of SERP spamming was to abuse the site search function: using searches within your site to create hundreds of auto-generated pages that were left open for Google to index. Note: this is NOT advisable, unless you want to be hit by Google for spamming. After all, the recent JC Penney SEO fiasco should have taught us that Google will penalize, right?
According to Google’s guidelines:
Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don’t add much value for users coming from search engines.
If you think that isn’t enough to deter you, Matt Cutts wrote about it too:
As a result of that question, YouTube added a “Disallow: /results” line in its robots.txt file. That’s good because as Google recrawls web pages, we’ll see that and begin to drop those search results.
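To see what a rule like that does in practice, here is a minimal Python sketch using the standard library's robots.txt parser. The robots.txt text and the two URLs are made up for illustration (modeled on the quote above), and this only shows how a compliant crawler reads the rule, not how Google itself processes it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, modeled on the rule quoted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /results
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative URLs only: a search results page and an ordinary content page.
search_page = "http://www.youtube.com/results?search_query=long+tail"
content_page = "http://www.youtube.com/watch?v=example"

print(is_crawlable(ROBOTS_TXT, search_page))
print(is_crawlable(ROBOTS_TXT, content_page))
```

The rule is a simple path prefix match, so everything under /results is off limits to a well-behaved crawler while the rest of the site stays open.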
Great, so Google doesn’t intentionally want to index its own results. That does not mean it DOESN’T happen. In fact, Vanessa Fox wrote a piece about Google spamming search with Google Translate results:
I asked Google about this and they confirmed that indeed it was simply a matter of the Google Translate team not being aware of the issue and said they would resolve it.
I typically run thousands of long tail queries every week, mostly for fun. It keeps my mindset fresh, often highlights gaps and issues, and gives me great ideas. As a result of this, guess what I uncovered?
Yet Another Google Property Spamming SERPs
As you can see, Google has about 4,350,000 results indexed! All the top results are from Chinese queries (I think) and are indexed via “deskbar.google.com/news/more?”. The “/more?” URL is intended to be a collection of stories from Google News that are reached via the Google Desktop application.
http://deskbar.google.com/news/ is virtually the same site (as far as I can tell) as http://news.google.co.uk/. (I am no Michael VanDeMar, who actually looks deep into issues. I don’t have the investigative skill, so I will let others do this for themselves :) )
Aaron Wall wrote about this issue in December 2010. Back then, there were just over 2.6 million results from this domain. As you can see, the count has grown by roughly two thirds since then.
So is this a BIG Deal?
Simply put, yes it is. As I demonstrated with SERP Sniffing, the long tail of search is pretty valuable. With over 4.3 million results indexed, and more being added every minute, the long tail is worth a lot to this domain. Let me show you a few:
I don’t think these have any commercial value. And apart from being a “collation” source for news related to these long tails, I don’t think they add value. Or maybe they do. I don’t know. I do know that if any other site were collating news like this and ranking for it, it would quickly be dealt with.
Well, the Vanessa Fox article I linked to earlier has robots.txt instructions for avoiding this sort of behavior on your own sites. I sincerely hope that Google stumbles upon this post and does the same. I advise you to do the same.
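If you want to check your own site for the same problem, a rough audit could look like the sketch below. It is not taken from the Vanessa Fox article. The path prefixes and query parameter names that mark a "search results" page are my own assumptions, as are the example.com URLs, so adjust them to match how your site search actually builds its URLs.

```python
from urllib.parse import parse_qs, urlparse
from urllib.robotparser import RobotFileParser

# Assumed markers of an internal search results page; change these for your site.
SEARCH_PATH_PREFIXES = ("/search", "/results")
SEARCH_QUERY_PARAMS = {"q", "query", "s"}


def looks_like_search_page(url: str) -> bool:
    """Guess whether a URL is an auto-generated site search page."""
    parts = urlparse(url)
    if parts.path.startswith(SEARCH_PATH_PREFIXES):
        return True
    # parse_qs returns a dict keyed by parameter name.
    return not SEARCH_QUERY_PARAMS.isdisjoint(parse_qs(parts.query))


def unblocked_search_pages(robots_txt, urls, user_agent="Googlebot"):
    """Return search-like URLs that the robots.txt text still lets crawlers fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        url for url in urls
        if looks_like_search_page(url) and parser.can_fetch(user_agent, url)
    ]


# Hypothetical robots.txt and URL list for illustration.
robots_txt = "User-agent: *\nDisallow: /search\n"
urls = [
    "http://example.com/search?q=red+shoes",    # search page, already blocked
    "http://example.com/products?q=red+shoes",  # search-like, still crawlable
    "http://example.com/about",                 # ordinary page
]

for url in unblocked_search_pages(robots_txt, urls):
    print("Still open to crawlers:", url)
```

Feeding it the URLs from your sitemap or server logs should surface any search-style pages that your robots.txt has missed.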
Side note: surely Google should issue a set of guidelines to all those who manage such Google properties? And shouldn’t they have noticed traffic coming in for that many SERPs?
Rishi Lakhani is an independent Online Marketing Consultant specialising in SEO, PPC, Affiliate Marketing and Social Media. Explicitly.Me is his Blog. Google Profile