This post is a first for me: the first guest post (well, semi-guest) on this site. It is also my first collaboration with one of my favourite research SEOs, Neyne. Neyne (real name Branko Rihtman) doesn’t blog very often, but when he does it is always worth a read. This is a two-part post: the first part is by Neyne, the second by yours truly.
My last post was about using WordPress plugin flaws to build links, aka “soft hacking”. What we are about to demonstrate is that another open-source CMS, Joomla, has just as big a flaw as WP. We didn’t investigate the backdoor or how it was done, but we do demonstrate the extent to which it works.
Worse Than Blackhat, Meet The Hacker SEO
Just like the “SEO is Dead” debate, which has raised its lame head at seemingly regular intervals over the past few years, its not-so-distant cousin, the “Whitehat vs. Blackhat” debate, keeps coming back. There has been one raging on the popular blogs in the last week or so and, just like its useless relative, this round did not bring any new arguments, nor has it convinced anyone on either side. However, not often does one get to encounter a true black hat campaign, one that leaves you with no doubt as to whether it is useful or whether it is illegal. Thanks to a tip from one of my SEO buddies, I have taken a glimpse into the eyes of the beast, and it ain’t pretty.
Just before we dive in, I want to make something clear. I don’t usually out websites or SEO techniques. I think that outing is a cowardly practice, done by people who are not capable of outperforming others. Or in the immortal words on one of Aaron’s T-shirts: “I have a very high tolerance for spammers, but a very low one for weasels”. That said, the techniques outlined in this article are most probably illegal (I’m not a lawyer, so I don’t want to be definite on that one). They include hacking into other people’s sites, flagging them as pill-related, squandering their link equity and eventually getting them flagged as compromised in Google SERPs, thus seriously decreasing their CTRs. Asshattery like that should be eliminated and I feel no remorse for helping to do so.
It all started with an enquiry from the friend mentioned above about one of his client’s sites. The site seemed to be OK, with nothing irregular about it; however, when looking at the Google cached version of the site, a footer appeared:
This footer does not appear when the site is visited with a Googlebot user agent, so my guess is that this is a case of IP cloaking. The more interesting thing is that none of the sites linked in the footer seem to be V1@6r@ related. They are regular sites on a wide range of topics. So my first thought was that this is a hatchet job – a slimy SEO company that is trying to get their competitors banned by creating thousands of artificial, spammy links on hacked sites. However, when looking at the source code of the Google cache of each of the linked sites, a different picture started to emerge. Check out the differences between the <head> element as it appears on the live site vs. how it appears in the Google cache:
So my next question was whether these sites rank for any of the linked phrases. Almost all of them did. Check out this SERP for [V1@6r@ price] (6600 Global Exact Match monthly searches):
So here came the head-scratching part. It seems like someone is hacking into Joomla-based sites and planting links in their footers to other hacked Joomla sites, whose headers are cloaked to show V1@6r@-related keywords. But what is the point? Why would someone send V1@6r@-relevant traffic to totally unrelated websites? Then I clicked through to the site from the above SERP. This is the site I got:
If you go to the site directly, by typing the URL into the address bar, this is what you get:
So not only are they doing IP cloaking, they are also doing referrer cloaking, showing a different site to all visitors referred from Google SERPs. Here is a partial list of sites, with their original titles, hacked titles, the keyword they targeted with footer link anchors and their ranking on Google.com for that keyword:
There is one thing that is common to all the websites in question – they have all been built with Joomla. Furthermore, it is easy to target them, as there is a clear indication they are Joomla-based in their header:
<meta name="generator" content="Joomla! 1.5 - Open Source Content Management" />
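If you run a site yourself, it is worth knowing whether it advertises its CMS like this. Here is a minimal Python sketch that lists whatever a page declares in its generator meta tag. It assumes the tag carries name="generator", which is how Joomla 1.5 emits it by default; the function name advertised_cms is made up for this example.

```python
from html.parser import HTMLParser


class GeneratorMetaParser(HTMLParser):
    """Collects the content of every <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "generator":
            self.generators.append(attr_map.get("content", ""))


def advertised_cms(html):
    """Return the generator strings a page announces (hypothetical helper)."""
    parser = GeneratorMetaParser()
    parser.feed(html)
    return parser.generators
```

Feed it the HTML source of your own pages. If it returns anything, automated scanners can read the same string, so removing the tag takes away one easy signal (it does not fix the underlying hole).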
So Neyne has shown you the what, how and why. Hacking this many sites for those rankings isn’t an easy job, unless you pre-build hacker doorways, as I demonstrated in the WP Plugin Security fail. The only other way to do this is to run a number of brute force scripts on known weak spots of various servers and CMSs. I want to show you what I learnt from investigating those links with Neyne. Like I said with the JC Penney scenario, when you get a chance to learn, do it.
10 Things I Learnt About The V1@6r@ Link Hackers
1. Old spam tactics still work
A while ago, I wrote about Spam Tactics, Then and Now, where I identified a number of tactics that still work. This discovery reinforces what I learnt back then: old spam tactics don’t die, they just resurface. And Google isn’t really as sophisticated an algo as people believe it to be. Some of the points below go into this in more detail…
2. Content is not king
None of the sites that we investigated were serving up content that was V1@6r@ related. Of course, quite a few had cloaking, which meant that some content was being shown, but after investigating a number of these sites, we found that not all had redirection or cloaking set up as yet, and as a result they just had links that were doctored. So why did they rank for these keywords?
Just links. Links, links and more links. What about great content? Nope. Links.
Using Majestic, let’s look at what the links could be like:
3. Anchor text overrules all
Relevancy, thematic links, semantic analysis, etc. can all go to pot if you have access to a large-scale link text manipulation system. It doesn’t matter where the links are placed, and it doesn’t matter where they come from.
An advanced analysis of the anchors for some of the sites we looked at gave us the Wordle above, and you can see how heavy the manipulation is. In raw terms:
4. Footer links work
For a while, SEOs have been playing down the value of links in footers or other common elements. Ummm, they seem to work.
5. Sitewide links work
Again, we get arguments that the value of sitewide links has been dampened greatly. Not when you are working in volume, as we discovered when we investigated these sites.
6. Referrer cloaking still works
I think Neyne demonstrated this pretty well above.
Another spam tactic from the past, still alive and well.
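If you want to check your own site for this, here is a rough Python sketch that requests a page twice, once directly and once with a Referer header that looks like a Google search, and compares the page titles. The helper names, the user agent string and the example referrer URL are all assumptions made for illustration. It only catches referrer-based cloaking; IP cloaking aimed at Googlebot’s addresses cannot be seen this way.

```python
import re
import urllib.request

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the page title with whitespace collapsed, or '' if none."""
    match = TITLE_RE.search(html)
    return " ".join(match.group(1).split()) if match else ""


def fetch(url, referer=None, timeout=10):
    """Fetch a page, optionally pretending to arrive from `referer`."""
    headers = {"User-Agent": "Mozilla/5.0 (cloaking self-check)"}  # assumed UA
    if referer:
        headers["Referer"] = referer
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def titles_differ(direct_html, referred_html):
    """True when the two responses carry different titles."""
    return extract_title(direct_html) != extract_title(referred_html)


def check_referrer_cloaking(url, referer="https://www.google.com/search?q=test"):
    """Compare a direct visit with one that claims to come from a SERP."""
    return titles_differ(fetch(url), fetch(url, referer=referer))
```

A True result from check_referrer_cloaking on one of your own URLs is not proof of a hack (some sites legitimately vary their titles), but it is a good reason to look at the two responses side by side.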
7. I need to set up alerts
What really shocked me is that these site owners still haven’t realized that they rank for these keywords. If you suddenly rank for or get traffic from dodgy keyphrases, it’s time to check WTF is going on. Now in the case of user agent redirection, sometimes analytics will not record those visits. But they will most certainly show up as high-volume impressions if you are signed up with Google Webmaster Tools. AND they have a malware detection piece on there which is worth looking at once in a while.
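A simple alert can be as basic as scanning your search query report for words that have no business being there. Here is a minimal Python sketch, assuming you already have the queries as a list of strings (for example, exported from Webmaster Tools). The watchlist terms and function names are placeholders for this example, so swap in whatever counts as dodgy for your niche.

```python
import re

# Hypothetical watchlist; adjust it to your own niche.
WATCHLIST = ("pills", "pharmacy", "cheap meds")


def normalise(text):
    """Lowercase and reduce punctuation to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def suspicious_queries(queries, watchlist=WATCHLIST):
    """Return the queries that contain a watchlist term as whole words."""
    terms = [normalise(term) for term in watchlist]
    flagged = []
    for query in queries:
        # Pad with spaces so a term only matches on word boundaries.
        padded = f" {normalise(query)} "
        if any(f" {term} " in padded for term in terms):
            flagged.append(query)
    return flagged
```

Run it on each new export and email yourself whenever the result is not empty. Whole-word matching keeps harmless queries that merely contain a watchlist word as a substring from setting it off.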
8. I need to monitor catch-all accounts
Google does try to email the sites that it has flagged up:
But you need to monitor, and even set up, catch-all email accounts. As Google puts it: “You can find out if your site has been identified as a site that may host or distribute malicious software (one type of “badware”) by checking the Dashboard in Webmaster Tools. (Note: you need to verify site ownership to see this information.) We also send notices to webmasters of affected sites at the following email addresses for the site:”
9. .edu sites need some serious help
As part of the investigation, I had to scan a large number of SERPs for V1@6r@-related keywords. The most common resulting domain extension? That would be “.edu”. Google and/or someone else needs to teach these guys how to secure their sites… It’s not hard to spot the volume of hacking – see this simple query.
Or look at this gem:
10. .gov sites are FUBAR
Another common domain extension that shows up in the SERPs is .gov. By the way, did you know Google has an old search page that only looks at government sites? Look what I found through it: http://bit.ly/dOlzKR