Is Your Hosting Leaving You Vulnerable to Duplicate Spam?

by rishil on July 31, 2013

Most of us who aren’t extremely technically minded tend to fall into a series of errors that our more savvy friends could help us out of. And if those of us in the web and online marketing industry get caught out, I wonder what happens to the less savvy population?

Considering that Google is getting a lot more aggressive about duplicate content, imagine my amazement when Chris at Hitreach (a web development and SEO firm in Dundee that I work with on occasion) told me about a recent clean-up he had to do, and how it came about. (Note: when I say “more aggressive” I mean that I have seen more and more sites hurt by too much duplicate content over the last two years. A FULL site duplicated isn’t ideal from any standpoint.)

The Background – Rogue Subdomains

Whilst undertaking an audit of their own site, they began discovering rogue URLs indexed in Google, such as http://w.hitreach.co.uk

This subdomain never actually existed, which meant that someone who had incorrectly linked to www.hitreach.co.uk had caused a rogue subdomain to become indexed.

What are the issues with such subdomains? To start with, each one serves a complete clone of your website. And with a single inbound link, it CAN be indexed in Google overnight.

I know, because I tried. I managed to get 10 variations of a random subdomain name of a site hosted at the same hosting provider indexed within 24 hours. That is 10 duplicate versions of the site overnight.

As an SEO, you can imagine my horror at thinking how easy it would be to:

  1. Index hundreds of cloned variations of a site on subdomains with a simple high-volume SENuke attack.
  2. Create dodgy subdomains with high-risk words: adult, pharma, gambling, etc.
  3. Potentially OUTRANK the home page with those dodgy subdomains, causing the site serious ranking issues.

Heart Internet – Default Settings Leave Clients at Risk

It turns out that any reseller or Hybrid/VPS hosting account created on Heart Internet includes a ‘wildcard’ subdomain entry by default.

The entry appears as * in the A records section of the DNS management panel.

This means that your website, by default, will resolve to any subdomain at all.

Whilst Google won’t index any of these automatically, it means that anyone can cause a duplicate version of your site to rank on any subdomain they like. If your site uses relative links, an entire crawl of the fake subdomain site becomes possible, rather than just an individual page being an issue.
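You can test for this yourself before anyone else does. Below is a minimal sketch in Python that resolves a random, made-up subdomain and compares it against the www host; the domain name is a hypothetical placeholder, so swap in your own:

```python
# Minimal wildcard-DNS check: a random label that shouldn't exist is
# resolved and compared against the www host. DOMAIN is hypothetical.
import socket
import uuid

DOMAIN = "example.co.uk"  # placeholder - replace with your own domain

def resolve(host):
    """Return the IP a hostname resolves to, or None if it doesn't resolve."""
    try:
        return socket.gethostbyname(host)
    except socket.gaierror:
        return None

random_sub = "%s.%s" % (uuid.uuid4().hex[:12], DOMAIN)
www_ip = resolve("www." + DOMAIN)
rogue_ip = resolve(random_sub)

if rogue_ip is None:
    print("%s does not resolve - no wildcard entry." % random_sub)
elif rogue_ip == www_ip:
    print("%s resolves to the same IP as www (%s) - likely a wildcard A record."
          % (random_sub, rogue_ip))
else:
    print("%s resolves to %s - investigate further." % (random_sub, rogue_ip))
```

If the random label resolves to the same IP as your www host, the wildcard entry is live and any subdomain someone links to will serve a clone of your site.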

It’s scary because it’s not hard to find lots of sites hosted with Heart Internet, either by using tools like Who Is Hosting This or by just checking Heart Internet’s own ‘website of the month’ winners, which is easy with a search like:

site:heartinternet.co.uk inurl:website-of-the-month-winner

From here you could pick a random winner, test whether a random subdomain resolves, and then link to it, causing it to become indexed. That creates duplicate content and potentially huge headaches for a site owner who, unless they are very SEO savvy, won’t necessarily ever discover it.

How To Find Out If Your Site Is Affected

To find out whether your site is resolving subdomains it shouldn’t, you can use this search phrase:

site:domain.co.uk -site:www.domain.co.uk

This will show you all the subdomains of your site which are indexed. If any of them are real, exclude them by adding another -site: operator to your search, like:

site:domain.co.uk -site:www.domain.co.uk -site:realsubdomain.domain.co.uk
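If you have several real subdomains to exclude, composing the query by hand gets fiddly. Here is a small sketch that builds the search URL for any domain; the domain and subdomain names are the same kind of placeholders as in the queries above:

```python
# Build the Google "indexed subdomains" audit query for a domain,
# excluding the www host and any known real subdomains.
from urllib.parse import quote_plus

def subdomain_audit_url(domain, real_subdomains=()):
    parts = ["site:%s" % domain, "-site:www.%s" % domain]
    parts += ["-site:%s.%s" % (sub, domain) for sub in real_subdomains]
    return "https://www.google.co.uk/search?q=" + quote_plus(" ".join(parts))

print(subdomain_audit_url("domain.co.uk", ["realsubdomain"]))
```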

Here is an example I found:

https://www.google.co.uk/search?q=site%3Atemples.co.uk+-site%3Awww.temples.co.uk

Fixing the Problem

  1. Remove the * A record entry from the DNS management.
  2. Run the queries above to determine whether any of these wildcard subdomains are already indexed. If they are, manually create a real version of each one and 301 redirect it to the main site (a sketch of the redirect logic follows below).
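How you implement the redirect depends on your stack (on typical shared Apache hosting it would live in .htaccess), but the logic is the same everywhere: any request arriving on a non-canonical host gets a 301 to the real one. Here is a minimal sketch using only Python’s standard-library WSGI server, with a hypothetical canonical hostname:

```python
# Host-based 301 redirection sketch. CANONICAL_HOST is a placeholder.
from wsgiref.simple_server import make_server

CANONICAL_HOST = "www.example.co.uk"  # hypothetical - use your real host

def app(environ, start_response):
    host = environ.get("HTTP_HOST", "").split(":")[0].lower()
    path = environ.get("PATH_INFO", "/")
    if host and host != CANONICAL_HOST:
        # Rogue subdomains are permanently redirected to the main site.
        location = "http://%s%s" % (CANONICAL_HOST, path)
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Served from the canonical host.\n"]

if __name__ == "__main__":
    make_server("", 8080, app).serve_forever()
```

The 301 tells Google to fold the rogue subdomain’s indexed pages back into the canonical versions.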

Dear Heart Internet – I suggest you email all your hosting clients and get them to check their sites – and remove that default wildcard.

PS – I am certainly NOT the first to write about this. See Kev Strong’s post on it.


Rishi Lakhani is an independent Online Marketing Consultant specialising in SEO, PPC, Affiliate Marketing and Social Media. Explicitly.Me is his blog.

{ 1 trackback }

What I Have Read This Month – August 2013
August 26, 2013 at 12:37 pm

{ 4 comments }

Giuseppe Pastore July 31, 2013 at 9:51 am

This was happening to Matt Cutts’ blog as well, if I’m not wrong.
The wildcard character in DNS is really dangerous. Anyway, you’re totally right, and your fixing suggestions might come in handy for many webmasters.

I’d add that if one has been attacked with spam subdomains, a good fix might be a conditional rewrite of robots.txt (Disallow: / on the rogue hosts) plus removal from the index via Webmaster Tools.

Your post, however, also shows that rel=canonical is often ineffective (Chris’s site has it, and the duplicated domain was indexed all the same)…


Lee Colbran July 31, 2013 at 11:10 am

Great spot and great advice Rishi. Hope all is good.


Jem July 31, 2013 at 11:15 am

Whilst I agree with the gist of your post, sometimes the * A record is configured deliberately (just one example: I am currently using it for WordPress multisite and domain mapping) and telling people who don’t know what they’re doing to muck with that could have disastrous consequences…


Michelle August 28, 2013 at 2:50 pm

One of my sites is hosted on a cheapo $3.95 domain hosting and as it turns out, I do get what I pay for as that site is vulnerable to this! Thanks for pointing this out!


