Those of us who aren’t extremely technically minded tend to fall into errors that our more savvy friends could help us out of. And if we, who work in the web and online marketing industry, get caught out, then I wonder what happens to the less savvy population?
Considering that Google is getting a lot more aggressive with duplicate content, imagine my amazement when Chris at Hitreach (web dev and SEO in Dundee, who I work with on occasion) told me about a recent clean-up he had to do, and how it came about. (Note: when I say “more aggressive” I mean that I am seeing more and more sites in the last 2 years getting hurt by too much dupe content. A FULL site duplicated isn’t ideal from any standpoint.)
The Background – Rogue SubDomains
Whilst undertaking a site audit on their own site, they began discovering rogue URLs which were indexed in Google, such as http://w.hitreach.co.uk
This subdomain had never actually been set up, which meant that someone who had incorrectly linked to www.hitreach.co.uk had caused a rogue subdomain to become indexed.
What are the issues with such subdomains? Well, to start off with, each one creates a complete clone of your website. And with a single link, it CAN be indexed in Google overnight.
I know, because I tried. I managed to get 10 variations of a random subdomain name, for a site hosted at the same hosting provider, indexed within 24 hours. That is 10 duplicate versions of the site overnight.
As an SEO, you can imagine my horror at thinking how easy it would be to:
- Index hundreds of clones of a site on subdomains with a simple high-volume SENuke attack.
- Create dodgy subdomains with high-risk words: adult, pharma, gambling etc.
- Potentially OUTRANK the home page with those dodgy subdomains, causing the site serious ranking issues.
Heart Internet – Default Settings Leave Clients at Risk
It turns out that any reseller or Hybrid/VPS hosting account which is created on Heart Internet includes a ‘wildcard’ subdomain entry by default.
The entry appears as * in the A records section of the DNS management:
This means that, by default, any subdomain at all will resolve to your website.
Whilst Google won’t index any of these automatically, it means that anyone can cause a duplicate version of your site to rank for any subdomain they like. If your site uses relative links, then an entire crawl of the fake subdomain site becomes possible, rather than just an individual page being an issue.
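If you want to test a domain you own for this behaviour, here is a minimal Python sketch. It looks up a few random subdomain labels that nobody would have created on purpose; if they all resolve, a wildcard entry is the likely cause. The function names and the `domain.co.uk` placeholder are my own, not anything Heart Internet provides, and the check can give a false positive on networks whose DNS resolver answers for non-existent names.

```python
import socket
import uuid


def resolves(hostname, resolver=socket.gethostbyname):
    """Return True if hostname resolves to an address, False otherwise."""
    try:
        resolver(hostname)
        return True
    except OSError:  # socket.gaierror (lookup failure) is a subclass of OSError
        return False


def has_wildcard_dns(domain, resolver=socket.gethostbyname, attempts=3):
    """Guess whether domain answers for arbitrary subdomains.

    Tries several random labels; if every one of them resolves,
    a wildcard (*) record is the most likely explanation.
    """
    for _ in range(attempts):
        label = uuid.uuid4().hex[:12]  # random, hypothetical subdomain label
        if not resolves(f"{label}.{domain}", resolver):
            return False
    return True


# Example (placeholder domain, replace with one you own):
# print(has_wildcard_dns("domain.co.uk"))
```

The `resolver` parameter is only there so the lookup can be swapped out for testing; in normal use the default does a real DNS lookup.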
It’s scary because it’s not hard to find lots of sites hosted with Heart Internet, either by using tools like Who Is Hosting This or by checking for Heart Internet’s own ‘website of the month’ winners, which is easy using a search like:
From here you could pick a random winner, test whether a random subdomain resolves, and then link to it. That gets it indexed, creating duplicate content and potentially huge headaches for the site owner, who, unless very SEO savvy, won’t necessarily ever discover it.
How To Find If Your Site Is Affected:
To find out if your site is resolving subdomains which it shouldn’t, you can use this search phrase:
This will show you all the subdomains on your site which are indexed. If any of them are real, then just remove them from the results by excluding that subdomain in your search, like:
site:domain.co.uk -site:www.domain.co.uk -site:realsubdomain.domain.co.uk
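If you have more than a couple of real subdomains, typing the exclusions by hand gets tedious. A small Python helper along these lines can build the query for you. The function name is my own invention, and it only produces the search string for you to paste into Google; it doesn't run the search.

```python
def build_site_query(domain, real_subdomains=("www",)):
    """Build a site: query that hides the subdomains you know are real."""
    parts = [f"site:{domain}"]
    for sub in real_subdomains:
        # Each -site: term removes one legitimate host from the results
        parts.append(f"-site:{sub}.{domain}")
    return " ".join(parts)


print(build_site_query("domain.co.uk", ["www", "realsubdomain"]))
# site:domain.co.uk -site:www.domain.co.uk -site:realsubdomain.domain.co.uk
```

Anything still showing in the results after that is a subdomain you didn't set up.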
Here is an example I found:
Fixing the Problem
- Remove the * A Record Entry
- Run the queries above to determine whether you have any of these wildcard subdomains indexed. Manually create each one that is indexed and 301 redirect it to the main site.
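To show what the 301 redirect amounts to, here is a rough Python sketch written as a tiny WSGI app. It is not how Chris did the clean-up, and on most hosting you would set the redirect up in the control panel or web server config instead. The `CANONICAL_HOST` value and the plain `http://` scheme are assumptions for illustration.

```python
CANONICAL_HOST = "www.domain.co.uk"  # placeholder: the one host you want indexed


def redirect_app(environ, start_response):
    """Send requests for any other hostname to the canonical host with a 301."""
    host = environ.get("HTTP_HOST", "").split(":")[0].lower()
    path = environ.get("PATH_INFO", "") or "/"
    query = environ.get("QUERY_STRING", "")

    if host != CANONICAL_HOST:
        # Keep the path and query so each rogue URL maps to its real page
        location = f"http://{CANONICAL_HOST}{path}"
        if query:
            location += "?" + query
        start_response("301 Moved Permanently",
                       [("Location", location), ("Content-Length", "0")])
        return [b""]

    body = b"canonical site\n"
    start_response("200 OK",
                   [("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body)))])
    return [body]
```

The point of matching on the hostname is that every page on the rogue subdomain redirects to the same page on the real site, so Google can fold the duplicates back into the originals.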
Dear Heart Internet – I suggest you email all your hosting clients and get them to check their sites – and remove that default wildcard.
PS – I am certainly NOT the first to write about this. See Kev Strong’s post on it.