Dropping out of Google
Around mid-December, I noticed that traffic for my Left 4 Dead Servers page dropped sharply – from around 100 unique visitors a day to 5 at most.
Most of my traffic comes from search engine referrals, so a drop-off like this had to have something to do with Google.
Cue Google’s Webmaster Tools, a handy place to view all sorts of statistics about your site. In its diagnostics section I was getting a lot of “Network Unreachable” errors on the robots.txt. Basically, Google couldn’t reach my site to read the robots.txt – there was nothing wrong with the file itself, it was a web server problem. After having a look around, I came across some information on diagnosing the issue (no help from Google itself on this matter, of course). The information indicated that certain software was blocking Google for making too many requests to the server, as if it were mounting a DoS attack. This site is hosted on reseller hosting, so it shares the machine with a lot of other websites. Google had been accessing many sites on that server, generating a high number of requests, and the software on the machine automatically blocked it, falsely believing it was attacking the sites it hosts.
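One quick way to rule out the robots.txt itself is to feed its contents to Python’s standard `urllib.robotparser` module. A minimal sketch (the robots.txt content below is just an example – substitute your own file):

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content -- replace with your own file's contents.
robots_txt = """\
User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# If the file itself permits Googlebot, then a "Network Unreachable"
# error in Webmaster Tools points at the server or firewall instead.
print(parser.can_fetch("Googlebot", "/"))        # True
print(parser.can_fetch("Googlebot", "/admin/"))  # False
```

If this says Googlebot is allowed but Webmaster Tools still reports it can’t fetch the file, the problem is between Google and your server, not in the file.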
This blog was also hosted on that reseller account, and its traffic figures showed a similar decline. I’ve recently bought a dedicated cPanel server to replace the reseller and moved the blog onto it – bingo! This blog now appears back in search engine listings. Looking at Google’s Webmaster Tools, we can see the number of pages it has crawled.
Right after moving off the reseller hosting and onto the new server, loads of pages were instantly and successfully crawled. The blog is back in the Google index.
www.l4dservers.net is still waiting to be moved to the new hosting, but I expect to see similar results once the move is complete – I’m just waiting on its DNS to update.
So what can you do if you experience this issue? First, read this excellent article and make sure that your missing Google listing isn’t your own fault. If you are not at fault, contact your hosting provider, point them to the article, and ask whether they run any firewalls that block rapid repeated requests such as those from Googlebot. I would also highly suggest signing up to Google’s Webmaster Tools to help diagnose any problems Google has accessing your site. While it was not particularly helpful in this incident, it did get me started on the diagnosis. It also lets you submit sitemaps, which help make sure that every page of your site is in Google’s index. Sitemaps are checked often, and any new URLs are quickly crawled, helping you get new content into Google fast.
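Generating a basic sitemap is straightforward with nothing but the standard library. A sketch (the URLs here are placeholders – use your site’s real pages):

```python
from xml.etree import ElementTree as ET

def build_sitemap(urls):
    """Build a minimal sitemap.xml document from a list of page URLs."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for page in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page
    return ET.tostring(urlset, encoding="unicode")

# Placeholder URLs -- substitute your own pages here.
print(build_sitemap([
    "http://example.com/",
    "http://example.com/about",
]))
```

Save the output as sitemap.xml at your site’s root and submit it through Webmaster Tools so Google picks up new pages promptly.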
On the more technical side, this article might be of interest: it details how ConfigServer Firewall might be the culprit here.
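If ConfigServer Firewall does turn out to be blocking the crawler, one approach (a sketch on my part – check the CSF documentation and verify the IP range yourself, since Google’s crawler addresses change over time) is to tell lfd to ignore Googlebot’s ranges so it never treats the crawler as an attacker:

```
# /etc/csf/csf.ignore -- IPs that lfd should never block.
# Example Googlebot range only; verify Google's current ranges first.
66.249.64.0/19 # Googlebot
```

Restart csf/lfd after editing for the change to take effect.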