The A-Z Guide to Killing Replytocom URLs in Google Search Results

If you run your blog on WordPress, you may have heard about the Replytocom issue.

This has been a big issue for many blog owners in the last few years, and if you have searched Google for a good solution, you probably still haven’t found one. Some posts are outdated and recommend the wrong fix because they misunderstand the Replytocom bot. That’s why I am writing this article even though there are already many tutorials on the issue out there.

Basically, Replytocom is a spam bot that requests pages on WordPress blogs and appends a query string to the comment links on your site. For example, when I search Google for “replytocom site:techwalls.com”, it shows a lot of links like this:

http://techwalls.com/news/set-custom-url-google-plus-profile/?replytocom=1822

After clicking on “repeat the search with the omitted results included”, thousands of similar links appear on the results pages. Google treats all those Replytocom URLs as duplicates of each other, and our blogs could be penalized for that.
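To see why these URLs count as duplicates, here is a minimal Python sketch that strips the replytocom parameter from a URL. The helper name strip_replytocom and the second comment ID are made up for illustration; only the first URL comes from this post.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_replytocom(url: str) -> str:
    """Return the URL with any replytocom query parameter removed."""
    parts = urlsplit(url)
    # Keep every query parameter except replytocom.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "replytocom"
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


urls = [
    "http://techwalls.com/news/set-custom-url-google-plus-profile/?replytocom=1822",
    # Hypothetical second comment ID, for illustration only.
    "http://techwalls.com/news/set-custom-url-google-plus-profile/?replytocom=1901",
]

# Both variants collapse to the same underlying page.
unique_pages = {strip_replytocom(u) for u in urls}
print(unique_pages)
```

Every comment on a post can produce one such variant, which is how a single article ends up with many near-identical URLs in the index.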

Two weeks before I wrote this article, Google listed over 10,000 replytocom URLs for my blog, even though I had blocked the parameter with both robots.txt and Google Webmaster Tools. In fact, I had added the line “Disallow: *?replytocom” to my robots.txt file when I set up this blog, but it didn’t work well. Many bloggers recommend this solution, but it is a BIG MISTAKE and could make your situation worse. Do not block Google’s spiders from crawling the parameter, because that could also prevent Google from de-indexing the URLs created by the spam bot.
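To illustrate what that robots.txt line does, here is a rough Python sketch of wildcard rule matching. It assumes the common convention where “*” matches any run of characters and a trailing “$” anchors the end of the URL; the function rule_matches is a simplified illustration, not a full robots.txt parser and not Google’s actual implementation.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Simplified check of a wildcard Disallow pattern against a path + query."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # Rules are matched from the start of the path.
    return re.match(regex, path) is not None


rule = "*?replytocom"
reply_url = "/news/set-custom-url-google-plus-profile/?replytocom=1822"
clean_url = "/news/set-custom-url-google-plus-profile/"

print(rule_matches(rule, reply_url))  # the replytocom URL is blocked from crawling
print(rule_matches(rule, clean_url))  # the normal post URL is still crawlable
```

Under this reading, the rule stops the crawler from fetching the replytocom URLs at all, so it never gets the chance to revisit them and notice that they should be dropped.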

So, what’s the best way to protect your site from the Replytocom spam bot and fix the issue?

Here is the solution that I have found very effective on this blog. Just follow these steps:

1. Handle the Parameter in Google Webmaster Tools

Google provides a tool that helps it crawl our websites more efficiently, and we can use it to prevent Google from indexing URLs with the Replytocom parameter.

In the Webmaster Tools Dashboard, go to Configuration -> URL Parameters. If you don’t see the Replytocom parameter in the list, click Add parameter and configure it as in the picture below:

Set up URL parameter to not index Replytocom URLs

2. Remove Replytocom variables in WordPress

The best way to remove the Replytocom variable is to use an option in the Yoast WordPress SEO plugin. Go to the Permalinks section in the plugin’s settings and you will see the option.

The Replytocom variable is removed from comment links and replaced with a #comment-anchor.

As Google ignores any string appended after the # mark in a URL, you don’t have to worry that Google will see the link as a duplicate of the page that has already been indexed.
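A short Python sketch shows why an anchor is harmless where a query string is not. The exact anchor name used here (#comment-1822) is an assumption for illustration; the part after the # is a fragment, which browsers do not send to the server, so the link still points at the same page URL.

```python
from urllib.parse import urldefrag

page = "http://techwalls.com/news/set-custom-url-google-plus-profile/"
# Assumed anchor format, for illustration only.
comment_link = page + "#comment-1822"

# Split the link into the page URL and the fragment after '#'.
without_fragment, fragment = urldefrag(comment_link)

print(without_fragment == page)  # same page URL once the fragment is dropped
print(fragment)
```

Compare this with the ?replytocom=1822 variant, where the query string is part of the URL that gets requested and so counts as a separate address.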

After implementing the above steps, the number of replytocom URLs for my blog has decreased by 3,000, and I believe the issue will be completely solved very soon. Traffic to my blog is also increasing again after it was penalized for the duplicate content caused by the spam bot.

Update: The number of Replytocom results for TechWalls has now dropped to just 4 URLs. I can say that the problem was solved just 3 months after I started following this guide.

Almost all Replytocom URLs have been killed. There are just a few URLs left.

If your site is affected by the bot, I recommend making the changes above, and don’t block it using robots.txt. Don’t forget to let me know the result after a few weeks.

5 Comments

  • Salman July 25, 2013 at 10:40 am

    Nice tutorial .. But I wonder how do we stop the G bot from indexing the replytocom url’s from other sites that comes as a backlink to our site.

    Let me make it more clear to you…

    I see a lot of replytocom’s as backlinks to my blog. Does it cause harm in any way?

    Thanks

    • Tuan Do July 26, 2013 at 11:33 am

      Hi Salman,
      Yeah there are some ways to disavow backlinks to your blog, you can find this tool in Google Webmaster Tools.
      However, I think it is not necessary as Google won’t penalize your site at all.

  • Jose November 4, 2013 at 6:36 pm

    I actually followed an article on ProBlogger and added “Disallow: *?replytocom” to my robots.txt file. That was one of the biggest mistakes I made; the situation became even worse after I did that.

    After following the steps mentioned here, my replytocom URLs came down from 500+ to around 100. But the remaining 100 have refused to move out of Google for the last 2-3 weeks. Do you have any other suggestions?

    Thanks.

    • Tuan Do November 9, 2013 at 2:37 am

      Hi Jose,
      That’s a good sign. Just implement the steps above and forget about it; it will take a while for Google to deindex those pages. All my blogs are completely clean now. :)

  • Aurélien Debord October 21, 2014 at 2:05 pm

    I think handling the problem with webmaster tools is actually the best way.
