How to avoid the duplicate content penalty when using a Wordpress blog

There’s no denying it: Google is the powerhouse of the internet, more than capable of showering targeted visitors on your website like a torrential downpour backed by gale-force winds. That is, if you happen to be in Google’s good graces and rank highly for popular keywords.

Bloggers often find themselves in a position to receive great traffic from Google simply because of the way that blogs work: they allow for quick inclusion in the major search engines, even with the default setup. Wordpress, the most popular blogging software on the planet, is the content management system (CMS) of choice for bloggers due to its ease of use and the large community of supporters who help to develop plugins for this open source software.

However, the fact that Wordpress blogs help websites get noticed by the major search engines easily doesn’t mean that they are inherently search engine friendly. In fact, the default settings for Wordpress blogs almost guarantee that, if left untouched, your blog will end up suffering from the duplicate content penalty.

The duplicate content penalty and Wordpress blogs

The duplicate content penalty is a term used to describe what happens when a web page is removed from the primary search results for a certain keyword phrase because identical content exists elsewhere on the internet. In Google, the lower-ranking websites and individual pages that contain the duplicate content are hidden, and this message is displayed instead:

In order to show you the most relevant results, we have omitted some entries very similar to the X (number) already displayed.

This happens primarily in two situations: with article marketing (also known as article directory submissions), and when you post content to an unmodified blog. Since this article is about Wordpress blog optimization, we’ll focus on how to avoid the duplicate content penalty in the latter case.

Unmodified Wordpress blogs are search engine “unfriendly”

By using a Wordpress blog with its default setup, you are creating an environment where your blog’s content is almost guaranteed to suffer from the duplicate content penalty. The reason is simple: by default, your Wordpress blog will show the exact same content, word for word, in five places:

  1. On the blog’s main page
  2. Within the blog’s RSS feed
  3. On the category page
  4. On the monthly archive page
  5. On the post’s own page

Having five instances of the exact same content within a single website will very likely lead to your content being seen as “duplicate” content by Google, and by the other major search engines for that matter.

Avoiding the duplicate content penalty with a Wordpress blog

There are a few things you can do with your Wordpress blog to help keep it in the primary search results:

Limit the text shown on your blog’s pages

The duplicate content penalty comes about when large amounts of substantially identical text appear on numerous pages throughout the internet, not when a few characters or even a couple of sentences are repeated. Wordpress lets bloggers use what is called the “more” tag to limit the amount of text displayed on the blog’s main page, the category pages and the archive pages. With this tag in place, a post is displayed in full only on the post’s own page. That removes four of the five locations of duplicate content listed above, which can help to decrease the chances of being affected by the duplicate content penalty.
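To make the effect concrete, here is a minimal Python sketch of what the “more” tag does on listing pages. It is only an illustration: Wordpress itself is written in PHP, and the function name teaser is made up for this example. The one detail taken from Wordpress is the tag itself, which is written as <!--more--> in the post body.

```python
# Illustrative only: mimics the effect of the "more" tag, not Wordpress's own code.
MORE_TAG = "<!--more-->"


def teaser(post_body: str) -> str:
    """Return the part of a post that would appear on listing pages.

    Everything before the first "more" tag is kept. A post without
    the tag is returned in full, which is the default behaviour that
    leads to the same text appearing in several places.
    """
    before, tag, _after = post_body.partition(MORE_TAG)
    return before.rstrip() if tag else post_body


post = "Opening paragraph.\n<!--more-->\nThe rest of the article."
print(teaser(post))  # prints: Opening paragraph.
```

Only the opening paragraph would show up on the main, category and archive pages, while the full text stays on the post’s own page.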

However, if you have an established blog, it would be a cumbersome chore to edit every existing post, not to mention that adding the tag is one more step to remember for every future post. Fortunately, there is a Wordpress plugin that replicates the effects of the “more” tag without requiring you to add it to your blog posts manually.

Evermore Wordpress Plugin

The Evermore Wordpress plugin replicates the effects of the “more” tag, limiting the amount of text shown on your blog’s main page, category pages and archive pages. You can specify the character limit, so more or less text is displayed in these areas depending on your needs. Installing the plugin is as easy as uploading a file via FTP to your Wordpress plugin directory, and its effects can be reversed completely by deactivating it.
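The plugin’s own code is not reproduced here, but the idea of a character limit can be sketched in a few lines of Python. The function name excerpt, the default limit of 300 characters and the trailing “[...]” marker are all assumptions made for illustration, not a description of how Evermore actually works.

```python
def excerpt(post_body: str, limit: int = 300) -> str:
    """Shorten a post to roughly `limit` characters for listing pages.

    Hypothetical helper: cuts at the last space within the limit so
    that a word is not split in half, then appends a marker.
    """
    if len(post_body) <= limit:
        return post_body  # short posts are shown unchanged
    cut = post_body.rfind(" ", 0, limit + 1)
    if cut <= 0:
        cut = limit  # no space found: fall back to a hard cut
    return post_body[:cut].rstrip() + " [...]"


print(excerpt("one two three four", limit=9))  # prints: one two [...]
```

A lower limit leaves less repeated text on the main, category and archive pages; a higher limit gives visitors more to read before they click through.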

Choose “Summary” for your blog’s RSS feed file

Another way to reduce your chances of becoming the next victim of the duplicate content penalty is to change the feed setting in your Wordpress blog’s administration area.

Once you’re logged into your blog’s administration area, go to Options>>Reading. Under the sub-heading “Syndication Feeds”, select the “summary” option. Your feed will then carry only a summary of each post rather than its full text, which cuts down your chances of suffering from the duplicate content penalty.

Using the robots.txt file to keep Google (and other search engines) away

Now, of course you want search engines to crawl your blog, but in some cases it’s not in your best interest to have all of your pages crawled and indexed by every search engine robot. Another trick for avoiding the duplicate content penalty is to tell the Googlebot to stay away from your RSS feed. Some bloggers even go so far as to disallow the Googlebot from crawling their blog archives and category pages, but that is a personal preference; in all honesty, we have yet to see conclusive evidence as to which approach is best. Here is an example of a set of commands that will help you keep the Googlebot from accessing your blog’s feeds.

Sample robots.txt file to disallow the Googlebot from your RSS feed

(if your Wordpress blog is at the root of your domain)


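# These rules apply to Google's crawler only
User-agent: Googlebot
# Feeds at the root of the site (a trailing $ marks the end of the URL)
Disallow: /feed/$
Disallow: /feed/rss/$
# Trackback URL at the root of the site
Disallow: /trackback/$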

You can check out some advanced commands for your Wordpress blog’s robots.txt file on the Ask Apache website. As with any modification of this type, you should implement changes with extreme caution, as improperly formed commands in your robots.txt file may cause your blog to be excluded from search engines altogether!
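One low-risk way to act on that caution is to test your rules against a handful of sample URLs before uploading the file. The Python sketch below is a simplified matcher, assuming Google’s documented pattern syntax in which * matches any run of characters and a trailing $ marks the end of the URL. It ignores Allow lines and rule precedence, and the sample paths such as /my-post/feed/ are hypothetical.

```python
import re
from typing import Iterable


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Turn one Disallow path into a regular expression.

    Simplified: '*' becomes "any characters", a trailing '$' anchors
    the end of the URL, and a rule without '$' is a prefix match.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape each literal piece, then join the pieces with ".*" for '*'.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + (r"\Z" if anchored else ""))


def is_blocked(path: str, disallow_rules: Iterable[str]) -> bool:
    """True if any Disallow rule matches the given URL path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# The three Disallow lines from the sample robots.txt file.
SAMPLE_RULES = ["/feed/$", "/feed/rss/$", "/trackback/$"]

for path in ["/feed/", "/feed/rss/", "/trackback/", "/my-post/feed/", "/my-post/"]:
    print(path, "blocked" if is_blocked(path, SAMPLE_RULES) else "allowed")
```

Run against the sample rules, this reports /feed/, /feed/rss/ and /trackback/ as blocked, while per-post URLs such as /my-post/feed/ remain crawlable. If you want those covered as well, a wildcard rule such as Disallow: /*/feed/$ is one option to try in the same checker before going live.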

Using the tips and tricks outlined here, you can help keep your blog in Google’s primary index and avoid the duplicate content penalty that many other bloggers suffer from. It goes without saying that one of the major contributing factors to the duplicate content penalty is syndicating articles from article directories, so for best results your blog should be made up of unique content rather than syndicated content.

This article is the first in a two-part series on how to optimize your Wordpress blog for Google. In our next article, we will show you how to optimize your Wordpress blog for ultimate search engine friendliness, how to pull your blog out of Google’s supplemental index, and how to stay out of it.