Duplicate content is always a problem. Sure, it may not seem like a big deal, and it can get you over the hump on those days when you don’t have the time to create original content, but posting duplicate content always comes back to bite you in the end. In most cases, you are better off not posting at all than posting duplicate content.
Here is what Google officially suggests to webmasters in terms of dealing with duplicate content issues:
For starters, identify the pages that have duplicate content. To do this, use the site: query in Google (for example, site:example.com). It lists the pages from your website that Google has indexed, so you can spot different URLs that show the same content. Once you have these results, you can get to work.
For the next step, choose your preferred URLs and tell Google what your URL hierarchy is. Choosing your preferred URLs is not enough, though. It is up to you to use only those URLs whenever you refer to the pages, and to use the same URLs in your sitemaps, so that you have all your ducks in a row. Google insists that you be consistent throughout your site. Inconsistency confuses the search engines. Just as big a problem, it also confuses users who are trying to find information on your website, which detracts from a positive user experience.
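One way to keep yourself honest about consistency is to run every internal link and sitemap entry through a single function that knows your preferred URL form. Here is a minimal sketch in Python. The policy it encodes (HTTPS, the www host, no trailing slash) and the host www.example.com are assumptions made up for illustration, so swap in whatever your own site prefers. It also assumes every link you pass in belongs to your own site.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed site policy (hypothetical): HTTPS, the "www" host, no trailing slash.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"


def preferred_url(url: str) -> str:
    """Return the preferred form of a URL on the assumed example site."""
    parts = urlsplit(url)
    # Strip trailing slashes, but keep "/" for the home page.
    path = parts.path.rstrip("/") or "/"
    # Keep the query string, drop the fragment.
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, ""))


def inconsistent_links(links):
    """List the links that are not already written in their preferred form."""
    return [link for link in links if link != preferred_url(link)]
```

Running inconsistent_links over the URLs in your sitemap or templates gives you a to-do list of links that still need fixing.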
To get past this issue, you now need to set up permanent (301) redirects on the duplicated pages, which is a bit of a pain. These pages must redirect to the corresponding preferred pages. This ensures that your visitors always land on the right page, whichever URL they use to reach a page that has been duplicated.
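To make the idea concrete, here is a rough sketch of a 301 redirect using nothing but Python's standard library. The REDIRECTS table, its paths, and the port are hypothetical. On a real site you would more likely configure redirects in your web server or CMS, so treat this as an illustration of the mechanism rather than a recommended setup.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from duplicate paths to their preferred paths.
REDIRECTS = {
    "/index.html": "/",
    "/blog/post?ref=home": "/blog/post",
}


def redirect_for(path):
    """Return (301, location) for a duplicate path, or None for a preferred one."""
    target = REDIRECTS.get(path)
    if target is None:
        return None
    return 301, target


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        result = redirect_for(self.path)
        if result is None:
            # Not a known duplicate: serve a placeholder page.
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.end_headers()
            self.wfile.write(b"preferred page\n")
            return
        status, location = result
        # 301 tells browsers and crawlers that the move is permanent.
        self.send_response(status)
        self.send_header("Location", location)
        self.end_headers()


def serve(port=8000):
    """Start the demo server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Call serve() and request /index.html to see the redirect. The response carries the 301 status and a Location header pointing at the preferred page.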
Here's the problem with this, though. It may not always be possible to use a 301 redirect. In such situations, Google recommends that webmasters use the rel="canonical" link element. This feature is also supported by Bing, Yahoo and Ask.com.
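The canonical link element is a single tag that goes in the head section of the duplicate page and points at the preferred URL. As a small sketch, this is how you might generate it in Python. The function name canonical_link and the example URL are made up for illustration.

```python
from html import escape


def canonical_link(preferred_url: str) -> str:
    """Build the canonical link tag for the <head> of a duplicate page."""
    # Escape the URL so characters such as & and quotes are safe in an attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


print(canonical_link("https://www.example.com/blog/post"))
```

Every duplicate version of the page gets the same tag, so they all point at one preferred URL.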
Some webmasters try to block access to duplicate pages using the robots.txt file. It is a nice idea, but Google advises against using this method to keep search engine bots from crawling the duplicate pages. Google prefers that webmasters use the rel="canonical" link element whenever possible. If you completely block access to the duplicate pages, Google cannot tell that they are duplicates and will treat them as separate pages. The best solution is to let the bots crawl these pages and point them to the actual pages using any of the methods discussed above, other than robots.txt.
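If you are not sure whether your robots.txt is already blocking some of your duplicate URLs, you can check with Python's built-in robots.txt parser. This is only a sketch: the helper name blocked_urls, the sample rules, and the URLs are assumptions, and the standard library parser may not interpret every rule exactly the way Googlebot does.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text blocks for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules that hide a printer-friendly duplicate section.
sample_rules = "User-agent: *\nDisallow: /print/\n"
duplicates = [
    "https://www.example.com/print/post",
    "https://www.example.com/post",
]
print(blocked_urls(sample_rules, duplicates))
```

Any duplicate URL that shows up in the result is one the bots cannot crawl, which means they will never see the redirect or canonical tag you put on it.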