What is duplicate content? It's a page on your website that can be reached through more than one URL. Say, for example, my first blog post were accessible from three different URLs.
How will search engines react? First of all, search engines dislike duplicate content and fight it as much as possible: they leave duplicates out of search results, lower the page's ranking or, in some cases, ban the page entirely. When a crawler encounters duplicate content, it tries to decide by itself which version should appear in search results. It is our job as web developers to make sure crawlers don't run into these situations at all.
So how do we prevent duplicate content?
The best way to fight duplicate pages is with permanent (HTTP 301) redirects. Say, for example, I have moved a page from one URL to another.
Since I don't want to lose the page ranking I have gained, and I don't want duplicate pages either, the old URL should permanently redirect to the new one.
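To make the redirect concrete, here is a minimal sketch using Python's standard-library http.server. The paths in REDIRECTS, the helper resolve_redirect and the serve function are hypothetical placeholders, not the real URLs or setup of this blog. In practice you would more likely configure the redirect in your web server or framework.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from old paths to their new locations.
REDIRECTS = {
    "/old-post.html": "/seo/new-post.html",
}


def resolve_redirect(path):
    """Return the new location for a moved path, or None if it has not moved."""
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = resolve_redirect(self.path)
        if target is not None:
            # 301 tells browsers and crawlers that the move is permanent.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
        else:
            # This sketch serves no real pages, so everything else is a 404.
            self.send_response(404)
            self.end_headers()


def serve(port=8000):
    """Start the demo server on localhost (blocks until interrupted)."""
    HTTPServer(("localhost", port), RedirectHandler).serve_forever()
```

Calling serve() and requesting /old-post.html should answer with status 301 and a Location header pointing at the new path.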
If you can't implement a permanent redirect for some reason, canonical links are a great solution. Continuing the example, to add a canonical link to the old page I would put
<link href="http://seo-insights.herokuapp.com/seo/2014/10/06/what-is-seo.html" rel="canonical" />
inside its <head> tag. Great! We've just told crawlers that the page is a duplicate and that the new URL is the one they should index. Unlike a redirect, a canonical link does not send visitors anywhere; it is only a hint for search engines.
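If you want to double-check that a page really declares the canonical URL you expect, a small script can help. The sketch below uses Python's standard html.parser module; the class CanonicalFinder and the function find_canonical are names made up for this illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel may hold several space-separated values, so split it.
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Feeding it a page whose head contains the link tag above should return that tag's href, and a page without a canonical link should give None.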
Another option is the special robots meta tag
<meta name="robots" content="noindex, nofollow">
It tells robots not to index the page and not to follow any of the links located on it.
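To confirm which directives a page actually carries, you could read the robots meta tag back out of its HTML. This is only a rough sketch with Python's standard html.parser; RobotsMetaFinder and robots_directives are hypothetical names chosen for the example.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            for part in content.split(","):
                part = part.strip().lower()
                if part:
                    self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives found in the HTML."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives
```

For the tag shown above, the returned set should contain both noindex and nofollow.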
It's worth mentioning one important thing: duplicate content is not always about duplication within your own website. If your page's content is taken from another website, expect trouble. Search engines dislike people stealing other people's content. If you get caught, your page might be removed from the index permanently. How to fight it? Create original content!