For over a decade, Google has been fighting the myth of the duplicate content penalty. Since I still field questions on it, I thought it would be worth discussing here. First, let's define the terminology:
What Is Duplicate Content?
Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin.
What Is The Duplicate Content Penalty?
A penalty means your site is either no longer listed in search results altogether, or that your pages have been dramatically reduced in ranking on specific keywords. For duplicate content, there is no such penalty. Period. Google dispelled this myth in 2008, yet people still discuss it even today.
Let's put this to bed once and for all, folks: There's no such thing as a “duplicate content penalty.” At least, not in the way most people mean when they say that.
In other words, the existence of duplicate content on your site is not going to get your site penalized. You can still show up in search results and even rank well on pages with duplicate content.
Why Would Google Want You To Avoid Duplicate Content?
Google wants a superior user experience in its search engine, where users find information of value with every click of a search result. Duplicate content would ruin that experience if the top 10 results on a search engine results page (SERP) had the same content. It would be frustrating to the user, and search engine results would be consumed by blackhat SEO companies simply building out content farms to dominate search results.
As Google puts it: “Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues… we do a good job of choosing a version of the content to show in our search results.”
So if there's no penalty and Google will choose a version to display, why should you avoid duplicate content? Despite not being penalized, you may still hurt your ability to rank better. Here's why:
- Google is most likely going to display a single page in the results… the one with the best authority via backlinks… and then hide the rest from the results. As a result, the effort put into the other duplicate content pages is simply wasted when it comes to search engine ranking.
- Each page's ranking is heavily based on the relevant backlinks to it from external sites. If you have 3 pages with identical content (or 3 paths to the same page), your backlinks may be split across those pages rather than all leading to one of them. In other words, you're hurting your ability to have a single page accumulating all backlinks and ranking better. Having a single page ranking in the top results is far better than 3 pages on page 2!
In other words… if I have 3 pages with duplicate content and each of them has 5 backlinks, none of them will rank as well as a single page with 15 backlinks! Duplicate content means that your pages are competing with one another, which could hurt all of them when you could be ranking one great, targeted page.
But We Do Have Some Duplicate Content Within Pages, Now What?!
It's completely natural to have duplicate content within a website. As an example, if I'm a B2B company that has services that work across multiple industries, I may have industry-targeted pages for my service. A vast majority of the descriptions of that service, benefits, certifications, pricing, etc. may all be identical from one industry page to the next. And that absolutely makes sense!
You're not being deceptive in rewriting content in order to personalize it for different personas; it's an absolutely acceptable case of duplicate content. Here's my advice, though:
- Use Unique Page Titles – My page title, using the example above, would include the service and the industry that the page is focused on.
- Use Unique Page Meta Descriptions – My meta descriptions would be unique and targeted as well.
- Incorporate Unique Content – While large swaths of the page may be duplicated, I'd incorporate the industry in subheadings, imagery, diagrams, videos, testimonials, etc. to ensure the experience is unique and targeted towards the target audience.
If you serve 8 industries with your service and build these 8 pages with unique URLs, titles, meta descriptions, and a substantial percentage of unique content (my gut, with no data, says 30%), you're not going to run any risk of Google thinking that you're attempting to deceive anyone. And, if each is a well-designed page with relevant links… you may rank well on many of them. I might even incorporate a parent page with an overview that pushes visitors to sub-pages for each industry.
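To make the advice above concrete, here is a minimal sketch of the head markup for two such industry pages. The company name, service, URLs, and copy are hypothetical placeholders, not taken from a real site. Only the pattern matters: each page gets its own URL, title, and meta description, even though much of the body content is shared.

```html
<!-- Hypothetical page: https://example.com/managed-it-services/healthcare/ -->
<head>
  <!-- Title combines the service and the industry this page targets -->
  <title>Managed IT Services for Healthcare Providers | Example Co</title>
  <!-- Description written for this industry's audience -->
  <meta name="description" content="Managed IT services for clinics and hospitals, with support built around patient data and uptime.">
</head>

<!-- Hypothetical page: https://example.com/managed-it-services/manufacturing/ -->
<head>
  <title>Managed IT Services for Manufacturers | Example Co</title>
  <meta name="description" content="Managed IT services for manufacturers, with support built around plant-floor systems and supply chain tools.">
</head>
```

The shared service description can stay the same on both pages; the subheadings, imagery, and testimonials in the body are where the industry-specific content would go.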
What If I Just Swap Out City Or County Names For Geographic Targeting?
Some of the worst examples of duplicate content I see are SEO farms that duplicate a page for each geographic location the product or service is offered in. I've now worked with two roofing companies whose previous SEO consultants built out dozens of city-centric pages where they simply replaced the city name in the title, meta description, and content. It didn't work… all those pages ranked poorly.
As an alternative, I put up a common footer that listed the cities or counties they serviced, put up a service area page with a map of the region they serviced, redirected all of the city pages to the service page… and boom… the service page and service area page both skyrocketed in rank.
Don't use simple scripts or replacement content farms to replace single words like this… you're asking for trouble and it doesn't work. If I am a roofer that covers 14 cities… I'd rather have backlinks and mentions from news sites, partner sites, and community sites pointing to my single roofing page. That will get me ranked and there's no limit to how many city-service combination keywords I could rank for with a single page.
If your SEO company can script a farm like this, Google can detect it. It's deceptive and, in the long run, could lead to you actually getting penalized.
Of course, there are exceptions. If you wanted to create multiple location pages that had unique and relevant content throughout to personalize the experience, that's not deceptive… that's personalized. An example may be city tours… where the service is the same, but there's a ton of difference in the experience geographically that can be detailed in imagery and descriptions.
But What About 100% Innocent Duplicate Content?
If your company published a press release, for example, that has made its rounds and is published across multiple sites, you may still wish to publish it on your own site as well. We see this often. Or perhaps you wrote an article for a large site and wish to republish it on your own site. Here are some best practices:
- Canonical – A canonical link is a metadata element in your page that tells Google that the page is a duplicate and that it should look at a different URL for the source of the information. If you're in WordPress, for example, and wish to update a Canonical URL destination, you can do this with the Rank Math SEO plugin. Add the originating URL in the canonical and Google will recognize that your page is not trying to pass itself off as the original and that the origin deserves the credit. It looks like this:
<link rel="canonical" href="https://martech.zone/duplicate-content-myth" />
- Redirect – Another option is to simply redirect the one URL to the location you wish people to read and the search engines to index. We often remove duplicate content from a website by redirecting all the lower-ranking pages to the highest-ranking page.
- Noindex – Marking a page as noindex tells search engines to ignore the page and keep it out of search engine results. Google actually advises against this, stating:
Google does not recommend blocking crawler access to duplicate content on your website, whether with a robots.txt file or other methods.
If I do have two absolutely duplicate pages, I'd rather use a canonical or redirect so that any backlinks to my page are passed to the best page, though.
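For reference, a noindex directive is usually expressed as a robots meta tag in the page's head. This is a generic sketch of the standard tag, shown only so you can recognize it on a page, not as a recommendation for handling duplicates:

```html
<head>
  <!-- Asks search engines that honor the robots meta tag to leave this page out of their results -->
  <meta name="robots" content="noindex">
</head>
```

A crawler has to be able to fetch the page in order to see this tag, so a page that is also blocked in robots.txt may never have its noindex read.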
What If Someone Is Stealing And Republishing Your Content?
This happens every few months with my site. I find mentions with my listening software and discover that another site is republishing my content as their own. You should do a few things:
- Try to contact the site via their contact form or email and request it be removed immediately.
- If they don't have contact information, do a domain Whois lookup and reach out to the contacts listed in their domain record.
- If they have privacy on in their domain settings, contact their hosting provider and let them know their client is violating your copyright.
- If they still don't comply, contact advertisers of their site and let them know that they're stealing content.
- File a request under the Digital Millennium Copyright Act.
SEO Is About Users, Not Algorithms
If you simply keep in mind that SEO is all about the user experience and not some algorithm to beat, the solution is simple. Understanding your audience and personalizing or segmenting your content for greater engagement and relevance is a great practice. Trying to deceive algorithms is a terrible one.
Disclosure: I am a customer and an affiliate of Rank Math.