Duplicate Content Penalty: The Myth, The Reality, and My Advice

For over a decade, Google has been fighting the myth of the duplicate content penalty. Since I still field questions about it, I thought it would be worth discussing here. First, let’s define the terms:
What Is Duplicate Content?
Duplicate content generally refers to substantive blocks of content within or across domains that either completely matches other content or that is appreciably similar. Mostly, this is not deceptive in origin.
Google, Avoid Creating Duplicate Content
What Is The Duplicate Content Penalty?
A penalty means either that your site is no longer listed in search results at all or that your pages have dropped dramatically in ranking for specific keywords. For duplicate content, no such penalty exists. Period. Google dispelled this myth in 2008, yet people still discuss it today.
Let’s put this to bed once and for all, folks: There’s no such thing as a “duplicate content penalty.” At least, not in the way most people mean when they say that.
Google, Demystifying the Duplicate Content Penalty
In other words, the existence of duplicate content on your site is not going to get your site penalized. You can still show up in search results and still even rank well on pages with duplicate content.
Why Would Google Want You To Avoid Duplicate Content?
Google wants a superior user experience in its search engine, where users find valuable information with every click of a search result. Duplicate content would ruin that experience if the top 10 results on a search engine results page (SERP) all had the same content. It would frustrate users, and search results would be consumed by black hat SEO companies simply building out content farms to dominate them.
Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues… we do a good job of choosing a version of the content to show in our search results.
Google, Avoid Creating Duplicate Content
So if there’s no penalty and Google will choose a version to display, why should you avoid duplicate content? Because you may still hurt your ability to rank well despite not being penalized. Here’s why:
- Google will most likely display a single page from your site in the results… the one with the best authority via backlinks. Additional results will likely be ignored and not displayed. As a result, the effort put into other duplicate content pages is simply a waste regarding search engine ranking.
- Each page’s ranking is heavily based on the relevant backlinks from external sites. If you have three pages with identical content (or three paths to the same page), your backlinks may be split across all three rather than all leading to one of them. In other words, you’re hurting your ability to have a single page accumulate all backlinks and rank better. A single page ranking in the top results is far better than three pages on page 2!
Put another way… if I have three pages with duplicate content, and each has five backlinks… none of them will rank as well as a single page with 15 backlinks! Duplicate content means that your pages compete with one another, which could hurt all of them when you could be ranking one great, targeted page.
But We Do Have Some Duplicate Content Within Pages, Now What?!
It’s completely natural to have duplicate content within a website. For example, if I’m a B2B company with services that work across multiple industries, I may have industry-targeted pages for my service. Most of the descriptions of that service, benefits, certifications, pricing, etc., may all be identical from one industry page to the next. And that makes sense!
You’re not being deceptive in rewriting content to personalize it for different personas; it’s an acceptable case of duplicate content. Here’s my advice, though:
- Use Unique Page Titles – Using the example above, my page title would include the service and the industry the page is focused on.
- Use Unique Page Meta Descriptions – My meta descriptions would also be unique and targeted.
- Incorporate Unique Content – While large swaths of the page may be duplicated, I’d incorporate the industry in subheadings, imagery, diagrams, videos, testimonials, etc., to ensure the experience is unique and targeted toward the target audience.
Suppose you’re feeding eight industries with your service and incorporate these eight pages with unique URLs, titles, meta descriptions, and a substantial percentage (my gut with no data is 30%) of the content unique. In that case, you won’t run any risk of Google thinking that you’re attempting to deceive anyone. And, if it’s a well-designed page with relevant links… you may rank well on many of them. I might even incorporate a parent page with an overview that pushes visitors to sub-pages for each industry.
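As a rough way to check the advice above before publishing, here is a minimal Python sketch. It flags titles or meta descriptions reused across pages and estimates how much of one page's body is unique. The page data is hypothetical, and counting unique sentences is only my assumption for how to measure "unique content"; it is not how Google measures it.

```python
from collections import Counter


def duplicated_values(pages, field):
    """Return the values of `field` that appear on more than one page."""
    counts = Counter(page[field].strip().lower() for page in pages)
    return sorted(value for value, n in counts.items() if n > 1)


def unique_share(body, other_bodies):
    """Fraction of sentences in `body` that appear in none of `other_bodies`."""
    def sentences(text):
        # Naive sentence split on periods; good enough for a rough estimate.
        return {s.strip().lower() for s in text.split(".") if s.strip()}

    own = sentences(body)
    if not own:
        return 0.0
    seen_elsewhere = set()
    for other in other_bodies:
        seen_elsewhere |= sentences(other)
    return len(own - seen_elsewhere) / len(own)


# Hypothetical industry pages for one service.
pages = [
    {
        "title": "Payroll Services for Healthcare",
        "description": "Payroll built for clinics and hospitals",
        "body": "We run payroll on time. We file your taxes. "
                "We handle shift differentials for nurses.",
    },
    {
        "title": "Payroll Services for Construction",
        "description": "Payroll built for contractors",
        "body": "We run payroll on time. We file your taxes. "
                "We track certified payroll for job sites.",
    },
]

print(duplicated_values(pages, "title"))        # reused titles, if any
print(duplicated_values(pages, "description"))  # reused descriptions, if any
share = unique_share(pages[0]["body"], [pages[1]["body"]])
print(f"{share:.0%} of the healthcare page is unique")
```

In this made-up example, one sentence in three is unique to the healthcare page, which lands close to my 30% gut figure.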
What If I Swap Out City Or County Names For Geographic Targeting?
Some of the worst examples of duplicate content I see come from SEO farms that copy a page once for each geographic location the product or service covers. I’ve now worked with two roofing companies whose previous SEO consultants built out dozens of city-centric pages, replacing only the city name in the title, meta description, and content. It didn’t work… all those pages ranked poorly.
As an alternative, I put up a standard footer that listed the cities or counties they serviced, put up a service area page with a map of the region they serviced, redirected all of the city pages to the service page… and boom… the service page and service area pages both skyrocketed in rank.
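Redirecting a batch of retired pages like this is easy to script. The sketch below is a minimal illustration, assuming an Apache server where the `Redirect` directive from mod_alias is available; the page paths are hypothetical, and your server or CMS may handle redirects differently.

```python
def build_redirect_rules(old_paths, target_path):
    """Build one permanent (301) redirect rule per retired page path."""
    rules = []
    for path in sorted(set(old_paths)):
        if path == target_path:
            continue  # never redirect the target to itself
        rules.append(f"Redirect 301 {path} {target_path}")
    return rules


# Hypothetical city pages being folded into a single service page.
city_pages = ["/roofing-springfield", "/roofing-riverton", "/roofing-springfield"]
for rule in build_redirect_rules(city_pages, "/roofing-services"):
    print(rule)
```

A 301 tells search engines the move is permanent, so backlinks that pointed at the old city pages count toward the page that remains.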
Don’t use simple scripts or replacement content farms to swap out single words like this… you’re asking for trouble, and it doesn’t work. If I am a roofer who covers 14 cities… I’d rather have backlinks and mentions from news sites, partner sites, and community sites pointing to my single roofing page. That will get me ranked, and there’s no limit to how many city-service combination keywords I could rank for with a single page.
If your SEO company can script a farm like this, Google can detect it. It’s deceptive and could lead to you getting penalized in the long run.
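To show how little effort that detection takes, here is a minimal Python sketch that masks the city names on two pages and compares what is left. The city names and page text are hypothetical, and this is only an illustration of the idea; I am not claiming it is how Google actually detects these pages.

```python
import re
from difflib import SequenceMatcher


def mask_cities(text, cities):
    """Replace every known city name with a placeholder token."""
    if not cities:
        return text
    # Longest names first so "New Springfield" is matched before "Springfield".
    ordered = sorted(cities, key=len, reverse=True)
    pattern = "|".join(re.escape(city) for city in ordered)
    return re.sub(pattern, "{CITY}", text, flags=re.IGNORECASE)


def similarity_after_masking(page_a, page_b, cities):
    """Similarity ratio (0.0 to 1.0) of two pages once city names are masked."""
    masked_a = mask_cities(page_a, cities)
    masked_b = mask_cities(page_b, cities)
    return SequenceMatcher(None, masked_a, masked_b).ratio()


cities = ["Springfield", "Riverton"]
page_a = "Need a roofer in Springfield? Our Springfield crew offers free estimates."
page_b = "Need a roofer in Riverton? Our Riverton crew offers free estimates."

print(similarity_after_masking(page_a, page_b, cities))  # 1.0, identical once masked
```

If a dozen lines of code can spot the pattern, a search engine certainly can.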
Of course, there are exceptions. If you wanted to create multiple location pages with unique and relevant content to personalize the experience, that’s not deceptive… that’s personalized. An example may be city tours… where the service is the same, but there’s a ton of difference in the experience geographically that can be detailed in imagery and descriptions.
But What About 100% Innocent Duplicate Content?
If your company published a press release, for example, that has made its rounds and is published across multiple sites, you still might wish to publish it on your site. We see this often. The same goes if you wrote an article for a large site and wish to republish it on your own. Here are some best practices:
- Canonical – A canonical link is a metadata element in your page that tells Google the page is a duplicate and that it should look to a different URL as the source of the information. For example, if you’re in WordPress and wish to update a canonical URL destination, you can do this with the Rank Math SEO plugin. Add the originating URL as the canonical, and Google will understand that your page is the copy and that the origin deserves the credit. It looks like this:
<link rel="canonical" href="https://martech.zone/duplicate-content-myth" />
- Redirect – Another option is to redirect the one URL to the location you wish people to read and the search engines to index. Often, we remove duplicate content from a website and redirect all the lower-ranking pages to the highest-ranking page.
- Noindex – Marking a page as noindex tells search engines to keep it out of their results. Google advises against this for duplicate content, stating:
Google does not recommend blocking crawler access to duplicate content on your website, whether with a robots.txt file or other methods.
Google, Avoid Creating Duplicate Content
If I do have two absolutely duplicate pages, though, I’d use a canonical or a redirect instead so that any backlinks to the duplicate are passed to the best page.
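If you want to confirm which of these signals a page is actually sending, a short script can read them from the HTML. This is a minimal sketch using Python's standard `html.parser`; the `HeadSignals` class and `read_signals` function are names I made up for illustration, and the sample markup is hypothetical apart from the canonical URL shown above.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect the canonical URL and the robots noindex flag from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel:
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = (attrs.get("content") or "").lower()
            directives = [part.strip() for part in content.split(",")]
            if "noindex" in directives:
                self.noindex = True


def read_signals(html):
    """Return (canonical_url, is_noindex) for a page's HTML."""
    parser = HeadSignals()
    parser.feed(html)
    return parser.canonical, parser.noindex


sample = (
    '<head>'
    '<link rel="canonical" href="https://martech.zone/duplicate-content-myth" />'
    '<meta name="robots" content="noindex, follow">'
    '</head>'
)
print(read_signals(sample))
```

A page that sets both a canonical and noindex, like this sample, is sending mixed signals. Pick one approach, and for true duplicates I'd pick the canonical.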
What If Someone Is Stealing And Republishing Your Content?
This happens every few months with my site. I find mentions through my listening software and discover that another site is republishing my content as their own. You should do a few things:
- Try to contact the site via their contact form or email and request that the content be removed immediately.
- If they don’t have contact information, do a domain Whois lookup and reach out to the contacts listed in their domain record.
- If they have privacy in their domain settings, contact their hosting provider and let them know their client is violating your copyright.
- If they still don’t comply, contact advertisers of their site and let them know that they’re stealing content.
- File a request under the Digital Millennium Copyright Act.
SEO Is About Users, Not Algorithms
If you simply keep in mind that SEO is all about the user experience and not some algorithm to beat, the solution is simple. Understanding your audience and personalizing or segmenting your content for greater engagement and relevance is a great practice. Trying to deceive algorithms is a terrible one.
Disclosure: Martech Zone is a customer and an affiliate of Rank Math.