Duplicate pages

If website pages are available at different addresses but have the same content, the Yandex robot may consider them duplicates and merge them into a group of duplicates.

Note. Duplicates are pages within the same site. For example, pages on regional subdomains with the same content aren't considered duplicates.

If your website has duplicate pages:

  • The page you need may disappear from search results if the robot selects a different page from the group of duplicates.
  • In some cases, pages with GET parameters may not be grouped and may participate in the search as separate documents that compete with each other. This may hurt the website's ranking in search results.
  • Depending on which page remains in the search, the document's address may change. This may affect, for example, the reliability of statistics in web analytics services.
  • The indexing robot takes longer to crawl the website's pages, so data about pages that are important to you reaches the search database more slowly. The robot may also create an additional load on your website.

  1. How to determine if your website has duplicate pages
  2. How to get rid of duplicate pages

How to determine if your website has duplicate pages

Duplicate pages appear for a variety of reasons:

  • Natural. For example, a page with a product description is available in several categories of an online store.
  • Related to the features of the site or its CMS.
To find out which pages have been excluded from the search because of duplication:
  1. In Yandex.Webmaster, go to the Pages in search section and select Excluded pages.
  2. Click the filter icon and select the “Deleted: Duplicate” status.

You can also download the archive. To do this, choose the file format at the bottom of the page. In the file, duplicate pages have the DUPLICATE status. Learn more about statuses.

If the duplicates were created because GET parameters were added to the URL, a notification about this will appear on the Troubleshooting page in Yandex.Webmaster.

Note. A duplicate page can be either a regular site page or a fast version of it, such as an AMP page.

How to get rid of duplicate pages

To have the right page appear in the search results, point the Yandex robot to it. This can be done in several ways, depending on the URL type.

Example for a regular site:

http://example.com/page1/ and http://example.com/page2/

In this case, set up a 301 redirect from the duplicate page to the target page, or indicate the preferred address to the robot (for example, with the rel="canonical" attribute).
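
As an illustration (assuming /page1/ is the version you want to keep in the search), the preferred address can be declared on the duplicate page with a canonical link:

```
<!-- In the <head> of http://example.com/page2/ (the duplicate),
     pointing to the page that should stay in the search -->
<link rel="canonical" href="http://example.com/page1/">
```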

Example for a site with AMP pages:

http://example.com/page/ and http://example.com/AMP/page/

In this case, add the Disallow directive to the robots.txt file to prevent indexing of the duplicate page.
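
Following the example URLs above (assuming all AMP copies live under the /AMP/ path), the robots.txt rule might look like this:

```
User-agent: Yandex
# Keep the AMP copies out of the index
Disallow: /AMP/
```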

https://example.com and https://example.com/index.php

In this case, set up a 301 redirect from the duplicate page to the main page, or indicate the preferred address with the rel="canonical" attribute.
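
As a sketch, on an Apache server such a redirect could be configured in .htaccess (assuming mod_rewrite is enabled; adapt the rule to your server):

```
RewriteEngine On
# Permanently redirect /index.php to the root URL
RewriteRule ^index\.php$ / [R=301,L]
```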

http://example.com/page/ and http://example.com/page

In this case, set up a 301 redirect from one of the duplicate pages to the other. The redirect target will then be the page included in the search results.
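
For example, to redirect URLs without a trailing slash to the slash version on an Apache server (a sketch assuming mod_rewrite; redirecting in the opposite direction works just as well, as long as you pick one form consistently):

```
RewriteEngine On
# Redirect /page to /page/ (skip real files such as /style.css)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*[^/])$ /$1/ [R=301,L]
```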

http://example.com/page/, http://example.com/page?id=1 and http://example.com/page?id=2

In this case, add the Clean-param directive to the robots.txt file so that the robot ignores the id parameter in the URL.

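
For the id parameter in the example above, a robots.txt sketch might be:

```
User-agent: Yandex
# Ignore the id parameter for URLs whose path starts with /page
Clean-param: id /page
```
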
http://example.com/page?utm_source=link&utm_medium=cpc&utm_campaign=new and http://example.com/page?utm_source=instagram&utm_medium=cpc

In this case, add the Clean-param directive to the robots.txt file so that the robot ignores the parameters in the URL.
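
For the UTM tags in the example, the directive might look like this (parameters are separated with &; when no path prefix is given, the rule is not limited to a particular section of the site):

```
User-agent: Yandex
# Strip tracking parameters from URLs before indexing
Clean-param: utm_source&utm_medium&utm_campaign
```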

The robot learns about changes the next time it visits your site. As soon as that happens, the page that shouldn't be included in the search will be excluded from it within three weeks. If the site has many pages, this may take longer.

You can check that the changes have come into effect in the Pages in search section of Yandex.Webmaster.

If you followed the recommendations above but the changes haven't appeared in the search results after three weeks, fill out the form below and include sample page URLs.

Pages with different content may be treated as duplicates if they responded to the robot with an error message (for example, if the site showed a stub page). Check how the pages respond now. If they return different content, send them for re-indexing — this way they can return to the search results faster.

To prevent pages from being excluded from the search if the site is temporarily unavailable, configure the 503 HTTP response code.
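
As an illustration, a temporary maintenance setup on an Apache server might look like this (a sketch assuming mod_rewrite and mod_headers are enabled; in a real setup you would also exclude the maintenance page itself from the rule):

```
# During maintenance, answer every request with 503 and
# ask robots to come back later
RewriteEngine On
RewriteRule ^ - [R=503,L]
Header always set Retry-After "3600"
```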