Why Google Decided to Filter Out Duplicate Pages

Google is undeniably the arbiter of the internet. Its search algorithms determine which sites rise to the top and which sink into obscurity. This means that many businesses spend large amounts of money paying an SEO company to adjust their websites in order to help them rank higher on Google for relevant keyword searches. Recently it emerged that the programming wizards at Google had worked out how to filter duplicate pages from so-called bounce pad sites so that they could provide users with more accurate and less cluttered search results. What this means for site owners and Google fans is worth examining.

It is in Google's interest to reduce the number of pages it lists if a site has a number of duplicated documents in its index, or indeed if multiple sites share copies of the same basic information. Google is able to identify which sites fall into this category, group them together in a cluster and then present only one page drawn from that cluster to searchers.
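The clustering behaviour described above can be sketched very simply. This is a minimal illustration, not Google's actual method (the real system uses far more sophisticated near-duplicate detection): it normalises page text, fingerprints it with a hash, groups URLs that share a fingerprint, and keeps one representative per group. All URLs and page texts here are hypothetical.

```python
import hashlib
from collections import defaultdict

def fingerprint(text: str) -> str:
    """Normalise case and whitespace, then hash the result."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

def cluster_pages(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose page text shares a fingerprint."""
    clusters = defaultdict(list)
    for url, text in pages.items():
        clusters[fingerprint(text)].append(url)
    return list(clusters.values())

# hypothetical pages: two are duplicates of each other
pages = {
    "https://example.com/widget-x": "Widget X. Weight: 2kg.",
    "https://mirror.example.net/wx": "Widget  X.  weight: 2kg.",
    "https://example.com/widget-y": "Widget Y. Weight: 3kg.",
}
for cluster in cluster_pages(pages):
    representative = cluster[0]  # only one page per cluster is surfaced
    print(representative, "represents", len(cluster), "page(s)")
```

A real search engine would use shingling or similarity hashing rather than exact hashes, so that near-duplicates (boilerplate changed, a date swapped) still land in the same cluster, but the keep-one-per-cluster principle is the same.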

There can be a large number of reasons why information gets duplicated, many of which are not the result of duplicitous motivations. If a site has to list a lot of manufacturer specifications, for example, it could end up being categorised as a bounce pad site and clumped together with other similar sites, obscuring it from view. Likewise, if you use a unified content-management system to simultaneously publish the same text across multiple sites, even if they have different URLs, you could end up with the same result.

A Google patent application shed some light on the issue of bounce pad sites. The search engine will generally consider these as sites used by spammers because of the high number of redirects and the significant duplication of content. If the content has been copied wholesale from a legitimate source, then Google and the other search engines have an even tougher time. They have to identify its perfectly acceptable origin while blacklisting bounce pad sites which have simply pilfered it unscrupulously. Of course, indexing the duplicate sites is a problem in itself because it takes up space and further obfuscates the issue.

One way for website managers who have a legitimate presence on the web to avoid getting blacklisted by Google or slowing down the experience for average users is to minimise the number of redirects on the site. Internal redirects can be particularly problematic, as any SEO company will tell you, because a multi-tiered redirect set-up leaves search engine crawling programs trawling through level after level, which can hamper the site's ranking as well as visitors' access to it. Google has identified six different categories of redirects that automatically send the user to the intended target address. These include manual link-based redirects embedded in text as well as frame redirects and even full-screen pop-ups.
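To see why multi-tiered redirects are a problem, it helps to trace a chain explicitly. The sketch below is illustrative only: the redirect map and paths are hypothetical, and a crawler would follow live HTTP responses rather than a dictionary, but the hop-counting logic is the same.

```python
def redirect_chain(redirects: dict[str, str], url: str, max_hops: int = 10) -> list[str]:
    """Follow a URL through a redirect map, recording every hop taken."""
    chain = [url]
    while url in redirects and len(chain) <= max_hops:
        url = redirects[url]
        if url in chain:  # guard against redirect loops
            break
        chain.append(url)
    return chain

# hypothetical internal redirect map: old paths pointing at newer ones
redirects = {
    "/old-product": "/products/old",
    "/products/old": "/products/widget-x",
}
print(redirect_chain(redirects, "/old-product"))
# → ['/old-product', '/products/old', '/products/widget-x']
```

Every extra hop in a chain like this is another request a crawler (and a visitor's browser) must make before reaching real content, which is why collapsing internal redirects down to a single hop is standard SEO advice.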

Google’s methods for identifying bounce pad sites or spam sites are still automated and based on an ever-evolving set of rules. If one legitimate source is identified as having been copied by a suspected bounce pad site, then the latter will be removed from the results. With the proper help and guidance it should be relatively simple to avoid the pitfalls associated with Google’s search algorithms and make your site as visible as possible.
