How Does Google Tell Duplicate Content from Original Content?
Everyone on the Internet is talking about how Google Panda/Penguin penalizes duplicate content and about the steps to recover from it. Many of us think duplicate content means copying or plagiarizing the work of others, but the reality is quite different. Rather than deciding for ourselves what duplicate content is, let us see how Google decides.
Google Panda is an algorithm that Google's programmers developed with the following questions in mind:
- Does the website have too many advertisements? Are they above the fold or below it?
- Is the content relevant and valuable to the readers?
- Would you trust using your credit card on this site?
- Does the site host content that is copied entirely from other websites?
An interesting fact here is that each of the above questions has more than one answer, depending on the user's perspective.
For example, take two people who each weigh 80 kg, and assume that everyone who weighs 80 kg is a good person. Is Osama Bin Laden then a good person? Of course not. Now assume that everyone who weighs 80 kg and has a beard is a bad person. Is Abraham Lincoln then a bad person?
This simple example shows that an algorithm cannot be built on one or two fixed rules. It has to consider many other variables as well.
According to SEO expert Leslie Rohde, here is how Google decides what is and isn't duplicate content: Google doesn't actually analyze the entire web page; it looks only at the snippet. That is the main reason many websites were penalized in the Google Panda update even though their owners had done nothing wrong.
What are Snippets?
A snippet is simply the title and description displayed in the SERPs (search engine results pages).
People usually enter meta information for every page with the intention of optimizing their website for search engines, but in practice Google mostly takes (crawls) the snippet from wherever it likes on your web page, choosing whatever it considers relevant. This means that if you have copied someone else's work, even as a reference, your page content will be treated as duplicate.
Say, for example, that you quoted a Wikipedia article and wrote your own views based on that quote. If Google happens to pick that quote and display it as your snippet in the SERPs, your web page will be treated as copied content and Wikipedia will get the credit for the original.
If your page meets this criterion, your website is left with a black mark. The more black marks your website accumulates, the more likely it is to be affected by a Google Panda or Penguin algorithm update.
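As a rough self-check, you can measure how much of a candidate snippet (a meta description or a passage Google might pick) also appears word for word in a source you quoted. The sketch below is only an illustration and is not Google's algorithm: the function names `word_shingles` and `containment`, the three-word window, and the sample texts are all assumptions made up for this example.

```python
import re


def word_shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(snippet, source, k=3):
    """Fraction of the snippet's shingles that also appear in the source."""
    snippet_shingles = word_shingles(snippet, k)
    if not snippet_shingles:
        return 0.0
    source_shingles = word_shingles(source, k)
    return len(snippet_shingles & source_shingles) / len(snippet_shingles)


# Hypothetical texts, made up for illustration only.
source_article = "The quick brown fox jumps over the lazy dog near the river bank."
quoted_snippet = "The quick brown fox jumps over the lazy dog"
own_snippet = "My own take on why foxes are faster than most dogs"

quoted_score = containment(quoted_snippet, source_article)
own_score = containment(own_snippet, source_article)

print(f"quoted snippet overlap: {quoted_score:.2f}")
print(f"own snippet overlap: {own_score:.2f}")
```

A score close to 1 means the snippet is almost entirely quoted material, so writing a description in your own words is the safer choice. Any threshold you apply to the score is your own judgment call, not a published Google value.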