About a year ago, there were some discussions about whether scrapers could be hurting legitimate sites' rankings. One such thread is here and blog posts are here and here. Even the Washington Post got a word in. The consensus seemed to be that scrapers would not hurt you in the rankings, although opinions differed.
I wonder if perhaps the issue should be raised again, based on the information we've received from Matt Cutts in his Big Daddy Indexing Timeline. A couple of things he mentioned were:
It's true that if you had N backlinks and some fraction of those are considered lower quality, we'd crawl your site less than if all N were fantastic.
Off-topic links wouldn't cause a penalty by themselves. Now if the off-topic links are spammy, that could cause a problem.
I bring this up because I'm watching one of my sites get more and more scraper backlinks every day. It would be impossible for me to obtain enough natural backlinks (or even "arranged" backlinks) to outweigh the scraper backlinks. Could these scraper links be hurting me? They are certainly "spammy", I would think. No doubt they are considered "lower quality". And if a bot were to measure them against the total backlinks, it would probably determine that my site has many more low-quality backlinks than high-quality ones.
In addition, nearly all of these scraper links use the same anchor text, which could in itself cause some sort of "too similar" penalty. We usually believe that our anchor text should be natural and varied. Are these scrapers, using exactly the same anchor text, causing it to seem as though I am creating an unnatural backlink pattern?
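To put a number on that pattern, here is a rough Python sketch that measures how much of a backlink profile shares a single anchor text. It is only an illustration: the function name `anchor_text_share` and the sample anchors are made up for this example, and nobody outside Google knows whether, or how, anchor-text repetition is actually scored.

```python
from collections import Counter


def anchor_text_share(anchors):
    """Return (most common anchor text, its share of all backlinks).

    Anchors are compared case-insensitively with surrounding whitespace
    removed. This is a hypothetical helper for eyeballing your own
    backlink data, not a model of any search engine's algorithm.
    """
    if not anchors:
        return None, 0.0
    normalized = [a.strip().lower() for a in anchors]
    text, count = Counter(normalized).most_common(1)[0]
    return text, count / len(normalized)


# Made-up sample: three scraper links with identical anchor text
# and one natural link with different wording.
sample_anchors = [
    "Blue Widgets",
    "blue widgets",
    "blue widgets ",
    "Acme Widget Shop",
]

top_text, share = anchor_text_share(sample_anchors)
print(f"Most common anchor: {top_text!r} ({share:.0%} of backlinks)")
```

A high share only tells you that the profile is lopsided. Whether that lopsidedness matters to the rankings is exactly the open question.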
Perhaps, a year ago, this situation wasn't causing a problem. But now? With Big Daddy? Could scraper sites be causing harm now? I hope not, but I think it's possible. And unfortunately, if true, there's nothing any of us could do about it. My hope is that Google simply disregards those scraper links, and doesn't use them positively or negatively against a site. But I'm a bit worried that the algorithm isn't quite that bright. Ah well, just something to think about, as I adjust my tinfoil hat.
I wouldn’t worry too much about it, Donna.
You can’t get punished for something you can’t control. Besides, if your site has that many scraped links, they have to be scraped from something of some use.
Hell, I find backlinks to the various sites I build from all sorts of stuff (my favourite was a porn site that was subdomain-spamming me and Matt Cutts was nice enough to clip the results for me.)
That is a good question.
I hope someone will ask Matt Cutts about this.
The clean solution is for Google to somehow recognize the site where the content originated and simply filter out the copies. But that would require a sort of race between the Google crawler and the content stealers… uh, aggregators. I don’t think the backlinks will hurt, though, because they are not reciprocated.