Hacker News
An effective way for GOOG to punish scrapers?
4 points by jonprins on Jan 6, 2011 | hide | past | favorite | 7 comments
As was suggested elsewhere, there's blacklisting if you're logged in. But that's ripe for abuse.

It would take more horsepower, but Google has plenty of that. I'm sure someone at Google has thought of this, is thinking about implementing it, or has dismissed it as impossible or ineffective, but I wanted to throw it out there to see what HN thinks.

Determine the canonical source. In this case, a Stack Overflow post. Each site that scrapes content from the Stack Overflow post increases the rank of that Stack Overflow post.
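The proposal above could be sketched roughly as follows. This is a hypothetical, heavily simplified model, not Google's actual algorithm: it assumes the earliest-indexed copy of identical content is the canonical source, and credits each duplicate's rank signal back to it. All names and the first-seen heuristic are illustrative assumptions.

```python
from datetime import datetime

def canonical_boost(copies):
    """copies: list of (url, first_indexed) pairs for identical content.

    Returns (canonical_url, boost), where the canonical URL is the
    earliest-indexed copy and the boost grows by one unit per duplicate.
    """
    canonical = min(copies, key=lambda c: c[1])[0]
    boost = len(copies) - 1  # every scraped copy adds rank to the original
    return canonical, boost

copies = [
    ("stackoverflow.com/q/123", datetime(2011, 1, 2)),
    ("scraper-a.example/123",   datetime(2011, 1, 3)),
    ("scraper-b.example/123",   datetime(2011, 1, 5)),
]
print(canonical_boost(copies))  # ('stackoverflow.com/q/123', 2)
```

The weak point, as the replies below note, is the first-seen heuristic: whoever Google indexes first wins, which is exactly what makes getting the canonical source wrong so costly.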

On one hand, it would increase the rank of original content that gets recycled and spread across the 'net: reblogged Tumblr posts, retweeted tweets.

On the other hand, it puts another tool in the hands of black hats.

Thoughts?



Determine the canonical source.

Well, that's the hard bit, isn't it? What are the consequences of getting it wrong? If Google bans my site from their index because it thinks I stole my content from a scraper, that's going to be hard to take.


Not really. Sure, if they auto-ban you it sucks, but how hard is it to post something, inform Google of it, and wait X hours for it to show up on some autoscraper site? (Of course, this requires Google's cooperation.)

Edit: it seems like a good solution from Google's POV would be to inform people of their impending ban and give them the chance to defend themselves by posting original content that then gets scraped elsewhere.


Your edit is interesting. Google seems to actively reject the idea of doing things that aren't automatic. That's why you hear horror stories of people getting blocked from AdWords (frequently with Google taking back their balance, too). Until they were legally threatened, Google didn't have great tools for YouTube copyright notices, either.

They are a company that does not like to do things that require two-way communication, because (in my view) it doesn't scale. My guess is that they are focused on revenue per employee, and adding a call center would decrease that significantly.


Surely it's not that hard for a scraper to post something original, inform Google, wait X hours, then plant it on Stack Overflow or wherever?


If the parasite kills its host...


A parasite that kills its only host didn't have enough hosts.


Not all scraping is bad: some scrapers provide extra services, such as search.

Google itself is a scraper.



