Pretty damning evidence shows that Microsoft's Bing search engine might be directly copying some of Google's more obscure search results.
Google grew suspicious when searches for misspelled words started to return the same top results on Bing that Google had been serving for the corrected spelling. One glaring difference: Google's page says "did you mean…" and suggests the correct spelling, whereas Bing – which does have its own spelling-suggestion feature – didn't. Moreover, the rest of Bing's results on that page were indeed for the misspelling.
To confirm their suspicion, Google set up a rare experiment where they hardcoded highly unlikely honeypot pages to come up #1 for very, very, very obscure searches.
Once a handful of Googlers started to perform these searches from their homes using Internet Explorer with the Suggested Sites feature enabled and the Bing toolbar installed, the same results started to appear on Bing.
THE ACCUSED
Bing doesn't deny that it uses toolbar data and other signals to improve its search engine. So does Google, which uses the data to fight spam and gauge sites' speed (note: speed is a ranking factor and a quality factor in AdWords), among other things.
One likely use for toolbar data – at any search engine – is to match the queries performed with the time spent on the destination site. That kind of data becomes more useful as the number of searches, and of results for those searches, shrinks.
Or: the rarer the search, the LOUDER this signal is.
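Neither engine publishes how it weights that signal, so here is just a toy sketch of the idea – every function name and number below is invented for illustration, not anything Google or Bing has disclosed:

```python
import math

def toolbar_signal_weight(query_frequency: int, dwell_seconds: float) -> float:
    """Toy illustration: a click/dwell signal from toolbar data,
    scaled so it matters more for rarer queries.

    query_frequency -- how often the engine has seen this query (hypothetical counter)
    dwell_seconds   -- time the user spent on the clicked result
    """
    # A longer dwell suggests the result satisfied the query; cap it so one
    # marathon visit doesn't dominate.
    satisfaction = min(dwell_seconds / 60.0, 1.0)

    # The rarer the search, the louder the signal: weight shrinks as frequency grows.
    rarity = 1.0 / math.log2(query_frequency + 2)

    return satisfaction * rarity

# A 45-second visit for a query seen once carries far more weight than the
# same visit for a query seen a million times.
print(toolbar_signal_weight(1, 45))          # ~0.47
print(toolbar_signal_weight(1_000_000, 45))  # ~0.04
```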
And that's Google's problem. The searches they deliver as evidence are extremely rare. So rare that some of them gave no results or just a couple of results. As any SEO can tell you, with that small a data set (no results or just a few), tests can show some pretty wild and exciting stuff that never ever translates into the real world of searches with thousands of results…
No matter where the searches had been performed, "never-seen search" data combined with "time on site indicates a good match" data would almost certainly trigger a clamping kind of behavior in any search engine that essentially has no such searches and no results for them…
THE REAL TEST
I'm excited for Danny Sullivan's scoop. If it were me, I would have run with it ASAP and milked it for everything it's worth. I know I should be all professional about it, but that's the kid in me, you know? So I'm not here blaming him, but… a balanced view would have had him ask Bing "out of the blue" to perform a similar test. That would have determined whether Google, too, is listening to its toolbar data and whether it, too, could be tricked into listing nonsense if the query space is tiny enough and the result set non-existent…
When you think about it, a very large percentage of the smaller search engines out there probably simply feed off of Google’s results (and index Google’s pages themselves). Bing is just following the conga line.
“small a data set… tests can show some pretty wild and exciting stuff that never ever translates into the real world”
So what did Google really prove? That Bing’s toolbar scrapes Google for content? Hmmm… Isn’t imitation the greatest form of flattery?
I’m not even sure it shows that. If, for a virtually non-existent query (sent to Bing, as they disclose), a user spends a certain amount of time on a page reached from that query (the URL of which is also sent to Bing, as they disclose), and you repeat that a few times, you could trigger a filter that starts to promote that page for that query – without scraping anything.
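Nobody outside Bing knows whether such a filter exists, but a minimal sketch of the mechanism I’m describing could look like this – every threshold, query and URL below is made up for illustration:

```python
from collections import defaultdict

# Hypothetical thresholds -- purely illustrative, nothing Bing has disclosed.
MIN_SATISFIED_CLICKS = 3   # "repeat that a few times"
MIN_DWELL_SECONDS = 30     # "spends a certain amount of time on a page"
MAX_INDEX_RESULTS = 2      # "a virtually non-existent query"

# (query, url) -> dwell times reported by the toolbar
dwell_log = defaultdict(list)

def record_toolbar_event(query: str, clicked_url: str, dwell_seconds: float) -> None:
    """Store one toolbar observation: this query led to this URL for this long."""
    dwell_log[(query, clicked_url)].append(dwell_seconds)

def should_promote(query: str, url: str, index_result_count: int) -> bool:
    """Promote the URL for the query when the index has almost nothing else and
    the toolbar keeps reporting that users are satisfied with this page."""
    satisfied = [d for d in dwell_log[(query, url)] if d >= MIN_DWELL_SECONDS]
    return index_result_count <= MAX_INDEX_RESULTS and len(satisfied) >= MIN_SATISFIED_CLICKS

# A handful of repeated honeypot-style searches is enough to flip the switch:
for _ in range(3):
    record_toolbar_event("sdqxbzkte", "http://example.com/honeypot", 45)
print(should_promote("sdqxbzkte", "http://example.com/honeypot", index_result_count=0))  # True
```

Note that nothing in that sketch ever looks at another engine’s results page; the page gets promoted purely because the toolbar keeps reporting a satisfying click for a query nobody else searches.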
It’s like repeating a unique word or phrase many times on a page, doing a search, and then – once that page “ranks” #1 out of the 3 results returned – claiming keyword density is something real.
With very small datasets you get inconclusive results unless you cross-reference like crazy. It’s the reason long tail spam works: unique queries with a small result set play by a different set of rules (filters) than larger sets do.
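To make that concrete, here’s a hedged little sketch of what “a different set of rules” might mean in practice – the threshold and record fields are invented, not anything any engine has disclosed:

```python
# Illustrative only: strict filters kick in only when there is enough to filter.
BROAD_QUERY_THRESHOLD = 1_000  # candidates needed before the strict filter path applies

def rank(candidates: list[dict]) -> list[dict]:
    """Rank candidate pages for a query, applying the strict quality filter
    only when there are enough candidates to be picky about."""
    if len(candidates) >= BROAD_QUERY_THRESHOLD:
        # Competitive query: drop anything that looks spammy or thin.
        candidates = [c for c in candidates if c["quality"] >= 0.5]
    # Tiny result set: nearly everything that matches gets shown -- which is
    # exactly the gap long tail spam slips through.
    return sorted(candidates, key=lambda c: c["score"], reverse=True)
```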