Dark Google: One Year Since Search Terms Went “Not Provided”
A year ago, Google began going dark: in some cases, it stopped sharing with publishers how people searched for and found them through Google’s search engine. The “single digit” percentage of withholding that Google predicted at the time has turned into more than 50% in some cases. If Google’s withholding were an eclipse, more than half the sun would now be covered. “Dark Google” is upon us, and it will only grow darker.
I’m drawing the term “Dark Google” as a play off the “Dark Social” concept that Alexis Madrigal wrote about recently in The Atlantic. Dark Social refers to social visits that can’t be attributed to any particular social network, such as if someone shares a link with a friend via email. That’s arguably a “social” event, a socially-related visit, but not one that can be attributed to any particular source.
In a similar fashion, Dark Google refers to visits from Google’s search engine that can no longer be tied to a particular search term. This withholding of search data began a year ago (and a day) and has continually expanded since.
Passing Along Search Terms
First some background and history, to understand this huge change. I promise to keep it brief.
When people search on a search engine, typically the search terms they used to find a web site are passed along to the publisher.
For example, if someone did a search on Google for “dvd players” and clicked on a listing from Best Buy from among those in the results, Best Buy would be able to tell that visitor found them through Google by searching for the words “dvd players.”
This is all a consequence of how our browsers have worked since before Google existed. Browsers pass along what’s called a “referrer,” which is kind of like a Caller ID system for the web (and sometimes spelled “referer” due to a historic misspelling).
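As a rough illustration of that Caller ID idea, here is a minimal Python sketch that pulls a search term out of a referrer URL. It assumes the common pattern where the search engine carries the query in a `q` parameter of its results-page URL. The function name and the sample URL are made up for this example, not taken from any analytics product.

```python
from urllib.parse import urlparse, parse_qs

def search_term_from_referrer(referrer):
    """Return the search term carried in a referrer URL, or None if there is none."""
    if not referrer:
        return None
    # parse_qs decodes "+" and %-escapes, and drops blank values by default
    params = parse_qs(urlparse(referrer).query)
    terms = params.get("q")
    return terms[0] if terms else None

# Hypothetical referrer for someone who searched for "dvd players"
example = "http://www.google.com/search?q=dvd+players"
print(search_term_from_referrer(example))  # prints: dvd players
```

A publisher's analytics software does essentially this for every visit, which is why losing the referrer, or the query inside it, matters so much.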
To understand more about how this Caller ID system works, I recommend reading the article below:
Google Begins Blocking Terms
This time last October, Google made a change to block this Caller ID system for anyone who was signed-in to Google when they searched. Why? Google said that it was designed to protect privacy. My article from last year covers this more:
Google was correct. This change did protect against potential “eavesdropping” by others on what someone was searching for. However, Google deliberately left a hole in this privacy protection. Anyone clicking on its ads still had their search terms left vulnerable to eavesdropping.
Privacy Hole For Advertisers
Google has given, in my opinion, a fairly convoluted explanation about why this gap in privacy protection was left open. It has said that an advertiser could potentially buy so many different ads that it could still see some of the search terms that Google’s privacy protection was meant to secure. Oh, and Google also says advertisers need the data to better know if their ads work.
My view is that the search side of Google wanted to better protect people from eavesdropping, especially in advance of Search Plus Your World, which potentially would expose more personally revealing search terms. But the ad side of Google demanded an exception so that advertisers wouldn’t be upset and Google’s ad retargeting business wouldn’t be harmed. The bottom line won. A privacy hole was left open for advertisers.
Suffice to say, Google and I don’t see eye-to-eye on this issue. Google finds it to be a reasonable compromise that brings greater security overall for searchers. I find it one of the most disturbing and hypocritical things the company has ever done.
As a marketer, I love search referral data, but I also understand having to lose it to better protect privacy. But if it’s about privacy, then Google shouldn’t leave a loophole that puts advertiser interests over that of its users. My article below has more about this:
The “Not Provided” Eclipse Begins
When the blackout began, Google predicted that for searches on Google.com, data would be withheld in the “single digits.” In other words, less than 10% of the search term data reported by Google.com to publishers would get withheld.
Those using Google Analytics were able to spot when this was happening because “not provided” started appearing as a search term in their reports, like this:
What was happening is that Google would report that a search happened but strip out the actual search terms. Any time Google Analytics saw one of these “blank” searches, it counted it as a “not provided” search (other analytics programs used different ways and methods for the same end result).
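To make the “blank search” idea concrete, here is a minimal Python sketch of how an analytics program might label a visit. It rests on assumptions: that a withheld search shows up as a Google referrer whose `q` parameter is present but empty, and that only `google.com` hosts need handling. The function name and labels are hypothetical and are not Google Analytics’ actual logic.

```python
from urllib.parse import urlparse, parse_qs

def classify_visit(referrer):
    """Label a visit roughly the way a keyword report might (illustrative only)."""
    if not referrer:
        return "direct"  # no referrer at all
    parts = urlparse(referrer)
    host = parts.hostname or ""
    # Assumption: only google.com hosts are treated as Google search here
    if host != "google.com" and not host.endswith(".google.com"):
        return "referral"
    # keep_blank_values=True so an empty "q=" still shows up as a key
    params = parse_qs(parts.query, keep_blank_values=True)
    if "q" not in params:
        return "referral"
    term = params["q"][0].strip()
    # A search happened, but the term itself was stripped out
    return term if term else "(not provided)"

print(classify_visit("http://www.google.com/url?sa=t&q=&esrc=s"))
print(classify_visit("http://www.google.com/search?q=3+monitor+setup"))
```

The first call models a search whose term was stripped out, so it is counted under the “not provided” label. The second still carries its term and is reported normally.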
Those searches began to add up. Within two weeks after launch, I found sample sites where up to 14% of keywords were being withheld.
The darkness kept growing, and it wasn’t just some niche SEO issue. In April of this year, Poynter, a major site about journalism, found that 29% of its search term data had gone dark and that “not provided” was its top search term.
Today, that’s certainly the case for what I see. By far. Consider the top five search terms sending traffic to my personal blog, Daggle, this month:
Not provided isn’t just my top “search term” but it’s also about 150 times the volume of the next biggest term, “3 monitor setup.” It’s crazy: 3,654 search terms blacked out, all bundled together as “not provided” with the next most popular term coming in with a count of 24.
The Darkness Grows
The numbers have risen because of a variety of factors. For one, as Google has continued to grow its Google+ social network, it has encouraged people to sign in as much as it can. In fact, a new study found that people are three times more likely to be signed-in when they search on Google than on Bing. All those signed-in searches have keywords withheld. Except for Google’s advertisers, of course.
But Google SSL Search — the method Google uses to protect signed-in searchers — can be used even by those who aren’t signed in. In July, Firefox started using Google SSL Search by default, figuring that would be a good privacy move (despite the advertiser loophole). Overnight, a huge chunk of search terms got withheld.
In September, Google was completely caught off-guard when Apple made a similar move to use Google SSL Search for those searching from within Safari in iOS 6. As it turns out, Google SSL Search for mobile users (pretty much) passes no information at all to publishers, whether they are advertisers or not.
Publishers aren’t even sent “blank” searches so that they can at least tell someone was a search-related visitor. Instead, all these Safari searchers appear to publishers as visitors who came directly to their web sites.
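One consequence is that any “percent withheld” figure computed from analytics reports will understate the real share, because those search visits never get counted as search traffic at all. The Python sketch below shows the arithmetic. The function name and every number in it are hypothetical, chosen only to illustrate the effect.

```python
def withheld_share(not_provided, with_terms, hidden_as_direct=0):
    """Share of search visits whose term is unavailable to the publisher.

    hidden_as_direct: search visits that arrived with no referrer and were
    therefore reported as "direct" rather than as search traffic.
    """
    withheld = not_provided + hidden_as_direct
    total_search = not_provided + with_terms + hidden_as_direct
    return withheld / total_search if total_search else 0.0

# Hypothetical site: 450 "not provided" visits, 550 visits with terms
reported = withheld_share(450, 550)
# Same site, assuming 200 more search visits were misreported as direct
actual = withheld_share(450, 550, hidden_as_direct=200)
print(f"reported: {reported:.0%}, actual: {actual:.0%}")
```

The publisher can only measure the first figure. The second depends on a count that the reports, by definition, do not reveal.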
How Much Withheld?
So where are we at? I haven’t seen any broad-based metrics recently (Conductor did a study in March finding 16% across 25 different sites), but stay tuned. I suspect we’ll see people sharing some of their own stats in the comments below, after this column is published. In the meantime, here are some stats from sites I have access to, for search traffic for the current month so far:
- A friend’s entertainment blog: 18% withheld
- My personal blog Daggle: 45% withheld
- Search Engine Land: 60% withheld
- Marketing Land: 62% withheld
Those figures are lower than what’s actually being withheld. That’s because many visits from people searching through Safari in iOS 6 are instead reported as “direct” visits, as I explained earlier. Consider this for my friend’s site:
Most people using iOS 6 appear as if they came directly to the site when actually a big chunk of those visitors probably came through Google searches. That’s especially true when you compare the stats to iOS 5 users, who aren’t routed by Safari through Google SSL Search. For them, direct is not the top source, and it makes up a far smaller percentage relative to “Google” traffic:
Finally, here’s how terms being withheld grew over the course of the past year, looking week-by-week at the number of “not provided” terms reported for our Search Engine Land site:
What Will Year 2 Bring?
There’s every reason to suspect that even more search term data will be lost going forward. Google’s field trials of showing Gmail or Google Drive results alongside regular search results mean even more people will be signed-in when they search, so even more search referrers will get withheld. Meanwhile, with Firefox and mobile Safari providing secure searching, Google will feel more pressure to make searching in Chrome secure by default.
Referrer data has been one of the things that has made internet marketing so accountable. Sadly, signs of it dying away have been out there since 2010, and it’s just getting weaker. It’s continuing to take body blows on the search front, and I suspect those will get worse both for search referrers and in general.
On the upside, there’s still enough data that’s not withheld to give publishers a sense of important terms people use to find their sites. Google also provides query data through Google Webmaster Central that can supplement what’s been lost. You can’t use that data to tailor your pages for visitors who arrive after performing certain searches, but at least you still have a sense of what you’re ranking for. More about gathering this data can be found below:
- Google Webmaster Tools Adds Useful Download Options
- Google Webmaster Tools Expands Query Data to 90 Days
- Google Webmaster Tools Adds Download To Google Spreadsheets
Can I Haz More Than 90 Days?
Then again, Google’s only providing that data for the past 90 days. Unless you’re constantly downloading it, you can’t see trends over time. I’ve desperately wished that Google would expand this data for a longer period of time. Google Analytics can import the data, but only the last 90 days will show. Google should allow whatever gets imported to stay accessible. It’s a fair thing to do given how much Google unilaterally took away from publishers. It’s also a secure way to do so.
And please. Please. Don’t give me excuses about how there aren’t enough machines at Google to store all this data. If anyone can have a Google Analytics account for free to store the huge amount of data a site generates on a daily basis, Google can find room to make this search term data accessible for longer than 90 days.
Publishers Have Survived; Privacy Loophole Remains
It’s also important to reflect that in the year since this has happened, this year of Dark Google, SEO hasn’t died from the blackout nor has web publishing collapsed. The remaining referrer data still getting through and alternatives like Google Webmaster Central seem to suffice. By the way, those seeking alternative advice should take a look at the articles below:
- 5 Critical B2B SEO Initiatives, In Addition To Developing A Google+ Page For Business
- Why Enterprise SEO Shouldn’t Focus Solely On Keywords
- Life in a Keyword “Not Provided” World – SMX West 2012
- Life in a [Not Provided] World
- 51 Million Visits Analyzed [Not Provided] 16% of Google Organic Traffic
- How To Turn (Not Provided) Into Useful, Actionable Data
But make no mistake. Plenty of publishers feel frustrated with or resentful toward Google over the change. They’re also the ones who really understand that when Google talks about having done this to protect privacy, Google left a loophole to protect its advertising operations, to benefit its advertising customers.
At some point, Google’s other customers, those who search using Google, may realize this. Then they might be resentful for a different reason: that Google decided privacy was worth protecting only up until the point it put ad revenue at risk.
- Google To Begin Encrypting Searches & Outbound Clicks By Default With SSL Search
- The Death Of Web Analytics? An Ode To The Threatened Referrer
- Google Puts A Price On Privacy
- Keyword “Not Provided” By Google Spikes, Now 7-14% In Cases
- Google’s Results Get More Personal With “Search Plus Your World”
- 2011: The Year Google & Bing Took Away From SEOs & Publishers
- Google’s (Not Provided) Impacting More Than Just SEO Sites
- Firefox 14 Now Encrypts Google Searches, But Search Terms Still Will “Leak” Out
- How An iOS 6 Change Makes It Seem Like Google Traffic From Safari Has Disappeared