Some have noticed that pages from Digg were no longer appearing in Google today. Both the social media news site and Google confirm this was the case, due to a mistaken malware issue. It’s now been solved.
Digg: We’re Not Malware
Digg told us that the cause seemed to be that pages from Digg’s previous life continue to link to content from across the web, some of which may become infected with malware, which in turn causes Digg itself to have issues with Google.
Digg president Andrew McLaughlin emailed to say:
Digg got de-indexed this morning, but that will get fixed shortly (we’re in active touch with Google’s indexing and malware teams).
The cause appears to be the fact that the new, relaunched Digg continues to provide link resolution for all the redirect links from the old, legacy Digg (circa 2004-2012), and occasionally one of the target webpages (i.e., a target that we don’t control) gets infected with malware. When that happens, we can get dinged as a link of redirection to malware.
We maintain the legacy Digg link resolution service essentially as a public service, even though it doesn’t relate to our current business at all. We try to stay on top of the malware reports, but they aren’t automated and this morning there was evidently some kind of surge that triggered de-indexing by Google.
Not to minimize de-indexing, but nearly all of Digg’s current traffic comes direct, not from search. (Currently, the Digg site consists of our homepage and about 40 topic pages like http://www.digg.com/tag/cars)
Google: Our Mistake
While I was writing this up, I noticed a tweet from the head of Google’s web spam team, Matt Cutts, pointing to a statement he posted on Hacker News:
We were tackling a spammer and inadvertently took action on the root page of digg.com.
Here’s the official statement from Google:
“We’re sorry about the inconvenience this morning to people trying to search for Digg. In the process of removing a spammy submitted link on Digg.com, we inadvertently applied the webspam action to the whole site. We’re correcting this, and the fix should be deployed shortly.”
From talking to the relevant engineer, I think digg.com should be fully back in our results within 15 minutes or so. After that, we’ll be looking into what protections or process improvements would make this less likely to happen in the future.
It actually wasn’t just the root page of Digg. That was dropped, but so were all the pages from Digg when I (and others) checked earlier today.
At least Google didn’t mark the entire web as malware. Yep, it did that once, back in 2009: Google Gets Fearful, Flags Entire Internet As Malware Briefly.
FYI, for owners of sites suspected of hosting malware, Google launched a new help area last week. Our Search Engine Land article explains more: Google Launches Help Center For Hacked Sites
Postscript: That was fast. Digg is now back in Google, with over 1 million pages listed.