How Facebook is using AI to penalize spammy, ad-heavy websites in its News Feed
Facebook is looking to eradicate any resemblance between people’s news feeds and those spammy “recommended articles” boxes littering the web.
On Wednesday, Facebook announced an update to its news feed algorithm that penalizes links to web pages that it considers low quality. Rolling out “over the coming months,” per Facebook, the update will curtail the reach of organic posts containing these links and prevent ads linking to these pages from being approved in the first place.
The update will apply to ads running across Facebook, Instagram and Facebook’s Audience Network, and to organic posts on Facebook. It will not yet apply to organic posts on Instagram. That may be because only approved accounts can attach links to organic posts on Instagram (verified profiles for Stories and approved retail-related brands for non-Story posts), and those vetted groups are less likely to use spammy links; otherwise, they probably wouldn’t have been approved in the first place.
Facebook is specifically targeting pages that don’t feature much original content but carry a lot of ads, particularly the annoying, offensive types like pop-ups and toenail-fungus ads.
Here’s how Facebook is going about identifying these low-quality pages, according to one of Facebook’s product managers for news feed, Greg Marra. Facebook reviewed “hundreds of thousands” of web pages and picked out the ones that had “this trait of having little substantial content and a lot of these disruptive, shocking, malicious kinds of ads,” he said. Then Facebook used machine learning to train its artificial intelligence systems on these lists so that the programs could learn the patterns that characterize a low-quality page. Now that Facebook’s system has learned what a low-quality page looks like, it can evaluate links to pages that it hasn’t previously seen, including by using image-recognition technology to scan the content of ads on a page (much as Facebook scans posts on its social network to prevent revenge porn).
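To make the train-then-evaluate idea concrete, here is a minimal sketch of a classifier learning from reviewer-labeled pages and then scoring pages it has not seen. Facebook has not disclosed its model or its signals, so everything here is an assumption for illustration: the three features (amount of original text, number of ads, presence of a pop-up), the tiny data set and the choice of logistic regression are all hypothetical.

```python
import math

# Hypothetical features per page (NOT Facebook's actual signals):
# (original words / 1000, ad count / 10, 1.0 if a pop-up blocks the content).
# Label 1 means a reviewer marked the page low quality, 0 means they did not.
TRAINING_PAGES = [
    ((0.05, 1.2, 1.0), 1),
    ((0.10, 0.9, 1.0), 1),
    ((0.08, 1.5, 0.0), 1),
    ((1.20, 0.2, 0.0), 0),
    ((0.90, 0.3, 0.0), 0),
    ((1.50, 0.1, 0.0), 0),
]


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Estimated probability that a page is low quality."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(examples, epochs=2000, learning_rate=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(TRAINING_PAGES)

# Pages the model never saw during training.
unseen_thin_page = (0.07, 1.3, 1.0)   # little text, many ads, pop-up
unseen_rich_page = (1.10, 0.2, 0.0)   # lots of text, few ads, no pop-up

print("thin page:", round(predict(weights, bias, unseen_thin_page), 3))
print("rich page:", round(predict(weights, bias, unseen_rich_page), 3))
```

The output is a probability rather than a yes-or-no verdict, which fits Marra’s description of quality as a spectrum. A production system would presumably use far more signals, including the image-recognition step mentioned above, which this sketch does not attempt.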
Since Facebook used machine learning to teach its algorithm how to recognize low-quality pages, it is “impossible to describe every single” signal that the algorithm considers when evaluating a page, said Marra. It’s sort of like how Supreme Court justices recognize pornography. But Marra outlined a few things that Facebook’s computers will seek out and that site owners will want to cut back on.
“Some of the high-level things that we’re looking at are, does the page have a significant amount of original content on the page or is it just really a super skimpy amount of content that’s the least you could put to even have something to link to. We’re also looking at things like, when you go to visit a page, is there a pop-up full of ads that gets in the way of the content that you’re trying to get to. And then we’re looking at the quality of the ads themselves. Are the ads sort of shocking ads that show, like, toenail fungus stuff? Are they really sexualized ads that might be surprising in that context? Are they the kind of the high-quality ads that people don’t mind when they see online,” said Marra.
Translation: how similar is a page to the kind featured in those “From the Web” lists at the bottom of most publishers’ articles? Or how closely does it resemble this mock-up provided by Facebook?
Facebook’s move is aimed at shady sites that check most, if not all, of the aforementioned boxes, but it could also impact otherwise legitimate sites that, for example, don’t police the bottom-of-the-barrel programmatic ads on their pages or feature pop-up ads. Because Facebook’s algorithm weighs many factors, those sites may not be hit as hard, but only time will tell.
“It’s not a black-and-white thing. It’s all shades of gray along the spectrum,” said Marra. As it initially did when training its algorithm on low-quality pages, Facebook will be relying on feedback it receives when surveying users about the links they encounter in their news feed to continually update its criteria for identifying low-quality pages.
On the bright side, sites that don’t run afoul of Facebook’s quality-assessing algorithm might see their traffic from Facebook increase as a result of the decrease in traffic to low-quality sites, according to Facebook.
Other than a decline in referral traffic from Facebook, there’s no real way for site owners to find out if their pages fall below Facebook’s quality bar. Facebook does not plan to roll out a tool for site owners to check whether pages would be considered low-quality, said Marra, explaining that spammy sites could use such a tool to find ways to skirt Facebook’s algorithm. And Facebook will not warn people when they include a link to a low-quality page in a Facebook post, in part because “most of these types of web pages are not shared directly,” Marra said.
Facebook will penalize entire domains when enough of a site’s individual pages have been ranked as low-quality. “If we see that all of the links on your site exhibit these attributes, the next time we see one it’s like your sixth speeding ticket or something like that,” said Marra. “And if there’s a site where we’ve never seen any of this before and we see something for the first time, the AI system is kind of like, ‘Well, I’ve never seen these guys be bad before,’ so it’s less likely that this is problematic. It’s a combination of both the domain level and individual URLs.”
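Marra’s “speeding ticket” analogy can be sketched as a page-level score blended with a domain’s track record. This is only an illustration of the idea: the function name, the equal weighting and the smoothing are assumptions, not anything Facebook has described.

```python
def combined_score(url_score, domain_flagged, domain_seen, url_weight=0.5):
    """Blend a page-level score with the domain's history.

    url_score:      hypothetical 0-to-1 low-quality estimate for this one page.
    domain_flagged: pages from this domain previously judged low quality.
    domain_seen:    pages from this domain evaluated so far.
    url_weight:     assumed share of the result that comes from the page itself.
    """
    # Smoothed share of flagged pages, so that a domain with very little
    # history is not judged on one or two observations.
    domain_rate = (domain_flagged + 1) / (domain_seen + 2)
    return url_weight * url_score + (1 - url_weight) * domain_rate


# The same borderline page (0.7) on two very different domains.
repeat_offender = combined_score(0.7, domain_flagged=6, domain_seen=6)
clean_record = combined_score(0.7, domain_flagged=0, domain_seen=50)

print(round(repeat_offender, 3))
print(round(clean_record, 3))
```

Under these assumed weights, the identical page scores much higher on the domain with six prior strikes than on the domain with a long clean history, which is the behavior Marra describes.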
In tending to its walled garden, Facebook’s move mirrors an unconfirmed change Google appears to have made to its search algorithm earlier this year that also penalized links to spammy pages, as well as a change Google announced last year to crack down on mobile pages with pop-up ads.
Both Facebook and Google have become dominant business drivers for anyone looking to convert site traffic into money, from legitimate publishers to Macedonian teenagers. And both are vulnerable to gaming.
On Facebook, that has meant posts and ads peddling outlandish or outright false information designed to pique people’s curiosity enough to get them to click through to a page packed with ads and little else. Even if a person immediately clicks the back button to return to Facebook, the site can secure revenue from the ads whose advertisers are charged by the impression.
Facebook has been waging war against these types of low-quality links for years. And as fake news has become a higher-profile problem following last year’s US presidential election and a more prevalent get-rich-quick scheme for anyone with an internet connection, Facebook has upped its attack.
Facebook has lowered the ranking of links that receive high bounce rates; generate bigger gaps between the number of clicks and the number of likes, shares and comments; and feature clickbait-y headlines. And it has also penalized Facebook Pages that plead for people to like and share their posts in order to game Facebook’s engagement-minded algorithm. The company has also started flagging posts linking to fake-news articles when outside fact-checkers confirm them as false.
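Two of those earlier signals, bounce rate and the gap between clicks and engagement, are simple ratios. The sketch below shows one plausible way to compute them; the function names and exact formulas are assumptions, since Facebook has not published how it measures either.

```python
def bounce_rate(visits, quick_returns):
    """Share of visits where the person came straight back (0.0 to 1.0)."""
    if visits == 0:
        return 0.0
    return quick_returns / visits


def engagement_gap(clicks, likes, shares, comments):
    """Share of clicks not matched by a like, share or comment (0.0 to 1.0)."""
    if clicks == 0:
        return 0.0
    engagements = likes + shares + comments
    # Clamp at zero: a post can draw more engagement than clicks.
    return max(0.0, 1.0 - engagements / clicks)


# A clickbait-style link: many clicks, quick exits, almost no engagement.
print(bounce_rate(visits=1000, quick_returns=900))
print(engagement_gap(clicks=1000, likes=5, shares=3, comments=2))
```

A link that people click in large numbers but rarely like, share or comment on would score near 1.0 on the second measure, the pattern the earlier ranking changes were meant to catch.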