Last week Adweek reported that a significant amount of the non-human traffic burning a hole in advertisers’ budgets comes from so-called “good bots” that visit pages not to rack up fraudulent ad impressions, but for less nefarious purposes like indexing pages for search engines. Their creators? The Googles and Microsofts of the world.
But Jason Shaw, director of data science at Integral Ad Science, says that’s not really a problem, because the vast majority of harmless bots are clearly identified and screened out before an ad is ever served.
“What it comes down to is the fact that anybody operating at a significant scale — somebody like Google or Bing, who’s deploying these [bots] for their own services — they’re doing it in a very above-board way,” Shaw said. “Anyone who’s doing it without following best practices is either operating at a much smaller scale than Google or Bing, or they’re part of the malicious side of things, where they’re trying to rack up ad views.”
There are a lot of legitimate reasons for using bots (or “agents,” if they’re the good kind) to surf the web, like scanning and indexing pages for search engines, or collecting info from a public weather or stock price site. Security firm Incapsula estimates that good bots make up as much as 27% of all web traffic.
But Shaw says the number of those bots that are creating chargeable ad impressions is quite low. That’s because publishers and tech vendors know all about these bots (since nobody’s trying to hide the fact that they’re there) and the online ad ecosystem has developed reliable ways to screen them out.
The main way that’s done is with what’s called the International Spiders and Bots List, a list maintained by the IAB of the most common aboveboard bots, like Google’s search crawlers and the Internet Archive. Every publisher, exchange and ad server has access to the list, so they can detect incoming friendly bots and decline to serve them ads. ComScore, Nielsen and other measurement providers also know to exclude them from their unique user counts.
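In practice, that screening can be as simple as matching the incoming user-agent string against the list before the ad call goes out. The sketch below is a simplified illustration, not the IAB’s actual distribution format (the real list is a licensed data file with more elaborate include and exclude patterns); it assumes a hypothetical local file of lowercase user-agent substrings and simply declines to serve, or count, an impression for any declared bot.

```python
# Minimal sketch: screen declared "good bots" out of the ad call by matching
# the user-agent against a list of known crawler patterns. The file format
# here (one lowercase substring per line, e.g. "googlebot", "bingbot",
# "archive.org_bot") is an assumption for illustration only.

def load_bot_patterns(path="spiders_and_bots.txt"):
    with open(path) as f:
        return [line.strip().lower() for line in f if line.strip()]

def is_declared_bot(user_agent, patterns):
    ua = user_agent.lower()
    return any(p in ua for p in patterns)

def handle_ad_request(user_agent, patterns):
    # Friendly bot: serve no creative and log no billable impression.
    if is_declared_bot(user_agent, patterns):
        return None
    return "<ad creative markup>"  # placeholder for the normal ad call
```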
While some companies don’t declare their bots on the IAB’s list (after all, pretty much anyone can write a bot to read the forecast off Weather.com and add it to the app they’re developing), Shaw says few of these generate enough traffic to even register on publishers’ radar. Malicious botnets, on the other hand, which can generate tens of millions of impressions a day, are a much, much bigger problem.
“It’s not the goal of Google crawlers to generate volume at a significant scale,” Shaw said. “A crawler like that needs to crawl a page only once or twice. Whereas a bot whose goal is to generate ad revenue, or at least traffic the website is paying for, they’ll hit the site over and over and over again.”
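That behavioral difference lends itself to a simple frequency check. The sketch below is a minimal, hypothetical illustration of the heuristic Shaw describes: it counts how often a given client hits a given page within a time window and flags anything that looks like impression pumping. The window and threshold values are assumptions for the example, not IAS’s actual rules.

```python
# Frequency heuristic: a search crawler fetches a page once or twice, while
# an ad-fraud bot hits it over and over. Threshold values are illustrative.

import time
from collections import defaultdict, deque

WINDOW_SECONDS = 3600   # consider the last hour of traffic
MAX_HITS = 5            # a legitimate crawler rarely needs more fetches than this

recent_hits = defaultdict(deque)  # (client_id, page_url) -> recent timestamps

def record_hit(client_id, page_url, now=None):
    """Record a page hit and return True if the client looks like it's
    pumping impressions rather than crawling."""
    now = time.time() if now is None else now
    q = recent_hits[(client_id, page_url)]
    q.append(now)
    while q and now - q[0] > WINDOW_SECONDS:  # drop hits outside the window
        q.popleft()
    return len(q) > MAX_HITS
```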
And even if a few of them do get through, bot-catching firms are pretty good at stopping them anyway. Al Torres, vice president of business development at U.K.-based security and verification firm Telemetry, said that from an advertiser’s perspective, it doesn’t really matter whether the bot is intentionally wasting their budget or not. If the publisher or advertiser employs a security firm, it’s that firm’s job to catch them all.
“Unless it is a major fraud vehicle that we would investigate in-depth, we don’t differentiate between ‘bad’ bots and ‘good but unlisted’ bots,” he said. “We filter bot traffic beyond the IAB list by looking at delivery data for a myriad of key identifiers; hence our clients do not pay for a majority of those false impressions.”
“If you’re running a crawler without identifying yourself, that in itself is already suspicious, so it’s not a stretch to report it as such,” Shaw added. “But like I said, I think in the end you’d find that [malicious] bot traffic dwarfs it anyway.”