Ad Fraud Engineering - Software Engineering Daily Podcast

In this episode, Praneet and Shailin return to the show to discuss how advertising fraud is getting worse, not better. Praneet and Shailin worked with BuzzFeed reporter Craig Silverman, who previously appeared on the show to discuss his findings on mobile advertising fraud, which accounts for hundreds of millions of dollars in theft every year.

BuzzFeed: These Hugely Popular Android Apps Have Been Committing Ad Fraud Behind Users’ Backs

BuzzFeed’s Craig Silverman authored a report on a series of mobile apps that are used to enable ad fraud.

Eight apps with a total of more than 2 billion downloads in the Google Play store have been exploiting user permissions as part of an ad fraud scheme that could have stolen millions of dollars, according to research from Kochava, an app analytics and attribution company that detected the scheme and shared its findings with BuzzFeed News.

Seven of the apps Kochava found were engaging in this behavior are owned by Cheetah Mobile, a Chinese company listed on the New York Stock Exchange that last year was accused of fraudulent business practices by a short-seller investment firm — a charge Cheetah vigorously denied. The other app is owned by Kika Tech, a Chinese company now headquartered in Silicon Valley that received a significant investment from Cheetah in 2016. The companies claim more than 700 million active users per month for their mobile apps.

Kochava and Method’s Praneet Sharma analyzed the apps and found that the Kika Keyboard app executed click flooding and injection using the company’s own proprietary software and with functions built directly into the app itself.

eMarketer: Is App-Install Fraud on the Rise?

The topic of this article from eMarketer’s Ross Benes is mobile attribution (app-install) fraud.

eMarketer estimates that $7.1 billion will be spent on mobile app install ads in 2018, up from $6.5 billion last year. App-install fraud refers to the practice where a company falsely gets credit for getting a user to download an app.

Fraudsters are trying to make a buck off the ad dollars that flow to mobile. Over the past 12 months, mobile attribution firm AppsFlyer analyzed 17 billion app installs across 7,000 apps worldwide. It found that the amount of install fraud roughly tripled. More than one-quarter of the installs that AppsFlyer analyzed in that timeframe were fraudulent.

Method Media Intelligence’s Praneet Sharma commented, stating that "there are two reasons that persistent install fraud is on the rise: utilization of low-fidelity identifiers and spoofing of attribution tracking."

BuzzFeed: The Partisan Meme Wars Have Come For LinkedIn

Facebook and Twitter’s crackdown on hate speech, false news, and manipulation has caused some people to move their political content sharing to LinkedIn. The result is an increase in MAGA and #Resistance memes and intense, sometimes vitriolic, political discussions. This spike in political content has also led to the familiar problems of fake accounts, false claims and memes, and comment threads that devolve into name-calling and sometimes threats.

BuzzFeed News began examining political content on LinkedIn after Shailin Dhar, the CEO of Method Media Intelligence, said he’d noticed an increase in accounts sharing hyperpartisan content on LinkedIn. He began making a list of the accounts because “aggressive and partisan political rhetoric is generally uncommon on a professional networking site.”

“I began noticing accounts with strange pictures and a spike in aggressive political conversation and posts,” Dhar told BuzzFeed News by email. “It was just a few accounts at first but as I continued to follow them, I saw that they were getting more engagement.”

He shared his list of profiles with BuzzFeed News, which expanded it to roughly 100 by reading comment threads and searching for political content. 

Data Center Traffic Demonstration

We have released a demonstration of how web traffic can rapidly be generated using AWS EC2. In just 5 minutes and a few mouse clicks, we generated over 1000 concurrent visitors from 290 cities around the world.

In this case, the traffic was sent to a site maintained by Method Media Intelligence with no advertisements. But every day, data center traffic is directed around the web to consume advertising, causing financial harm to advertisers.

Method Media Intelligence offers Proactive Auditing as a solution to this scenario. Proactive Auditing is a real-time auditing service which prevents ads from being rendered (and paid for) if a site is visited by a data center. Contact info@methodmi to learn more.
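
The idea of withholding an ad from a data center visitor can be sketched as a simple IP range check. This is a minimal illustration, not Method Media Intelligence's implementation: the function names and the CIDR blocks are assumptions (the blocks are reserved documentation ranges), and a real system would load published cloud-provider ranges and use additional signals.

```python
import ipaddress

# Hypothetical data center CIDR blocks (reserved documentation ranges).
# A real deployment would load published cloud-provider ranges instead.
DATA_CENTER_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_data_center_ip(ip: str) -> bool:
    """Return True if the address falls inside a known data center block."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DATA_CENTER_RANGES)


def should_render_ad(ip: str) -> bool:
    """Render (and bill for) the ad only when the visitor is not in a data center."""
    return not is_data_center_ip(ip)
```

With these assumed ranges, should_render_ad("198.51.100.7") is False and should_render_ad("192.0.2.10") is True. The check runs before the ad is rendered, so a blocked impression is never billed.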

Why Ad-Verification Solutions Should Not Sample Impressions

Introduction - How Sampling is Currently Used in Ad Verification

Sampling is used to estimate a characteristic of a larger population. Sampling is an appropriate method when direct measurement of the entire population is overly burdensome or impossible to do within the required time. Sampling is not appropriate in cases where every item in a data set can be measured quickly and cheaply.

Ad-verification vendors sample impressions for two reasons:

  1. The verification process takes longer than the auction cycle for selling an ad impression. In this case, analyzing 100% of impressions would significantly hinder or prevent ad delivery by causing timeouts* and non-renders.

  2. Verification methods rely on computationally expensive behavioral checks. To reduce costs, vendors measure a subset of supply to provide acceptable pricing to clients.

*In all cases of timeouts, advertisers lose opportunities to reach consumers. An advertiser that is not on ad-server billing also pays for each non-render event.

Types of Users

Internet users can be divided into three categories: human users (48%), “good” bots (23%), and “bad” bots (29%). Publishers create value and generate revenue by selling their online real estate to advertisers, who pay for access to human users’ eyes. Advertisers lose when those ads are displayed to bots, both good and bad.

“Good” bots have numerous legitimate uses. For example, good bots include search engine crawlers that are critical for keeping the internet running smoothly. Good bots primarily operate from data centers, as opposed to human-operable machines.

If websites blocked search engine bots, their content would disappear from search results and the website would lose many human users overnight. If search engine crawlers stopped visiting websites, the search engine would lose its value. If your website shows up in a Google search, it is only because the Googlebot has visited your site. Therefore, it is necessary for these bots to visit websites and the digital media industry must adapt to find a way to prevent ads from being served to them.
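
One simple way to keep ads away from search crawlers without blocking them is to look at the User-Agent header they declare. The sketch below is only an illustration under stated assumptions: the function name and the token list are hypothetical choices, and because a User-Agent is self-declared and easy to spoof, this catches only bots that identify themselves honestly.

```python
# Substrings that some well-known search crawlers include in their User-Agent.
# The list is illustrative, not exhaustive.
CRAWLER_TOKENS = ("googlebot", "bingbot", "duckduckbot")


def is_declared_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent announces itself as a known search crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def serve_page(user_agent: str) -> dict:
    """Always serve the content; attach ads only for visitors that are not declared crawlers."""
    return {"content": True, "ads": not is_declared_crawler(user_agent)}
```

The crawler still receives the page content, so search indexing is unaffected, but no ad is served to it.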

“Bad” bots include content scrapers, headless browsers, botnets, and other unwanted visitors to a site. Like good bots, they consume advertising and server resources, but they offer no benefit in return. Bad bot traffic can originate from data centers or human-operable devices. Effective ad-tech partners prevent their clients’ media spend from being consumed by “bad” bots.

In summary, advertisers, agencies and publishers must accept that both good and bad bots will be visiting their websites. Those serving the interests of advertisers must focus on how to measure and prevent the delivery of ads to both good and bad bots. Each ad viewed by a bot is a waste of advertiser funds and publisher resources. Advertisers must not be billed for impressions viewed by good or bad bots.

The Problem: How Advertisers Lose When Using Sampled Data

Imagine the following scenario:

  1. An advertiser’s agency enlists a verification vendor to monitor its digital media spend. The vendor monitors a $1M campaign and samples 10% of impressions. Within that sample, the vendor measures 25% bot (fraudulent) traffic.

  2. The advertiser extrapolates the data and determines that 25% of their $1M campaign was spent on waste, and asks for a refund of $250k from its DSP.

  3. The DSP refuses to refund the advertiser until it speaks with the SSPs and exchanges in its supply chain. The SSPs state that the verification vendor’s sample cannot be guaranteed to be representative of the entire campaign.

  4. The DSP also tells the advertiser that impressions do not all cost the same. 25% of impressions being fraudulent does not mean 25% of spend was wasted. The fraudulent 25% could be low-CPM impressions, and the remaining 75% high-CPM impressions. Therefore, a 25% refund could be excessive.

  5. The advertiser does not have the data to refute these two points, and accepts the waste as “the cost of doing business”.
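
The DSP's pricing argument in step 4 can be made concrete with a small worked example. Everything in this sketch is hypothetical: the impression counts and CPMs are invented so that the campaign totals $1M with 25% bot impressions, and the Segment class is only an illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    impressions: int
    cpm: float  # cost per thousand impressions, in dollars
    is_bot: bool

    @property
    def spend(self) -> float:
        return self.impressions / 1000 * self.cpm


# Invented numbers: bots sit on cheap inventory, humans on expensive inventory.
segments = [
    Segment(impressions=50_000_000, cpm=2.0, is_bot=True),
    Segment(impressions=150_000_000, cpm=6.0, is_bot=False),
]

total_impressions = sum(s.impressions for s in segments)
total_spend = sum(s.spend for s in segments)
bot_impressions = sum(s.impressions for s in segments if s.is_bot)
bot_spend = sum(s.spend for s in segments if s.is_bot)

# Share of impressions that were bots: 0.25
bot_impression_share = bot_impressions / total_impressions

# Refund claimed by extrapolating the impression share to spend: $250,000
naive_refund = bot_impression_share * total_spend

# Spend actually consumed by bot impressions: $100,000
actual_waste = bot_spend

print(f"naive refund: ${naive_refund:,.0f}, actual waste: ${actual_waste:,.0f}")
```

In this invented case the extrapolated claim overstates the waste by $150k. With the CPMs reversed it would understate it, and without per-impression cost data the advertiser cannot tell which is true.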

The Solution

Method believes the following must be true to protect the interests of advertisers and avoid the above scenario:

  1. Verification vendors can provide actionable analytics only when they verify every impression.

  2. Advertisers must have rapid access to full receipts of campaign spend (data for every impression).

As shown by the example above, only complete monitoring can be used to recover funds spent on waste. Analytics on every impression (cost, domain, IP, human/bot) are required to calculate the exact amount of wasted advertiser spend. But before advertisers can be refunded for previous waste and prevent waste in future campaigns, they must first obtain this data.
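
As a rough illustration of what such per-impression receipts make possible, the sketch below sums the exact cost of bot impressions by domain. The record layout, field names, and values are assumptions made for this example, not a real log format. Costs are stored as integer micro-dollars so that the sums are exact.

```python
from collections import defaultdict

# Hypothetical per-impression receipts (cost in micro-dollars).
receipts = [
    {"cost_micros": 6000, "domain": "news.example", "ip": "192.0.2.10", "is_bot": False},
    {"cost_micros": 2000, "domain": "games.example", "ip": "198.51.100.7", "is_bot": True},
    {"cost_micros": 2500, "domain": "games.example", "ip": "198.51.100.8", "is_bot": True},
    {"cost_micros": 5000, "domain": "news.example", "ip": "203.0.113.4", "is_bot": True},
]


def wasted_micros_by_domain(records):
    """Sum the cost of every bot impression, grouped by domain."""
    waste = defaultdict(int)
    for record in records:
        if record["is_bot"]:
            waste[record["domain"]] += record["cost_micros"]
    return dict(waste)


waste = wasted_micros_by_domain(receipts)
total_waste_micros = sum(waste.values())
total_spend_micros = sum(r["cost_micros"] for r in receipts)
```

Because every impression carries its own cost, the wasted amount is a sum rather than an extrapolation, which answers both objections raised by the DSP and SSPs in the scenario above.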