Building More Realistic Media Effects Models

CUNY Data 621 - Spring 2022

Author

Jeff Parks

Adstock, How Does it Work?

In this post, rather than discuss a specific modeling technique or statistical concept, I'd like to talk about an idea specific to the media and advertising industry that I think is important, but for which I have not yet seen a reliable, consistent approach to measurement.

This concept is called Adstock. It's not a great name, true. However, the concept is fairly simple. Advertisers devote considerable time, creative energy, and resources to making their messages resonate with media consumers. If we're doing our job correctly, the consumer should remember that message long after they've been exposed to it.

Further, if the consumer has been exposed to that message multiple times during a given period, there should be some cumulative, lasting effect on their mind and behavior that does not simply disappear the moment we stop serving the message.

Adstock, then, is a theoretical measure of the cumulative effect of all this advertising at any given moment. We would expect it to diminish gradually once exposure stops, which means we should be able to model it with a mathematical formula.
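One common way to write this down, and the form the code later in this post follows, is exponential decay with a half-life. As a sketch, with $x_s$ as the media delivered in period $s$, $h$ as the half-life in days, and $d$ as the number of days per period (symbols chosen here purely for illustration):

```latex
A_t = \sum_{s=1}^{t} x_s \left(\tfrac{1}{2}\right)^{(t-s)\,d/h}
```

Every past placement still counts toward the Adstock at time $t$, but its weight halves every $h$ days.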

Where is this so-called Adstock?

Now that I’ve explained the concept, I think you’d agree it’s pretty important to have some kind of understanding of how much Adstock is out there right now for your ad campaigns, and how much time is left before consumers forget your message entirely.

And yet in my experience at several agencies, little to no attention is paid to this important concept at the media measurement and reporting level.

You Can Measure Anything

I suspect the main challenge to addressing the Adstock question is the difficulty of direct measurement. How do we efficiently measure the amount of advertising that people remember?

Brand Awareness survey data is probably a close proxy for Adstock effects, but (apart from being expensive) those are usually the response variables in our equation - the active question for advertisers is usually "How much do we need to spend to earn a marginal point in Brand Awareness (and stay ahead of our competitors' scores)?"

If we're using straight media impressions as a predictor in that particular model, with no adjustment for Adstock effects, we're assuming our consumers are goldfish swimming around in a bowl, forgetting our message the moment their attention is drawn away. No offense, goldfish, but that doesn't sound reasonable.

There is a wonderful book on business analytics and estimation, How to Measure Anything, by Douglas W. Hubbard. The title on the first page of the first section reads, "The Measurement Solution Exists," and it encapsulates the message that Hubbard drives home over and over again: no matter how tricky the problem, some kind of estimation is always possible, and it always improves your level of understanding and accuracy.

Let’s Write Some Code

For a recent project with a major social media brand, I was challenged to examine some weekly time-series data of Brand Awareness survey scores and to try and figure out the relationship of those scores to the amount of media impressions that were being served in the U.S. market at those times, among other variables.

This was a great opportunity to try and test my Adstock theories – would an adjustment to the raw media impressions help produce a better model? After a survey of the Internet to try and find examples of prior work or a package to help estimate media effects (and coming up short), I decided to write something simple, plug it in, and see what happened.

Please excuse some of the code... the numbers have been truncated & altered from their original values, but they're representative.


# get data: vector of numeric values representing media served in consecutive 
# periods... could be impressions, or views, or spend, etc.
media <- read.csv('input/adstock_data.csv', header=FALSE)$V1

There are three manual variables we can adjust. The most significant is halflife_days, an estimate of the days required for any media to lose 50% of its “effect” on consumers in the aggregate (set to 3 days here strictly as a guess).


# manually set these three vars
halflife_days <- 3 # estimated days for media to lose 50% effect
period_days <- 1 # days per period (1 == daily data, 7 == weekly data)
periods_addl <- 7 # additional periods to calculate after final media placement

# more vars
periods <- length(media)
periods_count <- periods + periods_addl
df <- data.frame(matrix(ncol=periods_count, nrow=0))
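To get a feel for what a given half-life implies, it can help to convert it into the share of effect retained from one period to the next. A minimal sketch; the names retention_per_period and periods_to_10pct are introduced here for illustration and are not part of the original script:

```r
halflife_days <- 3
period_days <- 1

# share of effect that survives one period: 0.5^(period_days / halflife_days)
retention_per_period <- 0.5^(period_days / halflife_days)
retention_per_period   # roughly 0.79 for a 3-day half-life on daily data

# periods until a single placement falls below 10% of its original effect
periods_to_10pct <- log(0.10) / log(retention_per_period)
periods_to_10pct       # just under 10 periods
```

In other words, under these guessed settings a placement keeps about four-fifths of its effect from one day to the next.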

And for the rate of decay of media effectiveness, we apply the exponential formula:


# for each element, create a vector and append values from exponential decay formula.
for(m in seq_along(media)){
  decay_vec<- c(media[m])
  # seq_len() yields an empty loop when no periods remain (1:0 would count down)
  for(p in seq_len(periods_count - m)){
    decay_vec <- append(decay_vec, round(exp(log(0.5) * 
                        p / halflife_days * period_days) * media[m]))
  }
  # pad each consecutive vector to the right by m-1 elements, append to df
  padded_vec <- append(integer(m-1), decay_vec)
  df <- rbind(df, padded_vec)
}
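The same calculation can also be written without the nested loops. The sketch below is an alternative, not the code used for the results in this post. It relies on the fact that exponential decay is equivalent to the recursion adstock[t] = media[t] + r * adstock[t-1], which base R's stats::filter() computes with method = "recursive". The function name adstock_recursive and the toy input are made up for illustration, and because this version skips the per-term rounding, its output can differ slightly from the loop version.

```r
# Alternative sketch: Adstock via a recursive filter (no per-term rounding)
adstock_recursive <- function(media, halflife_days, period_days = 1, periods_addl = 0) {
  # per-period retention implied by the half-life
  r <- 0.5^(period_days / halflife_days)
  # extend the series with zero-media periods to trace the decay tail
  padded <- c(media, numeric(periods_addl))
  # y[t] = x[t] + r * y[t-1]
  as.numeric(stats::filter(padded, filter = r, method = "recursive"))
}

# toy example: half-life of one period, so each carry-over halves
toy_media <- c(100, 0, 0, 50)
toy_adstock <- adstock_recursive(toy_media, halflife_days = 1, periods_addl = 2)
toy_adstock
```

The recursive form only needs the previous period's Adstock, so it avoids building the full period-by-placement table.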


# fix df column labels
colnames(df) <- c(1:periods_count)

# sum up
adstock <- colSums(df)

# pad the media vector and merge to final df
media <- append(media, integer(length(adstock)-length(media)))
adstock_table <- data.frame(media, adstock)

media	adstock
1527360	1527360
45408710	46620976
56480103	93483197
26802735	101000397
20425402	100589470
15618	79853534
158	63379950
75272107	125576807
81897782	181568160
67594427	211705171
71082748	239113253
66384687	256169001
62311329	265632801
61308437	272141330
60095799	276094517
32157	219168520
0	173954168
0	138067517
0	109584259
0	86977087
0	69033758
0	54792132
0	43488543
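One quick sanity check on the table: once media drops to zero, each period's Adstock should equal the previous period's value multiplied by 0.5^(1/3), or about 0.794. The sketch below copies the last seven adstock values from the table and compares the period-over-period ratios, allowing a small tolerance for the rounding applied in the loop. The variable names are illustrative.

```r
# last seven adstock values from the table (periods with zero media)
tail_adstock <- c(173954168, 138067517, 109584259, 86977087,
                  69033758, 54792132, 43488543)

# ratio of each period to the one before it
observed_ratio <- tail_adstock[-1] / head(tail_adstock, -1)

# retention implied by a 3-day half-life on daily periods
expected_ratio <- 0.5^(1 / 3)

# TRUE when every ratio matches the expected retention within tolerance
tail_matches <- all(abs(observed_ratio - expected_ratio) < 1e-4)
tail_matches
```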


# quick viz (requires the ggplot2 package)
library(ggplot2)
period_labels <- as.numeric(row.names(adstock_table))

ggplot(adstock_table) +
  geom_area(aes(x=period_labels, y=adstock), fill='green', alpha=.5, stat='identity') +
  geom_bar(aes(x=period_labels, y=media), fill='blue', stat='identity') +
  ggtitle('Media and Adstock Effect') + xlab('period') + ylab('effect') +
  scale_x_continuous(labels = period_labels, breaks=period_labels) +
  scale_y_continuous(labels = scales::label_number(suffix = " M", scale = 1e-6)) # 1e-6 scales to millions

In the plot above, the blue bars represent the actual media impressions served on a given day, and the green “wave” represents the cumulative Adstock effect.

Moving Forward

As we can see from the graph, the Adstock effect can be quite substantial, even with a moderate “half-life” of three days! It seems to me that there’s potentially some value in building Adstock-adjusted features for any number of standard models and reporting techniques.

The concept seems relevant to any important questions with a time-series aspect, such as:

Is there a “saturation point” for our consumers where additional media has no marginal benefit? Can we calibrate ad frequency accordingly?
Are there different Adstock curves we should maintain based on product type or industry? (short or long customer journeys, price points, durable vs. consumer goods, etc.)
Are we modeling the relationships between Ad Spend and Brand Lift / Awareness properly over time?
Can we model the Adstock effects of successful, “sticky” campaigns vs. less successful campaigns?
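On the question of different Adstock curves, one low-cost starting point is to compare how a single burst of media decays under a few candidate half-lives. A minimal sketch; the half-lives of 1, 3, and 7 days and the 100-unit burst are arbitrary illustrations, not estimates for any product category:

```r
halflives <- c(1, 3, 7)   # candidate half-lives in days (illustrative)
days_since <- 0:14        # days after a single burst of media
burst <- 100              # arbitrary units of media

# one column per half-life, one row per day since the burst
decay_curves <- sapply(halflives, function(h) burst * 0.5^(days_since / h))
colnames(decay_curves) <- paste0("halflife_", halflives)
rownames(decay_curves) <- days_since

round(decay_curves, 1)
```

Each column reaches 50 on the day that matches its half-life, which makes the curves easy to compare side by side.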