2013 The Humane League/Farm Sanctuary Video vs. Ad Study
Please note that this analysis is archived: it was published in 2013 and is not up to our current standards.
This is Animal Charity Evaluators’ statistical analysis of the publicly available data from an ad comparison study performed by The Humane League and Farm Sanctuary. For more context, see ACE’s narrative analysis of this and a similar study; this document is intended to be read as a supplement to that analysis.
The code that follows is the R code used in our analysis. If you would like a copy of the data used to replicate or extend our analysis, please contact us.
```r
# Helper functions: total clicks (GAClicks) and literature orders (MFGC)
# recorded for a given video or a given ad
vidclicks <- function(vidname) {
  sum(Videos$GAClicks[which(Videos$Video == vidname)])
}
adclicks <- function(adname) {
  sum(Videos$GAClicks[which(Videos$Ad == adname)])
}
vidorders <- function(vidname) {
  sum(Videos$MFGC[which(Videos$Video == vidname)])
}
adorders <- function(adname) {
  sum(Videos$MFGC[which(Videos$Ad == adname)])
}
```
Videos
To get a sense of the variation in effectiveness between videos, we consider the overall order rate for each video (orders per click), or alternatively its reciprocal, the average number of clicks required to produce one literature order.
vidorders("FTF")/vidclicks("FTF") # or vidclicks('FTF')/vidorders('FTF') ## [1] 0.01529 vidorders("WCB")/vidclicks("WCB") # or vidclicks('WCB')/vidorders('WCB') ## [1] 0.02705 vidorders("TBL")/vidclicks("TBL") # or vidclicks('TBL')/vidorders('TBL') ## [1] 0.02373 vidorders("MYM")/vidclicks("MYM") # or vidclicks('MYM')/vidorders('MYM') ## [1] 0.02023
We test whether the differences are significant using a chi-square test. For a pair of videos, we make a table of clicks that did not result in orders and of orders, then test how likely such a table would be if the proportion of clicks resulting in orders were the same for each video.
|                 | Farm to Fridge | What Came Before |
| --------------- | -------------- | ---------------- |
| clicks − orders | 14042          | 61569            |
| orders          | 218            | 1712             |
Since we’ll be running 10 significance tests in this analysis (six pairwise comparisons of videos and four tests on various groupings of ads), we’ll keep in mind the conservative Bonferroni correction for multiple significance tests: whatever p value we would take as indicating significance for a single test, we should require one of our multiple tests to have a p value 1/10 that size or lower before concluding that the result is significant. For instance, if in running a single comparison we would look for p < .01, we should now look for p < .001.
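To make the correction concrete, the adjusted threshold is just the single-test threshold divided by the number of planned tests (a quick illustration, not part of the original code):

```r
alpha <- 0.01    # p value we would accept for a single test
n.tests <- 10    # significance tests planned in this analysis
alpha / n.tests  # Bonferroni-corrected threshold for each individual test
## [1] 0.001
```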
```r
click.mat <- cbind(c(vidclicks("FTF") - vidorders("FTF"),
                     vidclicks("WCB") - vidorders("WCB")),
                   c(vidorders("FTF"), vidorders("WCB")))
chisq.test(click.mat)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  click.mat
## X-squared = 65.9, df = 1, p-value = 4.741e-16
```
Given the very low p value, the difference between these two videos is extremely unlikely to be due to chance alone. Comparisons are a bit more complicated between What Came Before and Ten Billion Lives or Meet Your Meat, since we should only use data from ads that had versions for both videos involved.
```r
# Clicks and orders for a video, restricted to the two ads ("phoebe" and
# "serj") that had versions for all of the videos being compared
someclicks <- function(vidname) {
  sum(Videos$GAClicks[which(Videos$Video == vidname &
                            (Videos$Ad == "phoebe" | Videos$Ad == "serj"))])
}
someorders <- function(vidname) {
  sum(Videos$MFGC[which(Videos$Video == vidname &
                        (Videos$Ad == "phoebe" | Videos$Ad == "serj"))])
}

click.mat <- cbind(c(someclicks("WCB") - someorders("WCB"),
                     vidclicks("TBL") - vidorders("TBL")),
                   c(someorders("WCB"), vidorders("TBL")))
chisq.test(click.mat)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  click.mat
## X-squared = 9.277, df = 1, p-value = 0.002321

click.mat <- cbind(c(someclicks("WCB") - someorders("WCB"),
                     vidclicks("MYM") - vidorders("MYM")),
                   c(someorders("WCB"), vidorders("MYM")))
chisq.test(click.mat)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  click.mat
## X-squared = 6.752, df = 1, p-value = 0.009366
```
With the Bonferroni correction in mind, these results are not significant at the overall level of p < .01.
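To make this explicit (a supplementary check rather than part of the original analysis), R’s built-in `p.adjust` can scale each observed p value by the number of planned tests; both adjusted values land above .01:

```r
# Bonferroni adjustment multiplies each p value by the number of tests (capped
# at 1); the results can then be compared against the single-test threshold of .01
p.adjust(c(0.002321, 0.009366), method = "bonferroni", n = 10)
## [1] 0.02321 0.09366
```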
The other three possible pairwise tests give less significant results:
```r
click.mat <- cbind(c(someclicks("FTF") - someorders("FTF"),
                     vidclicks("TBL") - vidorders("TBL")),
                   c(someorders("FTF"), vidorders("TBL")))
chisq.test(click.mat)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  click.mat
## X-squared = 3.565, df = 1, p-value = 0.059

click.mat <- cbind(c(someclicks("FTF") - someorders("FTF"),
                     vidclicks("MYM") - vidorders("MYM")),
                   c(someorders("FTF"), vidorders("MYM")))
chisq.test(click.mat)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  click.mat
## X-squared = 1.074, df = 1, p-value = 0.3001

click.mat <- cbind(c(vidclicks("TBL") - vidorders("TBL"),
                     vidclicks("MYM") - vidorders("MYM")),
                   c(vidorders("TBL"), vidorders("MYM")))
chisq.test(click.mat)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  click.mat
## X-squared = 0.2223, df = 1, p-value = 0.6373
```
Ads
Since we do not expect the same ads to be used repeatedly over time and we do not have detailed information about the characteristics of the ads, pairwise comparisons of their effectiveness would be more than we need. Instead, we are interested in the overall amount of variation in effectiveness between ads, and in variation in effectiveness due to characteristics of the ads that are known, in this case whether they were shown on Facebook or on blogs.
```r
# One (clicks - orders, orders) row per ad, across all fifteen ads
ads <- unique(Videos$Ad)
clicks2 <- sapply(ads, adclicks)
orders2 <- sapply(ads, adorders)
test.mat <- cbind(clicks2 - orders2, orders2)
chisq.test(test.mat)
## Warning: Chi-squared approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  test.mat
## X-squared = 87, df = 14, p-value = 1.394e-12
```
We got a warning that the chi-square test may not be reliable in this case, probably because the expected values for too many cells are too small. We’ll try running the test on the two subgroups of ads (Facebook ads and BlogAds ads) in hopes that one may have large enough expected values for the test to be reliable.
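Although the original analysis does not show it, the expected counts behind the warning can be inspected directly from the test object; a common rule of thumb is that the chi-square approximation becomes suspect when many cells have expected counts below 5:

```r
# Expected cell counts under the null hypothesis of equal order rates;
# small values here (conventionally, below 5) are what trigger the warning
suppressWarnings(chisq.test(test.mat))$expected
```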
```r
# The six ads run on Facebook (assuming a Source column marks each ad's platform)
fbads <- unique(Videos$Ad[which(Videos$Source == "Facebook")])
clicksfb <- sapply(fbads, adclicks)
ordersfb <- sapply(fbads, adorders)
fb.mat <- cbind(clicksfb - ordersfb, ordersfb)
chisq.test(fb.mat)
## 
##  Pearson's Chi-squared test
## 
## data:  fb.mat
## X-squared = 35.44, df = 5, p-value = 1.228e-06
```
The Facebook ads were each run often enough to make the chi-square test appropriate for this data, and the variation between ads run on Facebook appears to be significant. We don’t know from this whether there are ads of unusually high effectiveness, unusually low effectiveness, or both, but we do know that effectiveness varies in some way.
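One way to see where that variation lies, though it is not part of the original analysis, is to inspect the standardized residuals of the test; ads whose residuals in the orders column are far from zero are performing unusually well or unusually poorly:

```r
# Standardized residuals: values well above zero in the orders column flag ads
# with unusually high order rates, values well below zero unusually low ones
chisq.test(fb.mat)$stdres
```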
```r
# The nine ads run through BlogAds (again assuming a Source column)
blads <- unique(Videos$Ad[which(Videos$Source == "BlogAds")])
clicksbl <- sapply(blads, adclicks)
ordersbl <- sapply(blads, adorders)
bl.mat <- cbind(clicksbl - ordersbl, ordersbl)
chisq.test(bl.mat)
## Warning: Chi-squared approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  bl.mat
## X-squared = 6.032, df = 8, p-value = 0.6437
```
The warning about the reliability of the chi-square test returns when we look at the group of ads run through BlogAds, since this group contains the ads that were clicked fewer times, the same ads that made the larger group comparison unreliable in the first place.
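When expected counts are too small for the asymptotic test, one standard alternative, not used in the original analysis, is chisq.test’s Monte Carlo option, which simulates the p value rather than relying on the chi-square distribution:

```r
# Estimate the p value from 10,000 simulated tables with the same margins,
# avoiding the unreliable asymptotic approximation for this sparse data
chisq.test(bl.mat, simulate.p.value = TRUE, B = 10000)
```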
Now we will test for variation between the two groups of ads we just considered, to see whether the choice of system for ad placement affects the average effectiveness of the ad, and thus what we should be willing to pay per click.
```r
# Pool clicks and orders within each platform and compare the two totals
fbclicks <- sum(clicksfb)
blclicks <- sum(clicksbl)
fborders <- sum(ordersfb)
blorders <- sum(ordersbl)
clicks4 <- c(fbclicks, blclicks)
orders4 <- c(fborders, blorders)
test4.mat <- cbind(clicks4 - orders4, orders4)
chisq.test(test4.mat)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  test4.mat
## X-squared = 45.44, df = 1, p-value = 1.57e-11
```
There are differences in effectiveness between the ads shown on Facebook and the ads shown through BlogAds. Since the differences do not seem to be attributable to chance and the study did not report any systematic difference in the type of ads shown on the two platforms, it appears that the platform influences the effectiveness of the ad.
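As a closing illustration (an extension of the analysis rather than part of it): if a literature order is worth the same fixed amount on either platform, then the price worth paying per click scales directly with each platform’s order rate, which we can compute from the totals above:

```r
# Order rate per click on each platform, from the pooled totals computed above
fbrate <- fborders / fbclicks
blrate <- blorders / blclicks
# Ratio of the two rates: how many times more a click is worth on the
# higher-rate platform, given a fixed value per literature order
fbrate / blrate
```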