Please note that this analysis is archived, as it was published in 2013 and is not up to our current standards.
When a viewer is directed to an online vegetarian or vegan outreach video by an ad, the probability that their behavior will be affected by the video depends on many factors. Two of the factors directly controlled by the organization placing the ad are the ad itself and which video the ad leads to. We know of two studies addressing the relative effectiveness of different videos; one of them also addresses the relative effectiveness of the ads.
- About the Studies
- Which Videos Are Most Effective?
- Which Ads Are Most Effective?
- Are These Effects Separable?
- Conclusions and Further Questions
About the Studies
The first study was conducted in 2012. Faunalytics, under contract with VegFund, showed segments from four videos to an online audience of about 500 people ranging in age from 15 to 23 years. The videos all advocated a vegan or vegetarian diet, but were specifically chosen because they emphasized different reasons for diet change. Each respondent was offered an incentive to watch approximately four minutes of one of the four videos and then immediately fill out a survey. Data was also collected on engagement with the video, including the proportion of the audience that was watching at each point and the proportion of the audience that re-watched each section of the video.
The second study was conducted in early 2013 by The Humane League (THL) and Farm Sanctuary (FS). Each group ran online ads targeting a mostly young and female audience. Identical copies of each ad were run, leading to either two or four videos describing cruel treatment of farmed animals. Some ads were run through Facebook, while a larger number of ads, each receiving fewer clicks, were run through BlogAds, a system that displays the ads on individual blogs. Using Google Analytics, the organizations recorded how many people reached each video via each online ad, how much of the video they watched, and what percentage of those people ordered a vegetarian starter kit. A total of 83,000 views were recorded during the study.
Although both studies focused on the effectiveness of various videos in inspiring behavior change among young people, there were significant differences in their goals and designs, so their findings are best viewed as complementary rather than directly comparable. The Faunalytics/VegFund study attempted to recruit equally from young men and young women, and recruitment was done through an outside agency. The THL/FS study, however, targeted young women specifically (and exclusively, in the case of many of the ads) through the same means that those organizations typically use for their online video outreach. Furthermore, only one video, Farm to Fridge, was used in both studies. Only the THL/FS study presented the videos in their entirety, while only the Faunalytics/VegFund study offered specific incentives for watching a certain amount of the video. Finally, while both studies considered immediate responses to the videos, the forms of those responses were quite different.
Which Videos Are Most Effective?
Since both studies were primarily designed to compare the effectiveness of different videos for use in outreach, it is not surprising that this is the area in which they have produced the most useful and reliable results. By showing multiple videos to very similar audiences, each study has provided reasonably good evidence about the comparative effectiveness of the videos, at least for the audiences targeted. It should be noted that neither study sought to establish the effectiveness of showing any video as compared to showing no video and that the study designs do not support any claims about the overall efficacy of online video outreach. This means that we can’t use these studies to determine whether online video outreach is in general a cost-effective intervention.
The Faunalytics/VegFund study showed clips of Farm to Fridge, Maxine’s Dash for Freedom, A Life Connected, and Geico Couple. The THL/FS study showed Farm to Fridge and What Came Before on pages connected to each ad in the study, and Ten Billion Lives and Meet Your Meat on pages connected to two ads run by THL.
Farm to Fridge, What Came Before, Ten Billion Lives, and Meet Your Meat all contain graphic depictions of animal suffering on factory farms. Perhaps one of the most salient differences in approach is that Farm to Fridge and Meet Your Meat open directly with these images, while What Came Before and Ten Billion Lives have gentler openings that seek to establish sympathy with farmed animals before the graphic depictions begin. Another difference is that Ten Billion Lives is about 4 minutes long while the others range in length from 9 to 13 minutes. This is particularly relevant in the context of the ads studied by THL and FS; it may matter less in a pay-per-view situation where the viewer has an incentive to finish the entire video, or where the sponsoring organization might show only clips from longer videos, as in the Faunalytics/VegFund study.
Maxine’s Dash for Freedom tells the story of a cow who escaped while being transported to slaughter. It shows a few images of cows in industrial agriculture situations, but mainly encourages veganism by leading viewers to empathize with the escaped cow.
The clip from A Life Connected emphasizes the environmental benefits of a vegan or vegetarian diet by describing the resources used and pollution created by intensive animal agriculture.
Finally, the Geico Couple video describes the weight loss and health benefits that the titular couple experienced by following a workplace program supporting conversion to a vegan diet.
The Faunalytics/VegFund study compared videos along two main axes: engagement, or how much of the video the average viewer watched, and behavior change, or how many viewers indicated that they were considering reducing or eliminating their consumption of animal products. They found some statistically significant differences (p < .05) between videos on each of these axes, but most of their comparisons between videos did not lead to statistically significant results. Interestingly, the engagement and behavior change results were almost exactly opposite each other. A Life Connected had the highest rate of engagement, followed by Geico Couple, Maxine’s Dash for Freedom, and finally Farm to Fridge. However, Farm to Fridge had the highest percentage of viewers considering eliminating animal products from their diet, followed by Maxine’s Dash for Freedom, A Life Connected, and Geico Couple.
The study authors interpret these results with appropriate caution, noting that while the trends shown should not be ignored, many results might differ with a different sample of viewers, and especially with a different population. In fact, some of the results vary if one considers similar but distinct questions within this study: A Life Connected had more people considering reducing their consumption of animal products than Maxine’s Dash for Freedom, even though it had fewer saying they would eliminate them altogether. The similar results for very different videos lead the authors to say, “The video content itself may be a less important factor than simply having the complete attention of a captive audience.” This is an interesting suggestion, but one that would, like all the results of these two studies, carry more weight if the participants had been surveyed again some time later: even if the immediate effects of the videos are similar, the long-term effects could differ.
The videos shown in the THL/FS study are far more similar to each other than the videos shown in the Faunalytics/VegFund study, so we might expect again to find little difference between their effects. Indeed, the study’s primary proxy for effectiveness was orders for literature on becoming a vegetarian, and order rates by video ranged only from 1.5% to 2.7%, a difference which in the Faunalytics/VegFund study would likely not have been significant. However, this study was at a much larger scale, so many comparisons did result in statistically significant differences, and some were significant at very high levels.
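To see why scale matters here, consider a standard two-proportion z-test applied to order rates of roughly 1.5% and 2.7%, the extremes reported above. The click counts below are purely illustrative, not figures from either study; this is a minimal sketch of the general point, not a reproduction of the study's analysis:

```python
from math import sqrt, erfc

def two_proportion_z(orders_a, clicks_a, orders_b, clicks_b):
    """Two-sided two-proportion z-test using the pooled standard error."""
    p_a, p_b = orders_a / clicks_a, orders_b / clicks_b
    pooled = (orders_a + orders_b) / (clicks_a + clicks_b)
    se = sqrt(pooled * (1 - pooled) * (1 / clicks_a + 1 / clicks_b))
    z = (p_a - p_b) / se
    return z, erfc(abs(z) / sqrt(2))  # z statistic, two-sided p-value

# 1.6% vs. 2.8% order rates: not significant at ~500 clicks per video...
print(two_proportion_z(8, 500, 14, 500)[1])        # ~0.20
# ...but overwhelmingly significant at ~10,000 clicks per video.
print(two_proportion_z(160, 10_000, 280, 10_000)[1])
```

The same absolute gap in order rates that would be invisible at the scale of the Faunalytics/VegFund sample becomes unambiguous at the scale of the THL/FS ad campaign.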
In particular, when What Came Before was compared with each of the other videos, using the total views from the ads relevant to each comparison, every pairwise comparison was statistically significant (nominal p < .01; since 10 significance tests were performed overall, this corresponds to a global p < .1 under the conservative Bonferroni correction). In every case, What Came Before performed better than the other video. No other pairwise comparison produced significant results, but this is likely not because What Came Before is dramatically better than three similarly effective videos. Rather, due to quirks of the ad systems used, the ads leading to What Came Before were shown far more often than the ads leading to other videos, making it much easier to achieve statistical significance in tests involving What Came Before than in tests between any two other videos. It is likely (and almost certain in the comparison between What Came Before and Farm to Fridge) that What Came Before is more effective than the other videos by the standards used in this study. However, the differences between the other videos may well be as great as those between them and What Came Before. In fact, if the estimates produced by this study are exactly right, the gap in effectiveness between Meet Your Meat and Farm to Fridge is larger than that between What Came Before and the next-best video, Ten Billion Lives.
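The Bonferroni correction used above is simply a multiplication: the family-wise (global) p-value is bounded by the nominal p-value times the number of tests performed. As a quick illustrative check with the figures quoted above:

```python
def bonferroni_global_p(nominal_p, num_tests):
    """Conservative Bonferroni bound on the family-wise p-value:
    the global p is at most the nominal p times the number of tests."""
    return min(1.0, nominal_p * num_tests)

# 10 pairwise tests, each significant at nominal p < .01,
# gives a global bound of p < .1.
print(bonferroni_global_p(0.01, 10))  # 0.1
```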
Which Ads Are Most Effective?
Since the THL/FS study tracked effectiveness for each video and ad pairing, their data also provides information on the relative effectiveness of the ads used in the study.
For several reasons, this data is not as widely applicable or as easy to interpret as the video data. First, many more ads were used, and the number of clicks per ad varied widely: the least-clicked ad received only 123 clicks (compared to 939 clicks for the least-shown video). Second, we have only the titles of the ads, not the ads themselves, so for the most part we cannot compare ads based on their characteristics. Finally, and most importantly, while future online ads can continue leading to the best currently available video until another study shows that a new video is better, the ads themselves may saturate their markets more quickly and have to be replaced more often. Many of the ads refer to specific musicians or actors and are targeted to people who “like” those artists on Facebook, potentially a small enough segment of the overall group of young women that a large proportion of them will see an ad within a relatively short time. Even if they don’t, the artist will eventually decline in popularity, or the young age group targeted by the ads will no longer overlap the age group most interested in the artist, and the ad will need to be replaced.
We can, however, draw two conclusions from the data. First, there is great overall variability in ad success, both in the number of clicks and in the rate at which people who click on an ad go on to request vegetarian literature. For organizations paying by the click (rather than by the impression, the industry term for each time an ad is shown), the first type of variance is less important, and it is harder to interpret from this data, since we do not know to what extent it originates in differences between the ads and to what extent it is due to the ads having been shown different numbers of times. One possible estimate of the variance in click rates comes from considering only the BlogAds ads, since that system appears to have less unexplained variation than the Facebook ads system, with pairs of identical ads receiving very similar numbers of clicks. For these ads, the maximum number of clicks was 2814, the minimum was 123, and the median was 848. If these differences are in fact not due to the ads having received very different numbers of impressions, organizations paying by the impression should be especially careful to choose appealing ads, or to monitor the analytics for each ad closely at the beginning of a campaign. Other organizations may also find this useful, since ads that receive few clicks may generate worse public opinion than ads that receive relatively many: ads that receive few clicks may appear irrelevant to viewers, while ads that receive many clicks may seem relevant even to viewers who do not click them, leading them to think on their own about why their favorite celebrity doesn’t eat meat or how a pig might feel.
Analytics are also important for tracking the differences between ads when it comes to the proportion of clicks leading to literature orders. There appears to be slightly less variation here, but the variation that does exist is still significant: the best ad garnered one literature order for every 31 clicks, while the worst took around 296 clicks to get a single order. (One ad never led to an order, but since this was the ad that only got 123 clicks in total, we don’t actually know that it was the worst ad. And since ads that tended to do badly in their rates of orders also tended to be clicked less often, we have less certainty about how many clicks it would take these ads to get an order, on average, than we do for ads that performed relatively well.)
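The caveat about the zero-order ad can be made quantitative with the "rule of three": after observing zero events in n independent trials, an approximate 95% upper bound on the true event rate is 3/n. This is our own illustrative calculation, not part of either study's analysis:

```python
def rule_of_three_upper(n):
    """Approximate 95% upper bound on an event rate after seeing
    zero events in n independent trials (the 'rule of three')."""
    return 3.0 / n

def exact_upper(n, alpha=0.05):
    """Exact one-sided upper bound: the rate p with (1 - p)**n = alpha."""
    return 1.0 - alpha ** (1.0 / n)

n_clicks = 123  # the ad that received 123 clicks and no orders
upper = rule_of_three_upper(n_clicks)
print(round(1 / upper))  # 41 clicks per order, at best
```

With only 123 clicks and no orders, the data remain consistent with a true rate as high as roughly one order per 41 clicks, close to the overall average of the ads in the study, which is why this ad cannot confidently be ranked worst.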
If the study conditions are representative of campaigns run without checking analytics for each ad, an organization that checks analytics will tend to do better than one that doesn’t, though not as much better as one might think. Over the whole study, there was one literature order for every 40.3 clicks, compared to one for every 30.8 clicks on the very best ad. An organization that ran a broad range of ads would therefore have to show about 1.3 times as many ads to get each literature order as an organization that showed only its best ad, if the ads in this study are representative. This difference is certainly worth a few minutes looking at Google Analytics, so any group running online ads should track this kind of data and act on it to eliminate ads that are performing poorly. The value of larger-scale investigations into the optimal types of ad to produce may be tempered by the limited lifespan of each ad and the difficulty of identifying general characteristics that lead ads to perform well. It may be possible to achieve the benefits of such an investigation at lower cost simply by field-testing ads continuously.
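The 1.3× figure follows directly from the two rates quoted above; as a check:

```python
# Figures quoted above from the THL/FS data.
overall_clicks_per_order = 40.3  # all ads pooled
best_ad_clicks_per_order = 30.8  # single best-performing ad

# If paying per click, ad spend per literature order scales with
# clicks per order, so the full ad mix costs this factor more:
ratio = overall_clicks_per_order / best_ad_clicks_per_order
print(round(ratio, 2))  # 1.31
```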
Finally, the overall location of the ad also matters. The ads run on Facebook, as a group, performed significantly better than the ads run on BlogAds, with average clicks per literature order of 38.1 and 66.5, respectively. Of course, for cost-effectiveness purposes it’s still possible that the two groups were comparable, or even that the BlogAds performed better, since we don’t know the costs per click on the two platforms; but the difference is enough that organizations should make careful choices of platform and context for their online ad programs.
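The cost-effectiveness point can be made concrete with a back-of-the-envelope break-even calculation, using only the rates quoted above (this is our own sketch, not a figure from the study):

```python
facebook_clicks_per_order = 38.1  # from the study, as quoted above
blogads_clicks_per_order = 66.5

# For equal cost per literature order, BlogAds' price per click must
# be lower than Facebook's by the ratio of the two click requirements.
breakeven_price_ratio = facebook_clicks_per_order / blogads_clicks_per_order
print(round(breakeven_price_ratio, 2))  # 0.57
```

That is, BlogAds clicks would have to cost about 57% as much as Facebook clicks for the two platforms to deliver the same cost per literature order.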
Are These Effects Separable?
Above, we analyzed the effectiveness of the videos and the ads separately, but the THL/FS study was not so rigorously controlled that the two effects can be cleanly separated. Because some videos were not shown with all ads, and the others were not shown with all ads at the same frequency, some of the effects attributed above to videos might actually be due to ads, and vice versa. Furthermore, the ads are targeted to and clicked by slightly different audiences, and some of those audiences may be more or less likely to be moved by a specific video than the overall demographic targeted by all ads in combination.
Above, we dealt with these concerns only by restricting comparisons involving the two videos not shown with all ads to those ads that had a version for every video; we ignored any potential interaction effects and any other effects of uneven sampling.
One way to make sense of the possible interaction effects and judge the potential severity of problems arising from ads leading to videos with variable frequency is to look at a table of literature order rates by video and ad. The table below shows the average number of ad clicks required to generate one literature order for every video/ad pair tested. When a particular video/ad pair was shown but did not generate any literature orders, it is marked “high”, although because some pairs were not clicked often, some of the “high” rates may not in reality be higher than other rates in the table.
What Came Before did better than Farm to Fridge when paired with all but three ads, one of which resulted in no literature orders for any video, so it seems unlikely that the difference in effectiveness between these two videos can be attributed primarily to What Came Before being shown more often with more effective ads. Furthermore, although Farm to Fridge may be more effective when paired with certain ads, its effectiveness with those ads is comparable to the effectiveness of What Came Before with roughly the better half of the ads tested, so testing ads in combination with What Came Before looks like an appropriate way to find new pairings that work well, even if there are interaction effects.
However, things are less clear when we consider comparisons including Ten Billion Lives and Meet Your Meat. The two ads used in combination with all four videos were the ads most successful in combination with What Came Before—but they weren’t the most successful in combination with Farm to Fridge, and they may not have been particularly good ads to show with the other videos either. Because we don’t have more ads tested in combination with these two videos, it’s hard to rule out the possibility that one or both of these ads just wasn’t a good fit for one or both of the videos. Another session of testing using different ads might show different results for Ten Billion Lives or Meet Your Meat as compared to What Came Before, which is a practical concern given that the ads regularly change as discussed above.
Conclusions and Further Questions
Taken together, these studies suggest that there are definite but subtle differences in effectiveness between videos and ads that encourage viewers to adopt a vegan or vegetarian diet. Because there may not be a single video that performs best according to all measures, and because the differences between some videos on some measures may not be apparent with a small sample size of viewers (even if significant in the long run), there are significant challenges to determining the best video to use for a given intervention and audience. Tentatively, videos that show graphic footage of animals in factory farms may do better with young adult audiences than videos that omit such footage completely, but the greater success of What Came Before as compared to Farm to Fridge suggests that graphic footage is most effective when accompanied by other images and messages. The single conclusion with greatest certainty, in fact, was that What Came Before works better in the context of the THL/FS ads than Farm to Fridge does. However, no other comparison of a pair of videos resulted in an indisputable finding that one is better than the other, and in particular the relative effectiveness of Ten Billion Lives and Meet Your Meat as compared to other videos is not particularly clear, even though they were part of the larger study.
These studies represent a solid beginning to understanding the comparative effectiveness of various videos, but they leave open some crucial questions. Most importantly, are the differences found in the intentions and immediate behaviors of viewers aligned with long-term behavior change? Which measures of short-term impact best predict long-term change? Until we can answer these questions, we won’t know whether acting on the information produced by studies like these actually improves the long-term performance of campaigns. On a less ambitious plane, it would be valuable to identify with greater certainty which videos perform best in pay-per-view situations. It would also be of interest to improve the measures available for evaluating videos viewed online, including measuring how often viewers share them and measuring viewing times for all visitors to the page, rather than only for those who do not immediately leave the page (non-bouncers).
ACE. Data analysis of The Humane League/Farm Sanctuary video/ad comparison study.
Humane League Labs. (2013, July 19). Report: Which factory farming video is more effective?
Faunalytics. (2012). Video comparison study: Youth response to four vegetarian/vegan outreach videos.
A Life Connected
Farm To Fridge
Maxine’s Dash for Freedom
Meet Your Meat
Ten Billion Lives
What Came Before