Results and tentative conclusions
We think the recently published Mercy For Animals Facebook ads study is the highest quality randomized controlled trial (RCT) so far of an animal advocacy intervention. We think the research team did a great job in planning and implementing the study, although there are many unavoidable difficulties in connecting the results of studies like this to real-world decisions about which interventions to prioritize and fund.
They measured the difference between the self-reported consumption of animal products — the primary outcome measured in the study — for participants who clicked on an MFA Facebook Ad and viewed the veg outreach video (treatment group) and participants who clicked on the ad and viewed a video on eliminating neglected tropical diseases (control group). The outcome was measured in self-reported number of servings consumed in the two days prior to the survey.
MFA also released the raw data of the results. We are in the process of conducting our own analysis. Tentatively, we make the following statistical conclusions:
- On average, participants in the treatment group reported consuming 3% more servings of animal products than those in the control group.
- The 90% confidence interval for the effect size is a decrease of consumption by 0.3% to an increase of 6.6% in the treatment group.1
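As a rough consistency check on the two figures above, the interval can be converted into an implied standard error and two-sided p-value. This is only a sketch: it assumes the interval is symmetric on the percentage scale and uses a normal approximation, and the function name is hypothetical. It is not the analysis that we or MFA actually ran on the raw data.

```python
from statistics import NormalDist


def implied_p_from_ci(lower, upper, level=0.90):
    """Back out the midpoint, standard error, and two-sided p-value
    implied by a symmetric confidence interval (normal approximation)."""
    std_normal = NormalDist()
    # Critical value for the interval, e.g. about 1.645 for a 90% interval.
    z_crit = std_normal.inv_cdf(0.5 + level / 2)
    midpoint = (lower + upper) / 2
    std_err = (upper - lower) / 2 / z_crit
    z_stat = midpoint / std_err
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z_stat)))
    return midpoint, std_err, p_two_sided


# Interval from the text: a 0.3% decrease to a 6.6% increase.
midpoint, std_err, p = implied_p_from_ci(-0.3, 6.6, level=0.90)
print(f"midpoint = {midpoint:.2f}%, standard error = {std_err:.2f}%, implied p = {p:.2f}")
```

Under these assumptions the implied p-value comes out a little above 0.1, so the interval containing zero is consistent with a result that falls short of conventional significance.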
We have a relatively strong feeling, based on our general understanding of psychology and advertising (such as this study on commercial ads), that MFA’s Facebook advertising strategy does not actually increase consumption of animal products. We also note some serious qualifications later in this post that limit the applicability of these results to the actual impacts of Facebook ads. Most importantly, the treatment and control groups in this study differed only in the video page they saw after clicking on the ad; both groups saw the advertisement itself.
Overall, we do expect this evidence to substantially reduce our estimate of the effectiveness of Facebook ads, specifically the amount of dietary change expected given a certain number of Facebook ad viewers.2 We do not expect this evidence, by itself, to change our charity recommendations.
We are currently conducting an intervention evaluation of online ads,3 in part because of the release of this study. We will continue to reflect on these results and consider what our overall view of online ads should be after considering all available evidence. We expect to use this new evaluation when we update our charity reviews and recommendations later in 2016. We will also consider the implications of these results for studies that should be run as part of our new advocacy research program.
We welcome feedback during this process and expect it to be finished before this fall. If you are making time-sensitive decisions before that time — as a donor, activist, or organization — and would like to know our most up-to-date thoughts on the subject, please feel free to reach out.
The rest of this post shares some of the details of the study, as well as more detailed thoughts on its strengths, qualifications, and statistical results.
Methodology, strengths, and qualifications
The evidence available in activism is unfortunately quite limited, and we often have to make difficult tactical decisions based on intuition, speculation, case studies, and similarly weak evidence. We are very thankful for individuals and organizations working to gather more robust evidence. We hope to do additional studies like this one through the collaborative advocacy research program.
Mercy For Animals, along with independent researchers including Peter Hurford and Jason Ketola,4 recently completed what we consider the highest quality randomized controlled trial (RCT) so far of an animal advocacy intervention. They studied the effect of MFA’s Facebook advertising strategy on self-reported consumption of animal products.
We think this study was very well-executed. Reasons for this include:
- The methodology was preregistered.
- Given estimated effect sizes and variation in responses, the sample size had a reasonable chance of showing an effect if one did exist.
- The questionnaire used to estimate dietary change seems like the best one we know of, although there are many unavoidable issues in using self-report.
The pre-registration of the methodology includes a helpful summary of how the experiment was conducted:
This will be a field study where participants are unknowingly and anonymously recruited via a Facebook ad that they willingly click. Once the ad is clicked the participant will be automatically sorted into either a treatment group where they view a website with an “anti-meat” message intended to influence diet or a control group where they view a website with a control vidieo [sic] not intended or expected to influence diet choices — currently up in the air between a video advocating donating to eliminate neglected tropical diseases or a video advocating reducing world population.
Visiting these websites will place a cookie on the user’s computer that can be used to “retarget” them via Facebook Ad Exchange and Google’s Ad Network for participation in a follow-up survey three months later. All the user will see is a Facebook ad offering them a chance at a prize (yet to be determined) for completing a survey. (To be clear, initial advertising is only on Facebook, but follow-up retargeting is on both Facebook and Google ads.)
There will be no connection between this survey opportunity and the previous vegetarian ad they clicked on three months ago. This will allow us to deliver a survey to both groups without tipping them off that it’s vegetarian related, thereby reducing bias. To be clear, both groups will see identical initial facebook ads and both groups will get identical follow up surveys after identical recontact time horizons.
This assessment will then be compared across the treatment and control group to figure out the influence of the Facebook ads and associated landing page videos on diet change. The amount of reduction in the treatment group relative to the control group, if any, will then be compared to the cost of procuring additional Facebook ads to assess cost-effectiveness.
The document also includes a helpful image of what the ads look like (we are unsure what the exact ads looked like, but presumably the format was very similar):
The results of this study seem most relevant in determining the effects of MFA’s Facebook advertising strategy on dietary change, with weaker relevance to online veg ads in general and even weaker relevance to individual dietary change interventions as a whole (e.g. leafleting).
There are, of course, many qualifications to note before applying the results of the study to any intervention. We think most of these were unavoidable or at least very difficult to avoid with the resources available to the researchers:
- The study only investigated the effect of visiting the veg outreach landing page, rather than the effect of the advertisement itself. We think a sizable portion of the effects of Facebook ads could come just from viewing the advertisement. This view has some scientific evidence, although we have not vetted that evidence. This consideration suggests the actual effect of the ads is greater than what would show up in this study.
- If subjects shared the links with other people, this could produce issues, such as viewers being confused about why they are seeing a control video (on eliminating neglected tropical diseases). Even the initial participant could be confused by this, which might be problematic, although it is unclear in which direction it would skew the results.
- If subjects clicked on the initial ad using a shared computer or mobile device, the person who took the survey may not have been the person who initially viewed the ad and one of the videos. This would likely make it harder to detect any difference between the effects of the control and treatment conditions.
- Even though steps were taken to reduce the risk of people realizing this was a study about the effects of the ads, they might still realize this. For example, they could notice that the survey questions were likely related to vegetarianism, and that they had seen an advertisement for vegetarianism a few months ago. This could make the actual effect more negative5 than what we see in the experiment due to social desirability bias.
- We are uncertain about how self-reported dietary change relates to actual dietary change.
Finally, we made two additional tentative statistical conclusions based on the results:
- Using a two-tailed t-test,6 we get a p-value of 0.14 for the effect found in the study, meaning there would be a 14% chance of finding an effect at least this strong (whether an increase or decrease in self-reported consumption) if there were actually no difference between the treatment and control groups. By the standard cutoff of 0.05 for p-values, this effect is not statistically significant.
- A future experiment using the same statistical test would need 3,210 participants in each group to have a statistical power of 80%, assuming the group means are the same as in this experiment and the common standard deviation equals the average of the two groups’ standard deviations in this experiment. Using our filtering of the data,7 this experiment included 933 participants in the treatment group and 863 in the control group. This means a future experiment with the same methodology would need roughly 3-4 times as many participants to have 80% power, which is a standard power level in social science research.
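The two conclusions above can be cross-checked against each other using only the reported p-value and group sizes. The sketch below backs out the standardized effect size implied by p = 0.14 with 933 and 863 participants, then estimates the per-group sample size needed for 80% power. It uses a normal approximation in place of the t distribution, and the function names are hypothetical, so it should land near, but not exactly on, the 3,210 figure computed from the raw data.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def implied_effect_size(p_two_sided, n_treatment, n_control):
    """Standardized mean difference (Cohen's d) implied by a two-sided
    p-value and the two group sizes, under a normal approximation."""
    z_stat = STD_NORMAL.inv_cdf(1 - p_two_sided / 2)
    return z_stat * sqrt(1 / n_treatment + 1 / n_control)


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Participants needed in each of two equal groups to detect
    effect_size with a two-sided test at the given alpha and power."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Inputs taken from the text: p = 0.14, 933 treatment and 863 control participants.
d = implied_effect_size(0.14, 933, 863)
n = n_per_group(d)
print(f"implied effect size d = {d:.3f}, participants needed per group = {n}")
```

The implied effect size is small (around 0.07 standard deviations), which is why the required sample comes out at a few thousand per group, in line with the "roughly 3-4 times as many participants" estimate.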
These results also might affect our views on the impact of interventions similar to Facebook ads like leafleting. We have not yet decided how much our views on other interventions will change based on this evidence, if at all.
We have an intervention evaluation of online ads on our website already, but we have a new template for evaluation and our views have evolved since publishing that report. This, combined with the release of the MFA Facebook ads study, convinced us that we should publish a new evaluation.
ACE Researcher Jacy Reese assisted with some of the preparation for this study (running a pilot study on Amazon Mechanical Turk), although this was before he joined ACE as a staff member. We don’t consider this a conflict of interest, but we note it for transparency.
We use a two-tailed t-test in our analysis, but note that the preregistration of the study did not commit to a specific statistical approach. Alternatively, a one-tailed t-test would tell us the likelihood of finding an increase in self-reported consumption in the treatment group if there were no actual increase (either zero effect or a decrease). If one came into this experiment specifically wanting to test the hypothesis of there being an increase in consumption, and exclude the possibility of a decrease, then a one-tailed t-test might make sense. For a similar perspective on when one versus two-tailed tests are appropriate, see this paper.
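To make the distinction concrete, here is a minimal sketch of how a one-tailed p-value relates to the two-tailed value for a test statistic with a symmetric null distribution, such as the t statistic. The function name is hypothetical, and the only input taken from the study is the two-tailed p-value of 0.14.

```python
def one_tailed_p(p_two_sided, observed_matches_hypothesis):
    """Convert a two-tailed p-value to a one-tailed one, assuming a
    symmetric null distribution (true for the t statistic).

    observed_matches_hypothesis: True if the observed difference points
    in the direction the one-tailed test is looking for.
    """
    half = p_two_sided / 2
    return half if observed_matches_hypothesis else 1 - half


# The observed difference was an increase in self-reported consumption.
p_increase = one_tailed_p(0.14, observed_matches_hypothesis=True)
p_decrease = one_tailed_p(0.14, observed_matches_hypothesis=False)
print(f"one-tailed p (increase) = {p_increase:.2f}")
print(f"one-tailed p (decrease) = {p_decrease:.2f}")
```

With the reported two-tailed value of 0.14, this gives about 0.07 for a test of increased consumption and about 0.93 for a test of decreased consumption, which shows how much the choice of direction matters.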
Note that the specific numbers, such as the % difference between the experimental groups, vary slightly between our analysis and that done by MFA. We expected this to happen when running our analysis because we have filtered the data differently (e.g. removing participants who didn’t respond to all survey questions). As of this post, our analysis only disregards participants who failed to answer any of the questions about number of servings consumed.