A/B Testing – You’re Doing It Wrong

How to effectively A/B test in a way that drives long-lasting results

So, your company wants to increase revenue and adoption by making some marketing site tweaks. They want more conversions, more clicks, more shares, and more users. What do they tell you to do first? Well, A/B test! Compare two versions of a page, define a key goal (ex. clicks), and see if you get more clicks. But, does this actually work? Is it really the approach you should take? Let’s look at the data.

“By nature, an A/B test is an experiment that assesses multiple (often 2) versions of a feature or page relative to a defined metric.”
This article focuses on superficial A/B tests — the testing of cosmetic changes that distract teams from delivering meaningful customer value.

A/B Testing Band-aid

For apps with millions of users, small cosmetic changes to your app, like color, layout, and language could net you some noticeable increases in your key metrics, like more clicks and user engagement. The real question is this: for companies with smaller user bases, should you try to focus on different “Sign Up!” button colors or focus on actually making your product better?

A/B testing for many companies becomes a band-aid for poor value propositions. If your content is not being shared, maybe your content is actually not share-worthy, regardless of how amazing you make your “Share Now!” button.

Moreover, maybe your goal should not be getting a prospect to click on a button. Maybe your primary goal should be to build trust, provide a sandbox demo, or empower the prospect to make a decision.

Survey Data

AppSumo estimated that only 25% of A/B tests actually produced significant results. The question is why? Well, let’s define significant results. For many of these tests, the primary metric orgs were trying to move was conversation rate. So, if the rate wasn’t increased, then the test failed.

“Less than 25% of A/B tests produced significant positive results.” -AppSumo

However, we could look at it a different way. If changing your header slogan, hero image, or CTA didn’t move metrics, then maybe that is indicative of a larger issue. A failed test should be an indicator that:

  1. Your visitors are not ready to convert
  2. Your visitors are seeking something else besides signing up
  3. Your core product is fundamentally unattractive
  4. You need to drive more qualified leads to your product

This next dataset is from a qualitative and quantitative A/B testing survey of 26 A/B testing practitioners conducted from May 1 to May 30 in 2016 (Northwestern, IDS — Justin Baker, 2016). While this is not the end-all be-all of surveys, it could still give us some meaningful insights.

Key Points

“Only 10% of A/B testing experiments resulted in actionable change — formally releasing a new version of a page or feature.”

Interview Data

To supplement the quantitative research, A/B testers (from 2 to 6 years of A/B testing experience) were asked open-ended questions regarding the efficacy of A/B testing. Here are some of the key take-aways:

“50% of teams could not make decisions from A/B testing experiments due to inconclusive or poorly measured data.”

Making A/B Testing Useful

Overall, what do these results tell us?

“Companies may be running A/B tests too frequently for too little time, contributing to a high failure rate that makes A/B test results less valuable and meaningful.”

Here are some tips to help make A/B testing useful for your application.

  1. Don’t get distracted — Changing colors, call-to-action text, and layout may have a marginal impact on your key performance metrics. However, these results seem to be very short-lived. Sustainable growth does not result from changing a button from red to blue, it comes from building a product that people want to use.
  2. Don’t put lipstick on a pig — The better version of two pigs is still a pig. If you are trying to sell a pig, then you are doing great. If not, then focus on creating better user experiences and a better value proposition.
  3. Use actual statistics — Do not rely on simple 1 on 1 comparison metrics to dictate what works and does not work. “Version A yields a 20 percent conversion rate and Version B yields a 22 percent conversion rate, therefore we should switch to Version B!” Please do not do this. Use actual confidence intervals, z-scores, and statistically significant data.
  4. The longer, the better — The longer you run your test, the better your data will account for fluctuations and extraneous variables. Do not run a test over memorial day weekend using a red/white/blue theme and then switch to that theme for the rest of the year.
  5. Failure is okay, but failure is expensive — If you keep releasing versions of your application that people hate, then what impact is that having on your metrics? If most experiments fail, then are you doing more harm then good? How much time are you spending designing and implementing A/B testing experiments? Failing and experimenting are natural byproducts of building a company. If something is not working, maybe it is not because your button needs to be a brighter color, maybe it is because you need to make your feature better.

“Failure is okay, but failure is expensive.”

TLDR

Effective A/B tests are about driving long-lasting, positive value for your customers. If you get stuck in the loop of menial changes, then you’re basically churning quicksand instead of moving your product forward.

“Effective A/B tests are about problem solving — driving long-lasting, positive value for your customers.”

Test meaningful features, use real statistics, get real feedback, and run tests for longer periods of time. Deliver true value to your users instead of playing with color hues and clever tag lines. I’m not trying to downplay the importance of ensuring that your layout is optimized, your copy is powerful, and that your information hierarchy is fluid. I am trying to get teams to think about improving user experiences by adding value and solving problems, not by throwing lipstick on a pig or trying to wordsmith a new headline.

This article was originally posted on Justin’s Medium page.

Get started with Marvel Enterprise

Get started with Marvel Enterprise

Some of the worlds most creative companies use Marvel to scale design across their organisation.

Get started with Enterprise

Sr. Product Designer at Auction.com — President of CA Product Designers — Co-Founder at TheTechLadder.com — Top Writer in Tech & Design at Medium

Related Posts

I recently worked on defining the spacing system for Practice Fusion’s EHR (Electronic Health Record) product, to ensure improved readability and consistency across all pages. I came up with 3 spacing rules (hint: rule of 3 C’s) and 4 spacing values, which worked harmoniously well with the new typographic system. The problem “When positioning elements vertically, the designer has to… Read More →

A selection of prototypes, photos, and learnings from Design Club at a special CoderDojo We recently joined CoderDojo London to deliver two days of workshops for 7–17 year-olds. We were hosted by UCL at Base KX, near Kings Cross. There were tables dotted about, offering coding activities. There were computers everywhere. Except on our table. We we were unplugged. Okay,… Read More →

“Dashboard”, “Big Data”, “Data visualization”, “Analytics” — there’s been an explosion of people and companies looking to do interesting things with their data. I’ve been lucky to work on dozens of data-heavy interfaces throughout my career and I wanted to share some thoughts on how to arrive at a distinct and meaningful product. Many people have already tackled this topic,… Read More →