
Sample Size

 

Testing is worthless without valid statistical significance for your findings – and you need enough qualified, wallet-out traffic to get there. If you want a lengthy mathematical explanation for this, here’s one.

What does this mean for you in practice? Before you run any test, you need to calculate how many people should be visiting it. Fortunately, there’s an easy way to calculate your minimum traffic. Let’s go through how to do this!

Maximum timeframe

First off, you should be getting the minimum traffic for an A/B test within a month’s time. Why a month? Several reasons:

  • It won’t be worth your organization’s time or resources to run tests so infrequently: if each test takes months to conclude, you’ll only complete a handful a year, and you’re unlikely to see a high ROI from that year of effort.
  • One-off fluctuations in signups – either by outreach campaigns, holidays, or other circumstances – are more likely to influence your test results.
  • Your organization will be more likely to spend time fighting over the meaning of small variations in data. That is not a positive outcome of A/B testing.
  • You will not be able to call tests unless they’re total home runs, for reasons I’ll describe below.

Sample size is calculated with two numbers:

  1. Your conversion rate. If you don’t have this already calculated, you should configure a goal for your “thank you” page in Google Analytics – and calculate your conversion rate accordingly.
  2. The minimum detectable effect (or MDE) you want from the test, in relative percentage to your conversion rate. This is subjective, and contingent on your hypothesis.
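The first of those two numbers is just arithmetic: goal completions divided by sessions over the same period. A quick sketch, with illustrative numbers standing in for your own Google Analytics figures:

```python
# Illustrative numbers -- substitute your own Google Analytics figures.
goal_completions = 450   # "thank you" page goal completions this month
sessions = 15_000        # total sessions in the same period

conversion_rate = goal_completions / sessions
print(f"Conversion rate: {conversion_rate:.1%}")  # → Conversion rate: 3.0%
```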

A note on minimum detectable effect

The lower the minimum detectable effect, the more visitors you need to call a test. Do you think that a new headline will double conversions? Great, your minimum detectable effect is 100%. Do you think it’ll move the needle less? Then your minimum detectable effect should be lower.

Put another way, if you want to be certain that a test causes a small lift in revenue-generating conversions – let’s say 5% – then you will need more traffic than a hypothesis that causes your conversions to double. This is because it’s easier to statistically call big winners than small winners. It also means that the less traffic you have, the fewer tests you’ll be able to call.
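You can see this trade-off directly with a standard two-proportion sample-size formula. This is a rough sketch, not Evan Miller’s exact code: it assumes the common calculator defaults of 95% confidence (two-sided) and 80% power, so treat the outputs as approximations.

```python
from math import sqrt, ceil

Z_ALPHA = 1.96  # two-sided z-value for 95% confidence (assumed default)
Z_BETA = 0.84   # z-value for 80% power (assumed default)

def sample_size_per_variation(base_rate, relative_mde):
    """Approximate visitors needed per variation to detect a relative lift."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_mde)
    numerator = (Z_ALPHA * sqrt(2 * p1 * (1 - p1))
                 + Z_BETA * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# At a 3% baseline, shrinking the MDE blows up the traffic requirement:
for mde in (1.00, 0.20, 0.05):
    n = sample_size_per_variation(0.03, mde)
    print(f"MDE {mde:>4.0%}: {n:>9,} visitors per variation")
```

A doubling (100% MDE) needs a few hundred visitors per variation; a 5% lift needs a couple hundred thousand. That asymmetry is why big winners are easy to call and small ones aren’t.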

You should not reverse-engineer your minimum detectable effect from your current traffic levels. A test either fulfills your hypothesis or it doesn’t, and science is historically quite unkind to those who try to cheat statistics. Here's more on creating a good hypothesis in our guide to A/B testing.

How to calculate sample size

I use Evan Miller’s sample size calculator for all of my clients. You plug your conversion rate and MDE numbers in there, along with the level of confidence you want your test to reach.

I recommend at least 95% confidence for all tests. Why? Because anything less leaves too high a chance of a false positive. Lower confidence raises the chance that you’ll run a test, see a winner, roll it out, and still have it lose in the long run.

Let’s say your conversion rate is 3% and your hypothesis’s MDE is 10% – so you’re trying to run a test that conclusively lifts your conversion rate to 3.3%. Here’s an example of how I fill this form out.

Note that the resulting number there is per variation. Are you running a typical A/B test with a control and 1 variant? You’ll need to double the resulting number to get your true minimum traffic. Are you running a test with 3 variants? Quadruple the number. You get the idea. This can result in very high numbers very quickly.
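To make the multiplication concrete, here’s a quick sketch. The 51,000 per-variation figure is a hypothetical round number in the ballpark of what a calculator returns for the 3% → 3.3% example; substitute your own result.

```python
# Hypothetical per-variation figure from a sample size calculator.
per_variation = 51_000

for variants in (1, 3):
    groups = 1 + variants          # one control plus each variant
    total = per_variation * groups
    print(f"{variants} variant(s): {total:,} minimum visitors")
# → 1 variant(s): 102,000 minimum visitors
# → 3 variant(s): 204,000 minimum visitors
```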

If you see a number that’s clearly beyond the traffic you’d ever expect to get in a month, work on one-off optimizations to your funnel instead. Don’t A/B test. It’ll be a waste of your company’s time and resources. Testing isn’t how the struggling get good, it’s how the good get better.

“But nickd, I can launch a giant AdWords campaign, right?”

You could, yes. But first, ask yourself these questions:

  • Is AdWords traffic more or less likely to convert than organic traffic?
  • Is AdWords traffic sustainable across multiple years of A/B tests?
  • Are AdWords customers the right kinds of customers for my business?

Put another way: will you get a decent ROI on AdWords, or are you just running a big ad campaign so you can juice the numbers of your A/B test?

A better way: outreach, writing, and PR

If you have too little traffic for testing right now, there is fortunately a surefire way to get more. You should write about your field of expertise, guest post and podcast on others’ sites to increase your reach, and overall educate your audience about your specific point of view.

I have found no better substitute for this – and lord knows I walk the walk. If you want traffic, you need to toot your own horn, period.

Final thoughts

Sample size is one of those inconvenient truths about testing that people think they can work their way around. Don’t. There are better places to spend your time and energy – like promoting your authority and getting traffic through long-term outreach.

Outreach isn’t as fun as A/B testing, but remember that A/B testing is a tool for optimization – not an end in itself. The overall goal is revenue generation, and I reckon there are many ways to skin that particular cat.
