AB Testing Sample Size: The 4 Levels of Difficulty
We get a lot of questions from clients who are interested in A/B testing. However, there is one question we hear more than any other: “Can I do A/B testing with under 100,000 visitors a month?” Finding a way to generate uplift is not easy, and achieving statistical significance is downright difficult. That’s why it is important to consider your AB testing sample size before launching an experiment.
To give you a sense of the sample size needed to run AB tests on your website, we have created a simple chart. It is divided into 4 levels of difficulty, showing you how many visitors your website will need to receive to produce significant test results.
AB testing has become a major part of web-marketing. Even so, many tests are completely useless. This post will show you when AB testing actually works and when it is a waste of time and money. If the AB testing sample size is larger than the traffic a site receives in several months, then website experiments are not an option.
Before we outline our sample size guidelines, we should recap some fundamental principles:
What is AB Testing?
AB testing is an optimisation technique that consists of comparing two different page options, say page A and page B. Website visitors are split into two groups, with each group seeing only one of the two pages, so that you can measure what effect each page has on their online behaviour and on conversions. The goal, of course, is to see which version produces the better conversion rate.
To achieve reliable AB tests it is essential that they are statistically significant so that you don’t end up with false results and waste your marketing budget.
It’s important to take into consideration:
- The duration of the test,
- The confidence level (or statistical significance),
- The “power” level.
Unless you’re a statistician, these terms might not mean much to you, but don’t worry: carry on reading and all will become clear!
How to calculate the AB Testing Sample Size?
There are many tools online that can do the necessary calculations for you. One of the best is the AB Testguide.
To know how many visitors are required for a test and calculate the AB Testing Sample Size, you will need to provide 3 pieces of information:
1. Your Analytics data
The conversion rate of the page being tested: this is often the Subscription/Visitor ratio or the Sales/Visitor ratio.
The number of unique weekly visitors on the page being tested. Providing this information will enable you to see how many days your test will need to run for in order to reach a statistically significant number of visitors.
2. Your expectations
You need to specify the expected percentage increase that your test will generate. This can be the tricky part, because it shouldn’t be the increase you would like, but rather a realistic and objective projection.
So why do you need to estimate the increase in conversion when that is exactly what you’re trying to test? The value you project acts as the detection threshold for the test. The smaller this threshold, the greater the number of visitors (and so the larger the AB testing sample size) required for a reliable and accurate result. Contrary to what you might think, a large improvement percentage is easier to detect and so can be confirmed more quickly, but it’s this very rapidity that can make it feel less impressive statistically. A smaller improvement percentage is more subtle and discreet: more time and visitors are necessary to detect the change, which is why confirming it requires so much more data.
Conversion optimisation experts agree that it is very difficult to see an improvement of any more than 10% on your overall conversion rate (Sales/Visitors). Reaching anything beyond 10% will require “innovative” changes, meaning quite significant changes to your site. More “iterative” changes – those that consist of a simple alteration to a message or button, for example – will generally only bring about an increase of less than 5%, often far less! Therefore, in most cases, we would recommend “5%” as a realistic projected increase.
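To make these numbers concrete, here is a minimal sketch of the standard two-proportion sample-size calculation. This is an illustrative formula, not necessarily the exact method AB Testguide uses, and the function name and figures are our own:

```python
from statistics import NormalDist

def sample_size_per_variant(base_rate, relative_uplift,
                            alpha=0.05, power=0.80):
    """Visitors needed per variant for a two-sided two-proportion z-test."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_uplift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for 95% confidence
    z_beta = NormalDist().inv_cdf(power)            # 0.84 for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return numerator / (p2 - p1) ** 2

# A 2% conversion rate, hoping to detect a 5% relative uplift:
n = sample_size_per_variant(0.02, 0.05)
print(round(n))  # ≈ 315,000 visitors per variant
```

Note how quickly the requirement explodes: detecting a modest 5% uplift on a 2% conversion rate already demands several hundred thousand visitors per variant.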
3. Statistical Parameters
Without going in to too much technical detail, these are the parameters that should give you a reliable result:
Confidence level: 95%
Power: 80%
Using these values, AB Testguide will give you an estimated duration for your test, but it’s always important to verify that the test will last at least 1 sales cycle, and ideally 2.
In general, it’s best not to run a test for longer than 30 days, with 60 days as the absolute maximum. Running a test for several months is risky because many of the parameters will change over such a long period (on average, 10% of internet users erase their cookies once a month, which will contaminate your test results).
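The duration check itself is simple arithmetic: divide the total required sample by daily traffic. A sketch with hypothetical figures (the visitor and sample numbers below are illustrative, not taken from our chart):

```python
import math

# Hypothetical page figures for illustration:
weekly_visitors = 25_000          # unique weekly visitors to the tested page
required_per_variant = 315_000    # e.g. for a 5% uplift on a 2% conversion rate

total_needed = required_per_variant * 2   # traffic is split between A and B
days = math.ceil(total_needed / (weekly_visitors / 7))
print(days, "days")
```

Here the test would need roughly six months, far beyond the 30-day guideline, so either the uplift target or the traffic would have to change before testing.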
So now you know the AB testing sample size required to carry out an effective AB test on your site, but what happens if the number of visitors required is larger than your actual traffic?
AB Testing Sample Size: 4 Levels of Difficulty
The graph below can help you to estimate the level of difficulty related to the AB Testing sample size required to carry out reliable AB tests.
In other words: what is the minimum increase in conversion rate needed to be confident that test results are reliable in relation to the number of unique monthly visitors?
We have fixed certain parameters for this graph:
- Site conversion rate: 2%
- Test duration: 30 days (you should NOT test for longer: cookie deletion will make your results unreliable)
- 1 control page (A) and 1 variation for testing (B)
- Confidence level: 95%
- Power: 80%
Visitors are organised into 4 zones representing the difficulty level, in other words the minimum uplift (increase in conversion rate) that would be required for an accurate result.
What Are The 4 difficulty levels?
Make sure to always consider the data of the page tested rather than that from the site as a whole.
With less than 10,000 visitors a month, AB testing will be very unreliable because it’s necessary to improve the conversion rate by more than 30% in order to have a “winning” variation.
With between 10,000 and 100,000 visitors a month, AB testing can be a real challenge, as an improvement in conversion rate of at least 9% is needed for a reliable result.
With between 100,000 and 1,000,000 visitors a month, we’re entering the “Exciting” zone: it’s necessary to improve the conversion rate by between 2% and 9%, depending on the number of visitors.
Beyond one million visitors a month, we’re in the “Safe” zone, which allows us to carry out a number of iterative tests.
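The zones above can be roughly reproduced by inverting the sample-size formula: for a given monthly traffic, find the smallest relative uplift that is detectable within 30 days. This sketch uses a standard two-proportion z-test with the fixed parameters listed above; the exact zone boundaries depend on the assumptions behind the chart, so the numbers will not match it precisely:

```python
from statistics import NormalDist

def required_n(p1, uplift, alpha=0.05, power=0.80):
    """Per-variant sample size to detect a relative uplift on base rate p1."""
    p2 = p1 * (1 + uplift)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return num / (p2 - p1) ** 2

def min_detectable_uplift(monthly_visitors, p1=0.02):
    """Smallest relative uplift confirmable in 30 days, half the traffic per variant."""
    per_variant = monthly_visitors / 2
    uplift = 0.001
    while required_n(p1, uplift) > per_variant:
        uplift += 0.001
    return uplift

mde = {v: min_detectable_uplift(v) for v in (10_000, 100_000, 1_000_000)}
for visitors, uplift in mde.items():
    print(f"{visitors:>9} visitors/month -> uplift of at least {uplift:.1%}")
```

Running this shows the same pattern as the chart: at 10,000 visitors a month only very large uplifts are detectable, while at a million a month uplifts of a few percent become testable.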
What should you do if your traffic is below the required AB Testing Sample Size?
It is really worthwhile doing AB tests in order to validate hypotheses, so even if your traffic is low, it’s important to find a way of testing at least certain elements.
If your traffic is less than 10,000 visitors a month, however, it’s going to be very difficult to optimise using “iterative” methods.
Here are two things you should be doing in parallel:
- Trying to increase your traffic!
- Working on optimising your site:
- solidify your USPs: why visitors should buy from you rather than your competitors, and why they should do it now!
- test functionality: does your site work well on all browsers, devices, etc.
- gain a better understanding of your visitors and customers (mini-surveys, user testing,…).
In the 10,000 to 100,000 visitors a month zone, it’s still quite difficult: iterative tests aren’t very powerful, so it becomes necessary to create “innovative” tests in order to obtain reliable results. On top of the elements mentioned above, applying neuroscience and consumer-behaviour techniques can help to boost a test’s performance.
Beware of micro conversions!
Micro conversions are steps that occur within your conversion funnel, for example when someone adds a product to their basket; the final conversion, making the sale, is the macro conversion.
It’s pretty easy to increase the rate of micro conversions and so it can be tempting to try and base tests on these. However, you must always also measure the impact the test is having on macro conversions to have an objective understanding of the true performance of your tests.
If you optimise “too much” on one side, then it could result in a decrease in conversion on the other side: for example, if you over-optimise one page through the use of motivating messages but then the promised incentive isn’t realised on the following pages, your overall conversion rate is likely to drop. This is called the Roberval Balance.
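A tiny worked example of the trap, with made-up numbers: variant B wins comfortably on the add-to-basket micro conversion yet loses on actual sales.

```python
# Hypothetical test results illustrating the micro/macro trap:
variants = {
    "A": {"visitors": 50_000, "add_to_basket": 4_000, "sales": 1_000},
    "B": {"visitors": 50_000, "add_to_basket": 5_200, "sales": 900},
}
for name, d in variants.items():
    micro = d["add_to_basket"] / d["visitors"]   # add-to-basket rate
    macro = d["sales"] / d["visitors"]           # sales rate
    print(f"{name}: micro {micro:.1%}, macro {macro:.1%}")
# B lifts the micro conversion, but the macro conversion (sales) drops.
```

Judged on micro conversions alone, B would be declared the winner; judged on sales, it loses money.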
AB testing is an important technique for validating ideas. When used alongside a coherent statistical method it can provide robust information about how to optimize a website.
However, to run tests successfully, your sample size must be sufficient.
Subscribing to an AB testing solution doesn’t guarantee results: often the vendor’s main objective is to sell you a 12-month subscription, which can lead to misinformation and confusion about the results reported. The best way to optimise your conversions is to combine such tools with wider knowledge and support, so that you can make informed and effective changes.