How To Do A/B Testing
Step-by-step instructions and free tools.
An A/B test compares two versions of a single webpage: A and B. Your traffic is divided between them and their conversion rates are compared. That way, it’s easy to see which version gets the most clicks, sales and sign-ups. By learning how to do A/B testing, you can optimize your website page by page.
Version A is your original webpage (sometimes called the “control”), while version B is the modified page (also known as the “treatment”). Some of the most common things to test are page titles, product images, button colours, pricing and web forms.
Running experiments is the best way to see how different kinds of content affect your goals. The biggest websites, such as Amazon, Booking.com and Facebook, conduct hundreds of tests each week.
Where Do You Start With A/B Testing?
A/B testing is always based around a goal. For most people, the goal is to get more sales and increase their conversion rate. As long as you have a measurable goal, you can test which version of your webpage works best.
However, to produce reliable results, you also need plenty of traffic and a good testing platform.
- Traffic – You can’t test your website without a large enough test-sample
- Hypothesis – You need a strong hypothesis about what will improve your original page
- Testing Tool – An app or optimization software to manage your traffic and collect your results
Getting a large enough sample size is one of the major obstacles faced by first-time testers. To see how an optimization project is planned, and how to choose which parts of a website to test, read: Where to start with A/B tests?
Four years ago, the market for A/B testing tools was divided between two options: Optimizely and VWO. Since then, the number of options has grown rapidly. AB Tasty arrived with a set of advanced analytics and, in 2017, Google launched its own solution, Google Optimize. In 2020, there are no fewer than 30 A/B testing tools competing for your attention.
Powerful software for large marketing agencies
A sophisticated platform for business websites
Free and basic A/B testing for web developers
A simple, user-friendly A/B testing tool for marketers
For a full survey of the available software, see: A/B Testing Tools (2020)
How To Choose The Best A/B Testing Tool For Your Business
A/B testing is a good long-term investment. However, to find the right tool for your company, you need to ask yourself some questions.
1. What Skills Do You Have?
If your team doesn’t do web development, you need a tool with a webpage editor. If your team doesn’t have statistical expertise, you need built-in statistics.
2. What Kind of Volume Do You Have?
Does your traffic match your ambition? To get reliable results you need between 10,000 and 100,000 visitors a month on each page you want to test. For multivariate testing, you need even more.
3. Will You Need Other Tools?
To optimize your site effectively, you may need more than one tool. For example, you might think about using heat maps, scroll maps or a visitor survey pop-up.
Asking the right questions could save you a lot of money and time, allowing you to focus on developing new content: 10 Questions to Choose The Best A/B Testing Tool for Your Business
Optimization experts use a number of different strategies, but most large websites and agencies follow a continuous, step-based pattern.
How To Do A/B Testing: Step 1 – Analysis
Before you’ve even begun to think about what changes you are going to test, you need to analyse your original page. Google Analytics is a useful tool, because it tells you how visitors are using your site. By examining this data, and finding weaknesses in your “Conversion Funnel”, you can identify what needs to be changed. An experiment should always start with a strong hypothesis.
How To Do A/B Testing: Step 2 – Hypothesis
You have to think about what might improve your original webpage. A good A/B testing hypothesis needs to be clearly defined and based on your data. It should also be directly related to your KPI and have a good chance of producing results. Scoring your hypothesis against a checklist like this will help you to decide whether it is strong enough.
Goals – How closely does your hypothesis relate to your KPIs?
Page – Are the target pages a priority?
Position – Will the change take place above or below the fold, and how visible is it?
Value – Will the treatment have an impact on the value you offer a customer, or is it just cosmetic?
Evidence – Are there previous examples of a treatment like this working?
A combined score of over 20 suggests that your hypothesis is very strong and has a good chance of working.
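The checklist can be turned into a small scoring helper. The sketch below assumes that each of the five criteria is scored from 1 to 5, giving a maximum of 25. The scale is not stated above, so treat it as an assumption, along with the function and field names.

```python
# Assumed scale: each criterion is scored 1 (weak) to 5 (strong), so the
# maximum combined score is 25. The scale is an assumption for illustration.
CRITERIA = ("goals", "page", "position", "value", "evidence")
STRONG_THRESHOLD = 20  # "a combined score of over 20" from the checklist


def score_hypothesis(scores):
    """Return (combined score, True if the hypothesis counts as very strong)."""
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    for name in CRITERIA:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be between 1 and 5")
    total = sum(scores[name] for name in CRITERIA)
    return total, total > STRONG_THRESHOLD


if __name__ == "__main__":
    # Hypothetical scores for a single hypothesis.
    example = {"goals": 5, "page": 4, "position": 5, "value": 4, "evidence": 3}
    print(score_hypothesis(example))
```

Scoring several hypotheses this way and sorting by the combined score gives you a simple ranking of what to test first.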
How To Do A/B Testing: Step 3 – Design
It is important to be precise about the settings for your experiment. Before launching a test, you need to decide on a goal, which pages to target and how you will divide your traffic. Many A/B testing tools use a “Multi-Armed Bandit” to divide visitors between your pages, while others split traffic evenly.
What is an A/B testing Multi-Armed Bandit?
A Multi-Armed Bandit is an algorithm that decides how to allocate your traffic (between your test pages). Usually, it sends more of your traffic to the best-performing pages.
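To make the idea concrete, here is a minimal sketch of one simple bandit strategy, called “epsilon-greedy”. It is not the algorithm used by any particular testing tool; the class name, the 10% exploration rate and the conversion rates in the simulation are all assumptions chosen for illustration.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy allocation: mostly show the best page, sometimes explore."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.epsilon = epsilon  # share of traffic used for exploration
        self.visitors = {v: 0 for v in variants}
        self.conversions = {v: 0 for v in variants}
        self._rng = random.Random(seed)

    def conversion_rate(self, variant):
        shown = self.visitors[variant]
        return self.conversions[variant] / shown if shown else 0.0

    def choose(self):
        # With probability epsilon pick a random page; otherwise pick the
        # page with the best observed conversion rate so far.
        if self._rng.random() < self.epsilon:
            return self._rng.choice(list(self.visitors))
        return max(self.visitors, key=self.conversion_rate)

    def record(self, variant, converted):
        self.visitors[variant] += 1
        if converted:
            self.conversions[variant] += 1


if __name__ == "__main__":
    true_rates = {"A": 0.05, "B": 0.10}  # hypothetical conversion rates
    bandit = EpsilonGreedyBandit(true_rates, epsilon=0.1, seed=42)
    simulated_visitors = random.Random(7)
    for _ in range(5000):
        page = bandit.choose()
        bandit.record(page, simulated_visitors.random() < true_rates[page])
    print(bandit.visitors)  # how many visitors each page received
```

The trade-off is that a bandit shifts traffic before a test has finished, so the pages end up with unequal sample sizes. That reduces the cost of showing a weak page, but it can make a classical significance calculation harder to interpret.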
You also need to decide on the “Confidence Level” you expect to reach. A 95% Confidence Level is the standard for most agencies.
What is a Confidence Level?
A Confidence Level is the degree of certainty (given as a percentage) that you require before accepting that a difference between your pages is not just random variation. Setting a Confidence Level of 95% means that, if there were really no difference between your pages, there would be only a 5% chance (1/20) of seeing a result this strong by chance.
How To Do A/B Testing: Step 4 – Experiment
During your experiment, it is important that you avoid biasing your results by interfering with the traffic that reaches your test page. Launching a paid advertising campaign mid-test to increase your visitors changes the mix of people arriving on the page, which can give a false impression about which page version is best.
How To Do A/B Testing: Step 5 – Interpretation
Even if you achieve an uplift with statistical significance, it is still a good idea to change things gradually. This is because website changes often have unexpected effects. For example, version B might lead visitors to make a purchase more frequently – but it might also reduce the average amount people spend.
The most important concept used for interpreting your results in A/B testing is “Statistical Significance”: a measure of how unlikely it is that the difference between the conversion rates of two webpage variations arose from chance alone, rather than from real changes in consumer behaviour.
The most widely used Confidence Level is 95%. At this level, if there were no real difference between the variations, a test would wrongly declare a winner only about 1 time in 20.
For a full guide to A/B testing statistics, see: A/B Testing Statistics
An A/B testing tool will tell you when your tests have reached Statistical Significance at the Confidence Level you have set. It will also tell you the probability of your results representing a False Positive or a False Negative.
You can check whether your tests are Significant by entering your results into an A/B testing calculator.
For a full guide to A/B testing statistics, see: AB Test Significance
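If you want to see roughly what such a calculator does, the sketch below runs a two-sided, two-proportion z-test using only the Python standard library. This is one common approach, not necessarily the method used by any specific calculator or testing tool, and the visitor numbers in the example are made up.

```python
from math import sqrt
from statistics import NormalDist


def ab_significance(visitors_a, conversions_a, visitors_b, conversions_b):
    """Two-sided two-proportion z-test. Returns (z_score, p_value)."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled conversion rate under the assumption that A and B do not differ.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        return 0.0, 1.0  # no conversions (or all conversions): nothing to compare
    z_score = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))
    return z_score, p_value


if __name__ == "__main__":
    # Hypothetical results: 5.0% vs 5.8% conversion on 10,000 visitors each.
    z, p = ab_significance(10_000, 500, 10_000, 580)
    confidence_level = 0.95
    print(f"z = {z:.2f}, p = {p:.4f}")
    print("significant" if p < 1 - confidence_level else "not significant")
```

A result counts as significant at the 95% Confidence Level when the p-value is below 0.05.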
Finding a way to get Uplift is not easy, and achieving statistical significance is downright difficult. That’s why you need to think about your A/B testing sample size before you launch an experiment.
To give you a sense of the sample size needed to run A/B tests on your website, we have created a simple chart showing how many visitors your website will need to produce significant test results.
1. Fear Factor zone
With fewer than 10,000 visitors a month, A/B testing will be very unreliable, because you would need to improve the conversion rate by more than 30% in order to have a “winning” variation.
2. Thrilling zone
With between 10,000 and 100,000 visitors a month, A/B testing can be a real challenge, as an improvement in conversion rate of at least 9% is needed for the results to be reliable.
3. Exciting zone
With between 100,000 and 1,000,000 visitors a month, we’re entering the “Exciting” zone: it’s necessary to improve the conversion rate by between 2% and 9%, depending on the number of visitors.
4. Safe zone
Beyond one million visitors a month, we’re in the “Safe” zone, which allows us to carry out a number of iterative tests.
Use the significance calculator to see if your A/B test results are significant. It’s best to bookmark the page so that you can check your Significance regularly.
You can also use the calculator to see what kind of sample size will be required for an experiment in which you expect to produce 3%, 5% or 10% uplift.
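As a rough illustration of the calculation behind such a tool, the sketch below uses a standard sample size formula for comparing two conversion rates. The 5% baseline conversion rate and the 80% statistical power are assumptions made for the example; a real calculator may use different defaults or a different formula.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_uplift,
                              confidence=0.95, power=0.80):
    """Approximate visitors needed per variation for a two-sided test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)  # rate you hope version B reaches
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


if __name__ == "__main__":
    baseline = 0.05  # assumed current conversion rate of 5%
    for uplift in (0.03, 0.05, 0.10):
        visitors = sample_size_per_variation(baseline, uplift)
        print(f"{uplift:.0%} uplift: {visitors:,} visitors per variation")
```

The smaller the uplift you expect, the more visitors you need, which is why low-traffic sites can only detect large improvements.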
In January 2020, a Convertize user increased the CTR on one of their key pages by 300% within one month of testing.
The A/B test involved adjusting the text on a key CTA button. Rather than using a simple noun, the new button used a clear verb to describe what the visitor could do. By presenting the contact form from the visitor’s point of view, our client achieved a huge uplift in click-through rate.
Test: CTA button text
KPI: Click-through rate
What Are The Most Common A/B Testing Mistakes?
A/B testing can be tricky, and it’s all too easy to invest time and money without producing any results. These are some of the most common mistakes.
- Prioritising the wrong things
You can’t test everything, so you should prioritise your tests.
- The best way to prioritise your tests is to make a table and compare your pages and hypotheses against the following criteria.
- Potential: the potential improvement from a successful test
- Importance: the volume of traffic on the tested page and the proportion of your conversions in which it is involved
- Resources: What is needed to run the test and apply the results
- Certainty: The strength of the evidence that your test will produce a positive result
- Testing Too Many Things At Once
An A/B test involves comparing A with B. In other words, testing one thing at a time. A common mistake is to create too many scenarios within the same test.
Running too many alternative versions will extend the time each test takes to produce reliable results. Again, this comes down to the question of sample size.
- If you are running a normal A/B test, you should test one element at a time and create a maximum of 3 variations. Each additional scenario increases the sample size required to reach significance, without really improving the insight you produce.
- Stopping The Test Too Early
“Too early” simply means that you stop your test before the results are reliable. This is one of the most familiar A/B testing mistakes, and possibly the most important to avoid. It is easy to commit a Type I or Type II statistical error simply by responding to your results prematurely.
- Most CRO agencies have a strict “No-Peeking” rule when running a test. That way, nobody is tempted to end a test before Statistical Significance is reached.
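To see why the “No-Peeking” rule matters, the sketch below simulates A/A tests, in which both pages convert at the same rate, so every “winner” is a false positive. The traffic figures, the number of looks and the z-test used for each check are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def is_significant(n_a, conv_a, n_b, conv_b, confidence=0.95):
    """Two-sided two-proportion z-test at the given Confidence Level."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return False
    z_score = abs(conv_b / n_b - conv_a / n_a) / std_err
    return z_score > NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def simulate_aa_tests(runs=300, visitors_per_arm=2000, rate=0.05, looks=10, seed=1):
    """Return (false positive rate with peeking, rate when waiting to the end)."""
    rng = random.Random(seed)
    checkpoint = max(1, visitors_per_arm // looks)
    peeking_hits = 0  # a "winner" appeared at any look, including the last
    waiting_hits = 0  # a "winner" appeared at the final look only
    for _ in range(runs):
        conv_a = conv_b = 0
        flagged = False
        for visitor in range(1, visitors_per_arm + 1):
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            at_look = visitor % checkpoint == 0 or visitor == visitors_per_arm
            if at_look and is_significant(visitor, conv_a, visitor, conv_b):
                flagged = True
        peeking_hits += flagged
        waiting_hits += is_significant(visitors_per_arm, conv_a,
                                       visitors_per_arm, conv_b)
    return peeking_hits / runs, waiting_hits / runs


if __name__ == "__main__":
    peeking, waiting = simulate_aa_tests()
    print(f"false positives when stopping at the first 'win': {peeking:.1%}")
    print(f"false positives when waiting for the full sample: {waiting:.1%}")
```

Because the peeking count includes the final look, it can never be lower than the waiting count. In practice it is usually noticeably higher, which is the effect the rule guards against.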
For more common A/B testing mistakes, see: A/B Testing Mistakes
In 2019, a Convertize user increased the conversion rate of their category pages when viewed from mobile devices.
The test involved a simple notification introducing a bundled offer, in which the products on sale could be customised for free.
Although the offer was already visible on the webpage, the notification produced a 37% uplift in conversions.
Test: Pop-up Notification
KPI: Conversion Rate
Does A/B Testing affect SEO?
Google clarified its position regarding A/B Testing in an article published on its blog. The important points to remember are:
- Use “Canonical Content” Tags. Search engines find it difficult to rank content when it appears in two places (“duplicate content”). As a result, web crawlers penalise duplicate content and reduce its SERP ranking. When two URLs displaying alternate versions of a page are live (during A/B tests, for example) it is important to specify which of them should be ranked. This is done by attaching a rel=canonical tag to the alternative (“B”) version of your page, directing web crawlers to your preferred version.
- Do Not Use Cloaking. In order to avoid penalties for duplicate content, some early A/B testers resorted to blocking Google’s site crawlers on one version of a page. However, this technique can lead to SEO penalties. Showing one version of content to humans and another to Google’s site indexers is against Google’s rules. It is important not to exclude Googlebot (for example, by editing your site’s robots.txt file) whilst conducting A/B tests.
- Use 302 redirects. Redirecting traffic is central to A/B testing. However, a 301 redirect can trick Google into thinking that an “A” page is old content. In order to avoid this, traffic should be redirected using a 302 redirect (which indicates a temporary move).
The good news is that, by following these guidelines, you can make sure your tests have no negative SEO impact on your site. The better news is that if you are using A/B testing software, these steps are usually handled for you.
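If you run a split test yourself rather than through a testing tool, the canonical tag and temporary redirect described above might look like the following sketch. The URLs, the `visitor_id` values and the function names are hypothetical, and the hash-based 50/50 bucketing is just one simple way to keep each visitor on the same version.

```python
import hashlib
from html import escape

ORIGINAL_URL = "https://example.com/landing"     # hypothetical "A" page
VARIANT_URL = "https://example.com/landing-b"    # hypothetical "B" page


def canonical_link_tag(original_url):
    """HTML tag to place in the <head> of the B page, pointing at the A page."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


def assign_variant(visitor_id):
    """Deterministically bucket a visitor so they always see the same version."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def response_for(visitor_id):
    """Return (status_code, headers) for a request to the original URL."""
    if assign_variant(visitor_id) == "A":
        return 200, {}  # serve the original page as usual
    # 302 marks the redirect as temporary, unlike a permanent 301.
    return 302, {"Location": VARIANT_URL}


if __name__ == "__main__":
    print(canonical_link_tag(ORIGINAL_URL))
    for visitor in ("visitor-1", "visitor-2", "visitor-3"):
        print(visitor, response_for(visitor))
```

Hashing a stable identifier, such as a cookie value, means a returning visitor is not switched between versions halfway through the test.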
Does A/B Testing affect website loading speed?
A/B testing software can reduce loading speed due to the way in which it hosts competing versions of a page. A testing tool can create scenarios in two ways:
- From the client’s side (front-end)
- Using server-side scripts.
Server-side – This form of A/B testing is faster and more secure. However, it is also expensive and more complicated to implement.
Most A/B testing software operates on a client-side basis. This is to make editing a site as easy as possible. In order to reduce the impact of testing on a page, the best A/B testing solutions have found ways to speed up page loading.
How Do You Choose What To Test And When?
You can calculate the number of experiments you can run in a single year by entering the following details into a sample size calculator and finding out the number of days required for each test.
- Your page’s weekly traffic
- Your page’s current conversion rate
- The uplift you can reasonably expect to achieve
Once you know how many tests you can run on which pages, you can put together a full testing strategy for your website. You should start by focusing on the pages that will take the shortest amount of time to produce significant results, as these will tend to be the ones that provide you with the most “low-hanging fruit”.
What to test
This will depend on where you are in the A/B testing process. Optimising a website that has a very low conversion rate allows you to take more risks with your edits. You may even decide to try different kinds of offer or value propositions.
For later-stage tests, when you are optimising for more marginal gains, each test should focus on a more discrete variation. These are the kind of experiments favoured by large companies like Amazon and Google.
What Are The Most Common A/B Tests For Different Industries?
A/B Testing can be used to increase your conversion rate, reduce bounces on key pages, improve engagement with your content and reduce your cart abandonment rate. However, your goals will depend on your industry.
How To Do A/B Testing For Media/Publishing
Media and publishing sites typically want to:
- Grow their readership
- Increase the time readers spend on their site
- Boost social sharing
To do that, they will A/B test:
- Headlines
- Sign-up forms and processes
- Content recommendation software
How To Do A/B Testing For Travel/Leisure
Travel and leisure companies typically want to:
- Get more bookings
- Improve a mobile or web app
- Add upsell items to their packages
To do that, they will A/B test:
- Site search functions
- Search results pages
- Checkout processes
How To Do A/B Testing For eCommerce
eCommerce stores typically want to:
- Reduce abandoned checkouts
- Increase average order value
- Get more first-time customers
To do that, they will A/B test:
- Promotional content
- Product pages
- Basket and checkout processes
How To Do A/B Testing For SaaS/Startups
SaaS companies and startups typically want to:
- Increase qualified leads
- Get more free trial sign-ups
- Decrease churn
To do that, they will A/B test:
- Lead forms
- Email sequences for new sign-ups
- Payment pages
What Is The Difference Between A/B Testing, Multivariate and Split?
An A/B test compares two page versions with the same URL. A split test works the same way except it uses different URLs for each page version. Multivariate testing is just A/B testing in which a number of variables are tested independently and in combination. Each type of test has its own advantages and disadvantages:
- A/B testing shows you if one version of your content is more effective than another. However, it doesn’t show you how different variables affect each other. It can also cause a “Flicker” effect (or F.O.O.C, “Flash of Original Content”) if your alternative page version loads too slowly.
- Split testing avoids the Flicker problem caused by slow A/B testing, but it makes the link to your content more complicated. It also increases the chance that your test will have an impact on your search engine presence.
- Multivariate testing is the most accurate way for larger websites to test multiple different versions of their content, but it requires a far larger amount of traffic than A/B testing. Because of this, it is only a realistic option for a handful of websites.
A/B testing is full of specific technical terms, but most of them have simple definitions. Here are some of the terms we get asked about most frequently.
What is A/B/N Testing? In an A/B test, your visitor sees version A or version B of a particular page. A test with three page versions might be called an A/B/C test. “A/B/n” is used as shorthand for a test that has “n” different versions.
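What is A/B Testing Flicker? Flicker (or F.O.O.C., “Flash of Original Content”) happens when the original page is briefly visible because the alternative version loads too slowly. To avoid the flicker effect, you need to ensure your page loading speed is as fast as possible. Convertize does this with its Lightning Mode feature.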
What is a Control in A/B testing? Your Control is the unedited version of your webpage. Rather than running an alternative version against historical data, it is important to run both pages simultaneously (even though this will extend the duration of your tests). The reason for this is that your conversion rate can change independently of the design of a particular page. For example, it would be misleading to compare the conversion rate of your Control in November with the conversion rate of your Treatment in December.
What is an Element in A/B testing? An element is any part of your webpage that can be changed during a test. Common elements to experiment with include: CTA buttons, H1 and H2 titles, images and links.
What is a KPI in A/B Testing? A “KPI” (Key Performance Indicator) is a metric used to judge the performance of your webpage. It is important that the KPI you choose is quantifiable and relevant to your business goals. For eCommerce, the most common KPI to choose is a sale. For other industries, it might be a form completion, an email sign-up or any other measurable action.
What is a Multi-Armed Bandit? The “Multi-Armed Bandit” is an algorithm that directs traffic to better-performing pages. Some A/B testing tools allocate traffic evenly between variation A and B. This could mean directing half of a website’s traffic to a bad version (which can cost large companies vast amounts of money). A Multi-Armed Bandit stops that happening.
What is Multivariate Testing? “Multivariate” or “Multivariable” (MVT) tests are simply ones in which you carry out A/B tests (or A/B/n ones) on several independent page elements simultaneously. There are different ways of carrying out multivariate (MVT) tests. You can show every single possible combination of page elements (“full factorial”) or only a fraction of them. The various kinds of “fractional factorial” test often have complicated-sounding names (for example, “Taguchi”). You can also choose to display losing combinations less frequently.
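A short sketch shows how quickly a “full factorial” test grows. The element names and options below are hypothetical; the point is only that the number of combinations is the product of the number of options for each element.

```python
from itertools import product

# Hypothetical page elements and the options to test for each one.
ELEMENTS = {
    "headline": ["Original headline", "Benefit-led headline"],
    "cta_text": ["Contact", "Get a quote"],
    "hero_image": ["product", "lifestyle", "none"],
}


def full_factorial(elements):
    """Return every possible combination of element options as a list of dicts."""
    names = list(elements)
    return [dict(zip(names, combo)) for combo in product(*elements.values())]


if __name__ == "__main__":
    combinations = full_factorial(ELEMENTS)
    print(len(combinations), "page versions to test")
    for version in combinations[:3]:
        print(version)
```

Here 2 headlines, 2 button texts and 3 images already produce 12 page versions, and each one needs enough visitors of its own. This is why multivariate testing requires far more traffic than a simple A/B test.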
What is Server-Side vs Client-Side Testing? “Server-side” A/B testing tools run on the page’s server. The page is compiled by the server, and is presented to the browser in its finished form. “Client-side” tools make their changes in the visitor’s browser instead. In the same way that some PC software is only available for Windows or for Macs, some server-side software is only available for particular server environments (such as PHP).
What is Split Testing? Split testing is exactly the same as A/B testing, except you create separate URLs for each version of the page.
What is a Treatment in A/B testing? Your Treatment is the edited version of your page. It is the B version in a classic A/B test.
What is Uplift in A/B testing? Most A/B tests compare the conversion rate of one page against another. When your “B” page (the edited version) converts more frequently than your original page, the improvement is called “Uplift”. It is the percentage increase in your Conversion Rate.
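As a worked example, uplift can be calculated as the relative change between the two conversion rates. The function name and the figures below are illustrative.

```python
def uplift_percent(control_rate, treatment_rate):
    """Relative improvement of the Treatment over the Control, in percent."""
    if control_rate <= 0:
        raise ValueError("control_rate must be greater than zero")
    return (treatment_rate - control_rate) / control_rate * 100


if __name__ == "__main__":
    # Hypothetical result: the Control converts at 4%, the Treatment at 5%.
    print(f"{uplift_percent(0.04, 0.05):.1f}% uplift")
```

Note that a rise from 4% to 5% is a 25% uplift, even though the conversion rate itself has only increased by one percentage point.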