When gearing up to run a test on your website, it’s critical to understand the errors you may come across when analyzing the results of your experiment. Why? Because in statistical hypothesis testing, no test can ever be 100% decisive.
When running a website test, we seek out statistically significant results, meaning results that are unlikely to be due to chance alone. The practical purpose of this is to allow the results to be attributed to a specific cause (e.g. a change made to our site) with a high level of confidence.
When testing, marketers should aim for statistical significance at a significance level of 5% (a P-value below 0.05), which corresponds to a confidence level of 95%. This means that if the change truly made no difference, there would be only a 5% chance of seeing a result this strong by chance alone. It does not mean there is a 95% chance that the result is correct. This is the threshold implied when “statistical significance” is used throughout this article.
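To make the 5% threshold concrete, here is a minimal sketch of how a P-value for two conversion rates can be computed with a pooled two-proportion z-test. The function name, the visitor counts, and the conversion counts are all made up for illustration, and most testing tools will run a calculation like this for you (possibly with a different method).

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided P-value for a difference in conversion rates (pooled z-test)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under the null hypothesis both arms share one conversion rate
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


ALPHA = 0.05  # the 5% significance level used in this article

# Hypothetical data: 10,000 visitors per arm, 500 vs. 570 conversions
p_value = two_proportion_p_value(500, 10_000, 570, 10_000)
print(f"P-value: {p_value:.4f}, significant at 5%: {p_value < ALPHA}")
```

A P-value below `ALPHA` is what this article means by a statistically significant result. The sketch assumes both arms have at least some conversions and some non-conversions, since otherwise the standard error is zero.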
Even though hypothesis tests are considered reliable, random variation means errors can still occur, leading to false positives and false negatives. These two kinds of error are called type 1 and type 2. Keep reading to learn what they are and how to avoid them.
A type 1 error is a false positive: you reject your null hypothesis because you believe your change made a difference when it really didn’t. Sometimes a false positive occurs purely by chance (at a 5% significance level, roughly 1 in 20 tests of a change with no real effect will still look significant), or there may be another variable that you didn’t originally account for that affects the outcome.
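One way to see the chance-driven kind of false positive is to simulate many A/A tests, where both arms get the identical experience, and count how often the test still comes back "significant". The sketch below is illustrative only: the 10% conversion rate, the 1,000 visitors per arm, and the function names are assumptions, not figures from a real site.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided P-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (conv_b / n_b - conv_a / n_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(true_rate=0.10, visitors=1000, runs=1000,
                        alpha=0.05, seed=42):
    """Share of simulated A/A tests that look significant by chance."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(runs):
        # Both arms convert at the same true rate: any "winner" is a type 1 error
        conv_a = sum(rng.random() < true_rate for _ in range(visitors))
        conv_b = sum(rng.random() < true_rate for _ in range(visitors))
        if p_value(conv_a, visitors, conv_b, visitors) < alpha:
            false_positives += 1
    return false_positives / runs


rate = false_positive_rate()
print(f"A/A tests declared significant: {rate:.1%}")
```

The printed share should land near the 5% significance level, which is the type 1 error rate you accept by choosing that threshold. Note that the simulation only covers random false positives; it says nothing about unaccounted-for variables.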
For example, suppose you run an A/B test on your ecommerce website over a period that overlaps with a winter holiday. Because online shoppers’ habits differ during this period, you may be led to believe that a certain variation is a winner. In reality, however, had you run the test during a steadier period, the results may have shown little to no change.
Unfortunately, there is no way to completely avoid type 1 errors. That said, here are a few tips you can implement to reduce the likelihood of a type 1 error during your next website test:
A type 2 error is essentially a false negative, meaning you’ve failed to reject the null hypothesis even though there really is a difference between the control and the variation. This can occur when your sample size is too small, which leaves the test without enough statistical power to detect the difference.
Type 2 errors can also have serious ramifications for your experimentation program. You’re not only missing out on learnings and valuable customer insights but, more importantly, a type 2 error could potentially send your testing roadmap in the wrong direction.
Let’s say that you’re interested in running an A/B test on your B2B website to increase the number of demo requests your company receives. In the variation, you’ve chosen to change the color of the demo request button on your homepage from blue to green. After running the test for four days, you see no clear winner and stop the test.
The following quarter you try the test out again, except this time you leave the test running for fourteen days, covering two full business cycles. Much to your surprise, this time around the green button is the clear winner!
What happened? Likely, the first time you ran the test you encountered a false negative because your sample size was not large enough.
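A rough sample size estimate before launch can show whether a four-day test ever had a realistic chance. The sketch below uses a standard normal-approximation formula for comparing two conversion rates. The 2.0% baseline, the 2.5% target, the 80% power setting, and the daily traffic figure are all assumptions chosen for illustration; none of them come from the example above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def visitors_per_variation(baseline_rate, expected_rate, alpha=0.05, power=0.80):
    """Approximate visitors needed in each arm to detect the given lift."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)          # 1 - type 2 error rate
    avg_rate = (baseline_rate + expected_rate) / 2
    numerator = (
        z_alpha * sqrt(2 * avg_rate * (1 - avg_rate))
        + z_power * sqrt(baseline_rate * (1 - baseline_rate)
                         + expected_rate * (1 - expected_rate))
    ) ** 2
    return ceil(numerator / (expected_rate - baseline_rate) ** 2)


# Hypothetical inputs: 2.0% of visitors request a demo today, hoping for 2.5%
needed = visitors_per_variation(0.020, 0.025)

# Assumed traffic: 1,000 visitors per day reaching each variation
days = ceil(needed / 1000)
print(f"Visitors needed per variation: {needed:,} (about {days} days)")
```

With these assumed inputs, the required sample is well beyond what four days of traffic would supply, which is the situation that tends to produce a false negative. Smaller expected lifts or higher power targets push the number up further.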
As with type 1 errors, it is not possible to entirely eliminate the possibility of a type 2 error in your website tests. But there are ways to reduce the likelihood of type 2 errors. Here’s how:
Understanding how these errors function in the world of statistical hypothesis testing will allow you to keep an informed and watchful eye over every website test you run. Happy testing!