Long-run frequency is the frequentist interpretation of probability: the probability of an event is the limiting proportion of times it would occur if an experiment were repeated indefinitely under identical conditions. Probability in this sense describes how outcomes behave over many trials, not a subjective degree of belief.
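To make the definition concrete, here is a minimal simulation sketch. It assumes a hypothetical event with a true probability of 0.10 (say, a visitor converting) and tracks the proportion of occurrences as the number of trials grows. The function name, the rate, and the seed are illustrative choices, not taken from any particular testing tool.

```python
import random


def running_proportion(p, n_trials, seed=0):
    """Simulate n_trials independent trials with success probability p.

    Returns a list whose i-th entry is the proportion of successes
    observed after the first i + 1 trials.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = 0
    proportions = []
    for i in range(1, n_trials + 1):
        if rng.random() < p:
            successes += 1
        proportions.append(successes / i)
    return proportions


# Assumed true probability of 0.10, chosen only for illustration.
props = running_proportion(p=0.10, n_trials=100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7,} trials: observed proportion = {props[n - 1]:.4f}")
```

Early proportions can sit far from 0.10, while later ones settle close to it. That settling value is what the frequentist interpretation calls the probability of the event.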
This concept underlies the frequentist statistical approaches commonly used in A/B testing, where probabilities are viewed as objective properties of repeatable experiments. For example, a 95% confidence interval comes from a procedure that, if you ran the same test 100 times on fresh random samples, would produce intervals containing the true parameter value in about 95 of them. The 95% describes the procedure across repeated sampling, not the probability that any single computed interval contains the true value.
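The repeated-sampling reading of a confidence interval can be checked with a short simulation. The sketch below assumes a known true conversion rate of 0.10 and 2,000 visitors per experiment, and it uses the simple normal-approximation (Wald) interval for a proportion. Real A/B testing tools may compute their intervals differently, so treat this as an illustration rather than any tool's method.

```python
import math
import random

Z_95 = 1.96  # approximate two-sided 95% quantile of the standard normal


def wald_interval(successes, n, z=Z_95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def coverage(true_rate, n, n_experiments, seed=0):
    """Share of repeated experiments whose interval contains true_rate."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # One experiment: n visitors, each converting with probability true_rate.
        successes = sum(rng.random() < true_rate for _ in range(n))
        lower, upper = wald_interval(successes, n)
        if lower <= true_rate <= upper:
            hits += 1
    return hits / n_experiments


# Assumed values, chosen only for illustration.
cov = coverage(true_rate=0.10, n=2_000, n_experiments=1_000)
print(f"share of intervals containing the true rate: {cov:.3f}")
```

The printed share should land near 0.95. Each individual interval either contains 0.10 or it does not; the 95% figure only emerges across the full set of repetitions.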
Understanding long-run frequency is essential for correctly interpreting A/B test statistics like p-values and confidence intervals. It clarifies that statistical significance relates to what would happen across many repetitions of the experiment, not the probability that a specific result is true. This interpretation prevents common misunderstandings about what confidence levels actually mean in test results.
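When an A/B testing tool reports a 95% confidence level, it describes the long-run behavior of the testing procedure. If you repeated this exact test 100 times with different random samples, approximately 95 of the resulting confidence intervals would contain the true difference between the variations. Equivalently, if the variations truly performed the same, only about 5 of those 100 tests would wrongly declare a winner. It does not mean there's a 95% probability that Variation B is the winner in this specific test.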
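One long-run statement that a 95% confidence level does support is the false positive rate: when two variations truly perform the same, roughly 5% of repeated tests still cross the significance threshold. The sketch below simulates such "A/A" tests with a pooled two-proportion z-test. The shared conversion rate, sample size, and function names are assumptions made for illustration.

```python
import math
import random


def two_proportion_p_value(x_a, n_a, x_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (x_b / n_b - x_a / n_a) / se
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(rate, n_per_arm, n_tests, alpha=0.05, seed=0):
    """Share of A/A tests (identical arms) that come out 'significant'."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_tests):
        # Both arms share the same true rate, so any "winner" is a false positive.
        x_a = sum(rng.random() < rate for _ in range(n_per_arm))
        x_b = sum(rng.random() < rate for _ in range(n_per_arm))
        if two_proportion_p_value(x_a, n_per_arm, x_b, n_per_arm) < alpha:
            false_positives += 1
    return false_positives / n_tests


# Assumed values, chosen only for illustration.
fpr = false_positive_rate(rate=0.10, n_per_arm=2_000, n_tests=1_000)
print(f"share of A/A tests declared significant: {fpr:.3f}")
```

The share should come out near 0.05. This is a property of the procedure over many repetitions and says nothing about how likely a particular declared winner is to be real.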
Apply the long-run frequency interpretation once you have chosen a primary metric and collected enough traffic for a reliable read. Do not interpret a p-value or confidence level in isolation; weigh it alongside effect size, practical impact, and whether the test ran long enough to cover normal traffic patterns.
A common mistake is treating a long-run frequency statement, such as a confidence level or p-value, as a yes-or-no shortcut while ignoring sample size, test duration, and practical business impact. A statistically significant result can still be too small, too noisy, or too risky to ship.