Create A/B tests by chatting with AI and launch them on your website within minutes.

Try it for FREE now

P-hacking

Quick answer

P-hacking, also known as data dredging, is a method in which data is manipulated or selection criteria are modified until a desired statistical result, typically a statistically significant result, is achieved. It involves testing numerous hypotheses on a particular dataset until the data appears to support one.

Key takeaways

  • P-hacking helps evaluate whether an experiment result is reliable enough to act on.
  • It should be reviewed together with sample size, duration, effect size, and business impact.
  • It is most useful when the hypothesis and primary metric are defined before the test starts.

Definition

P-hacking, also known as data dredging, is a method in which data is manipulated or selection criteria are modified until a desired statistical result, typically a statistically significant result, is achieved. It involves testing numerous hypotheses on a particular dataset until the data appears to support one. This practice can lead to misleading findings or exaggerated statistical significance.

What P-hacking means in A/B testing

In an A/B testing workflow, P-hacking is part of the statistical layer that helps explain whether a result is trustworthy. It is most useful when paired with a clear hypothesis, a primary metric, enough traffic, and a pre-defined decision rule.

Why P-hacking matters

P-hacking matters because it helps teams separate real experiment signals from random noise. It should be interpreted alongside sample size, test duration, traffic quality, and the business value of the metric being measured.

Example of P-hacking

For example, a team testing a new pricing-page headline may see a higher sign-up rate in the variant. P-hacking helps the team judge whether that lift is strong enough to trust or whether they should keep collecting data before making a decision.

How to use P-hacking

Use P-hacking after you have chosen a primary metric and collected enough traffic for a reliable read. Avoid checking it in isolation; compare it with effect size, confidence, practical impact, and whether the test ran long enough to cover normal traffic patterns.

Common mistake

A common mistake is treating P-hacking as a yes-or-no shortcut while ignoring sample size, test duration, and practical business impact. A statistically interesting result can still be too small, too noisy, or too risky to ship.

Related A/B testing terms

FAQ

What does p-hacking mean in A/B testing?

P-hacking, also known as data dredging, is a method in which data is manipulated or selection criteria are modified until a desired statistical result, typically a statistically significant result, is achieved. It involves testing numerous hypotheses on a particular dataset until the data appears to support one.

Why does p-hacking matter for experiments?

P-hacking matters because it helps teams separate real experiment signals from random noise. It should be interpreted alongside sample size, test duration, traffic quality, and the business value of the metric being measured.

How should teams use p-hacking in an experiment?

Use P-hacking after you have chosen a primary metric and collected enough traffic for a reliable read. Avoid checking it in isolation; compare it with effect size, confidence, practical impact, and whether the test ran long enough to cover normal traffic patterns.

Download our free 100 point Ecommerce CRO Checklist

This comprehensive checklist covers all critical pages, from homepage to checkout, giving you actionable steps to boost sales and revenue.