In Null Hypothesis Significance Tests, the p-value is the probability of observing an effect at least as extreme as the measured metric delta, under the assumption that the null hypothesis is true. In practice, a p-value lower than your pre-defined Type I Error threshold (α) is treated as evidence of a true effect.

The methodology used for p-value calculation depends on the number of degrees of freedom (ν). A two-sample z-test is appropriate for most experiments. Welch’s t-test is used for smaller experiments with ν < 100. In both cases, the p-value depends on the metric mean and variance computed for the test and control groups.
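The z-statistic (a.k.a. z-score) of a two-sample z-test can be written in several equivalent forms:

$$
Z = \frac{\overline{X}_t - \overline{X}_c}{\sqrt{\mathrm{var}(X_t) + \mathrm{var}(X_c)}} = \frac{\overline{X}_t - \overline{X}_c}{\sqrt{\mathrm{var}(\Delta X)}} = \frac{\overline{X}_t - \overline{X}_c}{\sqrt{\sigma_{X_t}^2 + \sigma_{X_c}^2}}
$$

where: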
$Z$ is the observed z-statistic (not the z-critical value $Z_{\alpha/2}$)
$\mathrm{var}(\Delta X)$ is the variance of the absolute delta of means
$\mathrm{var}(X_i)$ is the variance of the sample mean of either the control or the treatment group (details here)
$\sigma_{X_i}$ is the standard error of the mean of either the control or the treatment group (these are the terms you can find in Pulse under the Statistics tab of a metric)
The two-sided p-value is obtained from the standard normal cumulative distribution function:

$$
p\text{-value} = 2 \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-|Z|} e^{-t^2/2} \, dt
$$
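As a minimal sketch of the two-sided calculation, the snippet below computes the z-statistic from per-group means and standard errors and then takes twice the lower-tail probability at −|Z|. The function name `two_sided_z_test`, its parameters, and the input numbers are illustrative assumptions, not part of Pulse.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(mean_t, mean_c, se_t, se_c):
    """Return (z, p_value) from group means and standard errors of the mean."""
    # The variance of the delta of means is the sum of squared standard errors.
    var_delta = se_t ** 2 + se_c ** 2
    z = (mean_t - mean_c) / sqrt(var_delta)
    # Two-sided p-value: twice the standard normal CDF evaluated at -|Z|.
    p_value = 2.0 * NormalDist().cdf(-abs(z))
    return z, p_value


# Hypothetical inputs, chosen only for illustration.
z, p = two_sided_z_test(mean_t=10.4, mean_c=10.0, se_t=0.15, se_c=0.12)
print(f"z = {z:.3f}, p-value = {p:.4f}")
```

Working from standard errors mirrors the third form of the z-statistic, so the values shown for a metric can be plugged in directly.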
For smaller sample sizes, Welch’s t-test is preferred because it yields lower false positive rates when the group sizes and variances are unequal. In Pulse, Welch’s t-test is automatically applied when the degrees of freedom ν < 100.

We compute the t-statistic (a.k.a. t-score) in the same way as the two-sample z-statistic above. Additionally, we compute the degrees of freedom ν using:

$$
\nu = \frac{\left(\mathrm{var}(X_t) + \mathrm{var}(X_c)\right)^2}{\dfrac{\mathrm{var}(X_t)^2}{N_t - 1} + \dfrac{\mathrm{var}(X_c)^2}{N_c - 1}} = \frac{\mathrm{var}(\Delta X)^2}{\dfrac{\mathrm{var}(X_t)^2}{N_t - 1} + \dfrac{\mathrm{var}(X_c)^2}{N_c - 1}}
$$

where $N_t$ and $N_c$ are the sample sizes of the treatment and control groups. The p-value is then obtained from the t-distribution with ν degrees of freedom.
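The degrees-of-freedom formula and the t-distribution lookup can be sketched with the standard library alone. The helper names below are hypothetical, and the p-value is approximated by integrating the t density with Simpson's rule, which is only an illustration suited to moderate |t|; in practice a library routine such as `scipy.stats.t.sf` would normally be used instead.

```python
from math import exp, lgamma, log, pi, sqrt


def welch_degrees_of_freedom(var_t, var_c, n_t, n_c):
    """Degrees of freedom from the variances of the two sample means."""
    numerator = (var_t + var_c) ** 2
    denominator = var_t ** 2 / (n_t - 1) + var_c ** 2 / (n_c - 1)
    return numerator / denominator


def t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_norm = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    return exp(log_norm - (nu + 1) / 2 * log(1 + x * x / nu))


def two_sided_t_p_value(t_stat, nu, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, nu) + t_pdf(upper, nu)
    for i in range(1, steps):
        # Simpson weights: 4 for odd interior points, 2 for even ones.
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    central_mass = 2.0 * total * h / 3.0  # P(-|t| <= T <= |t|)
    return max(0.0, 1.0 - central_mass)


# Hypothetical inputs: variances of the sample means and group sizes.
var_t, var_c = 0.15 ** 2, 0.12 ** 2
nu = welch_degrees_of_freedom(var_t, var_c, n_t=30, n_c=25)
t_stat = (10.4 - 10.0) / sqrt(var_t + var_c)
print(f"nu = {nu:.1f}, t = {t_stat:.3f}, p = {two_sided_t_p_value(t_stat, nu):.4f}")
```

Note that `var_t` and `var_c` here are variances of the sample means (squared standard errors), matching the definition of var(X) used for the z-statistic, not the raw per-unit variances.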
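The procedure for a one-sided z-test computes the z-statistic Z in the same way as the two-sided test above.

The one-sided p-value is obtained from the standard normal cumulative distribution function as well, but with slight differences:

$$
p\text{-value} =
\begin{cases}
1 - \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{Z} e^{-t^2/2} \, dt & \text{if right-hand test} \\[2ex]
\dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{Z} e^{-t^2/2} \, dt & \text{if left-hand test}
\end{cases}
$$

where: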
Z is computed as in the two-sided test above. Note that the one-sided p-value uses the signed z-statistic, not its absolute value as in the two-sided p-value.
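The two cases can be sketched as a small function that takes the signed z-statistic and the direction of the test. The name `one_sided_z_p_value` and the `direction` argument are assumptions made for this example.

```python
from statistics import NormalDist


def one_sided_z_p_value(z, direction):
    """One-sided p-value from the signed z-statistic.

    direction: "right" for a right-hand test, "left" for a left-hand test.
    """
    lower_tail = NormalDist().cdf(z)  # standard normal CDF at the signed Z
    if direction == "right":
        return 1.0 - lower_tail
    if direction == "left":
        return lower_tail
    raise ValueError("direction must be 'right' or 'left'")


# A positive Z gives a small p-value for a right-hand test and a large
# one for a left-hand test, which is why the sign must be preserved.
print(one_sided_z_p_value(2.0, "right"), one_sided_z_p_value(2.0, "left"))
```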