T-Test Calculator
Perform one-sample, two-sample (independent), or paired t-tests. Calculate t-statistics, p-values, degrees of freedom, and confidence intervals. See also our Z-Test Calculator, P-Value Calculator, and Confidence Interval Calculator.
How to Use the T-Test Calculator
The Student's t-test is used to determine whether there is a statistically significant difference between means. Choose one-sample to compare a sample mean against a known population mean, two-sample (independent) to compare means from two separate groups, or paired to compare means from the same group measured twice (before/after studies).
Enter your summary statistics: sample mean, standard deviation, and sample size. For two-sample tests, provide statistics for both groups. For paired tests, enter the mean difference, standard deviation of differences, and number of pairs. Select your significance level and tail type, then click Calculate to get the t-statistic, p-value, and confidence interval.
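If you are starting from raw measurements rather than summary statistics, the inputs can be derived first. The following is a minimal sketch using Python's standard library. The readings are made-up illustration data, and the variable names are arbitrary.

```python
import statistics

# Hypothetical before/after readings for five subjects (illustration only).
before = [142, 138, 150, 145, 139]
after = [135, 136, 141, 140, 134]

# Paired test inputs are computed on the per-subject differences.
diffs = [b - a for b, a in zip(before, after)]
mean_diff = statistics.mean(diffs)   # mean difference
sd_diff = statistics.stdev(diffs)    # sample SD (n - 1 denominator)
n_pairs = len(diffs)                 # number of pairs

print(mean_diff, round(sd_diff, 3), n_pairs)
```

For a one-sample or two-sample test, the same two functions would be applied to each group's raw values instead of to the differences.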
The t-test assumes approximately normal data (it is fairly robust to non-normality for n > 30, by the central limit theorem) and independent observations. For the two-sample test, the Welch approximation is used, which handles unequal variances. If your sample size is large (n > 30) and the population standard deviation is known, consider using a z-test instead.
Formula
One-Sample t-test:
t = (x̄ - μ₀) / (s / √n)
df = n - 1
Two-Sample t-test (Welch's):
t = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)
df = (s₁²/n₁ + s₂²/n₂)² / [(s₁²/n₁)²/(n₁-1) + (s₂²/n₂)²/(n₂-1)]
Paired t-test:
t = d̄ / (s_d / √n)
df = n - 1
Confidence Interval:
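CI = estimate ± t*(α/2, df) × SE
where the estimate is x̄ (one-sample), x̄₁ - x̄₂ (two-sample), or d̄ (paired), and SE is the denominator of the corresponding t formula.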
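The test-statistic formulas above translate directly into code. The sketch below works from summary statistics only; the function names are illustrative and are not part of the calculator.

```python
import math

def one_sample_t(mean, mu0, sd, n):
    """Return (t, df) for a one-sample t-test from summary statistics."""
    se = sd / math.sqrt(n)
    return (mean - mu0) / se, n - 1

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's two-sample t-test."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2   # squared standard error of each mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom (usually non-integer)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

def paired_t(mean_diff, sd_diff, n_pairs):
    """Return (t, df) for a paired t-test (a one-sample test on differences)."""
    return one_sample_t(mean_diff, 0.0, sd_diff, n_pairs)
```

Treating the paired test as a one-sample test on the differences keeps the three cases consistent with the formulas above.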
Example Calculation
A researcher tests whether a sample of 30 students has a mean IQ different from 100:
Given: x̄ = 105, s = 15, n = 30, μ₀ = 100
SE = 15 / √30 = 15 / 5.477 = 2.739
t = (105 - 100) / 2.739 = 1.826
df = 30 - 1 = 29
p-value (two-tailed) ≈ 0.0783
At α = 0.05: Fail to reject H₀ (p > 0.05)
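95% CI for the mean: 105 ± 2.045 × 2.739 = [99.40, 110.60], using t*(0.025, 29) = 2.045. The interval contains 100, which agrees with the test decision.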
The evidence is insufficient to conclude the mean differs from 100.
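The p-value in this example can be checked without a statistics library by integrating the t-density numerically. This is a rough sketch, assuming Simpson's rule with 2000 steps is accurate enough for illustration. The helper names are hypothetical, and the critical value 2.045 is taken from a standard t-table rather than computed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3          # P(0 < T < |t|)
    return 1.0 - 2.0 * central_half

se = 15 / math.sqrt(30)                   # standard error of the mean
t = (105 - 100) / se                      # one-sample t-statistic
p = two_tailed_p(t, 29)

t_crit = 2.045                            # two-tailed 5% critical value, df = 29
ci = (105 - t_crit * se, 105 + t_crit * se)
print(round(t, 3), round(p, 3), [round(x, 2) for x in ci])
```

In practice a statistics package would supply the distribution functions. The integration here only makes the definition of the p-value concrete.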
T Critical Values Reference Table
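| df | Two-tailed α = 0.10 (one-tailed 0.05) | Two-tailed α = 0.05 (one-tailed 0.025) | Two-tailed α = 0.02 (one-tailed 0.01) | Two-tailed α = 0.01 (one-tailed 0.005) |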
|---|---|---|---|---|
| 1 | 6.314 | 12.706 | 31.821 | 63.657 |
| 2 | 2.920 | 4.303 | 6.965 | 9.925 |
| 5 | 2.015 | 2.571 | 3.365 | 4.032 |
| 10 | 1.812 | 2.228 | 2.764 | 3.169 |
| 15 | 1.753 | 2.131 | 2.602 | 2.947 |
| 20 | 1.725 | 2.086 | 2.528 | 2.845 |
| 25 | 1.708 | 2.060 | 2.485 | 2.787 |
| 30 | 1.697 | 2.042 | 2.457 | 2.750 |
| 60 | 1.671 | 2.000 | 2.390 | 2.660 |
| ∞ | 1.645 | 1.960 | 2.326 | 2.576 |
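When the degrees of freedom fall between table rows, a common conservative convention is to use the nearest tabulated df that does not exceed the actual value, since smaller df gives a larger critical value. A minimal sketch of that lookup follows. It uses only the column containing 12.706 for df = 1, which corresponds to a two-tailed significance level of 0.05, and the function name is illustrative.

```python
import math

# Two-tailed alpha = 0.05 critical values from the table (df -> t critical).
T_CRIT_05 = {1: 12.706, 2: 4.303, 5: 2.571, 10: 2.228, 15: 2.131,
             20: 2.086, 25: 2.060, 30: 2.042, 60: 2.000, math.inf: 1.960}

def conservative_t_crit(df):
    """Critical value from the largest tabulated df that does not exceed df."""
    if df < 1:
        raise ValueError("df must be at least 1")
    row = max(k for k in T_CRIT_05 if k <= df)
    return T_CRIT_05[row]

print(conservative_t_crit(37.3))   # falls back to the df = 30 row: 2.042
```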
Step-by-Step Decision Process
- State hypotheses: H0: There is no difference (means are equal). H1: There is a difference (one-tailed or two-tailed).
- Check assumptions: Data are continuous and approximately normal (or n > 30), and observations are independent.
- Choose the test type: One-sample (compare to known value), two-sample independent (compare two groups), or paired (before/after on same subjects).
- Set significance level: Typically alpha = 0.05. Choose before collecting data.
- Calculate test statistic: t = (observed difference - hypothesized difference) / standard error.
- Find p-value: Compare t-statistic to the t-distribution with appropriate degrees of freedom.
- Make decision: If p < alpha, reject H0. If p ≥ alpha, fail to reject H0.
- Report effect size: Calculate Cohen's d = (mean difference) / pooled SD. Small: 0.2, Medium: 0.5, Large: 0.8.
Additional Solved Examples
Example: Paired T-Test for Blood Pressure Medication
A study measures systolic blood pressure before and after a new medication in 12 patients. The mean reduction is 8.5 mmHg with SD of differences = 6.2. Does the medication significantly lower blood pressure at alpha = 0.05?
H0: mean_d = 0 (no reduction)
H1: mean_d > 0 (reduction exists, one-tailed)
t = d_bar / (s_d / sqrt(n)) = 8.5 / (6.2/sqrt(12))
t = 8.5 / 1.790 = 4.749
df = 12 - 1 = 11
t_critical (0.05, 11, one-tail) = 1.796
Since 4.749 > 1.796, reject H0
p-value < 0.001
Cohen's d = 8.5/6.2 = 1.37 (large effect)
Answer: The medication produces a statistically significant reduction in blood pressure (t(11) = 4.75, p < 0.001). The effect size (d = 1.37) is large by Cohen's benchmarks.
Example: Two-Sample T-Test for Teaching Methods
Method A (n=18): mean = 74.2, SD = 8.5. Method B (n=22): mean = 79.8, SD = 9.1. Do the two methods differ significantly at alpha = 0.05 (two-tailed)?
SE = sqrt(8.5^2/18 + 9.1^2/22) = sqrt(4.014 + 3.764) = sqrt(7.778) = 2.789
t = (79.8 - 74.2) / 2.789 = 5.6 / 2.789 = 2.008
Welch df = (7.778)^2 / (4.014^2/17 + 3.764^2/21) = 60.49/1.622 = 37.3
t_critical (0.05, 37, two-tail) = 2.026
Since 2.008 < 2.026, fail to reject H0 (barely)
p-value = 0.052
Answer: The difference is not statistically significant at alpha = 0.05 (t(37) = 2.01, p = 0.052), though it is borderline. With a larger sample, this difference might reach significance. Report the effect size: d = 5.6/8.83 = 0.63 (medium).
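The pooled SD of 8.83 used for the effect size is not derived in the example. Assuming the usual pooled-variance formula, which weights each group's variance by its degrees of freedom, it can be reproduced as follows.

```python
import math

n_a, mean_a, sd_a = 18, 74.2, 8.5   # Method A
n_b, mean_b, sd_b = 22, 79.8, 9.1   # Method B

# Pooled variance: each group's variance weighted by (n - 1).
pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
pooled_sd = math.sqrt(pooled_var)

# Cohen's d: mean difference in units of the pooled SD.
d = (mean_b - mean_a) / pooled_sd
print(round(pooled_sd, 2), round(d, 2))
```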
Interpreting Results
How to Write Up T-Test Results
A proper report includes: test type, t-statistic, degrees of freedom, p-value, confidence interval, and effect size.
Good example: "An independent samples t-test revealed a significant difference between groups, t(38) = 2.45, p = 0.019, 95% CI [0.82, 8.74], d = 0.77."
Poor example: "The p-value was 0.019 so the result is significant." (Missing context, effect size, and CI.)
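A small formatting helper can make it harder to leave out any of these elements. This sketch follows the layout of the good example. The function name and the rule of reporting very small p-values as "p < 0.001" are assumptions, not a prescribed standard.

```python
def format_t_report(t, df, p, ci_low, ci_high, d):
    """Format t-test results as a single report string."""
    # Integer df prints as-is; Welch's non-integer df gets one decimal.
    df_text = f"{df:.0f}" if float(df).is_integer() else f"{df:.1f}"
    p_text = "p < 0.001" if p < 0.001 else f"p = {p:.3f}"
    return (f"t({df_text}) = {t:.2f}, {p_text}, "
            f"95% CI [{ci_low:.2f}, {ci_high:.2f}], d = {d:.2f}")

print(format_t_report(2.45, 38, 0.019, 0.82, 8.74, 0.77))
# t(38) = 2.45, p = 0.019, 95% CI [0.82, 8.74], d = 0.77
```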
Statistical vs Practical Significance
A result can be statistically significant but practically meaningless. With n = 10,000, even a 0.1-point difference in test scores might be "significant" (p < 0.05) but too small to matter educationally. Always report effect size (Cohen's d) alongside the p-value to assess whether the difference is meaningful in context.
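To make this concrete, the sketch below assumes two groups of 10,000 each and a common standard deviation of 3 points. Both the per-group reading of n and the SD are assumptions chosen for illustration; the text does not specify them.

```python
import math

n = 10_000      # ASSUMED to be the size of each of two groups
diff = 0.1      # difference in mean test scores
sd = 3.0        # ASSUMED common standard deviation (not given in the text)

se = math.sqrt(sd**2 / n + sd**2 / n)   # standard error of the difference
t = diff / se                           # two-sample t-statistic
d = diff / sd                           # Cohen's d when both SDs are equal

print(round(t, 2), round(d, 3))
```

Under these assumptions the t-statistic exceeds the large-sample two-tailed 5% critical value of 1.960, while Cohen's d stays far below the 0.2 threshold for a small effect. With a larger SD the same difference would not reach significance.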
Key Takeaways
- The t-test compares means when population standard deviation is unknown (which is nearly always in practice).
- Use Welch's t-test for two independent samples - it handles unequal variances and is the safer default.
- The t-distribution approaches the normal distribution as degrees of freedom increase. The two are close for df > 30, though critical values still differ slightly (2.042 vs 1.960 at df = 30 for two-tailed alpha = 0.05).
- Always report confidence intervals and effect sizes alongside p-values for complete interpretation.
- A non-significant result does not prove the null hypothesis - it means insufficient evidence to reject it.
Frequently Asked Questions
What is a t-test?
A t-test is a statistical hypothesis test used to determine whether there is a significant difference between the means of one or two groups. It uses the t-distribution, which accounts for the extra uncertainty when the population standard deviation is unknown and must be estimated from the sample. The test produces a t-statistic and a p-value; the p-value measures how surprising the observed difference would be if the null hypothesis were true.
When should I use a t-test vs a z-test?
Use a t-test when the population standard deviation is unknown (estimated from sample data) or when sample sizes are small (n < 30). Use a z-test when the population standard deviation is known and the sample size is large. In practice, the t-test is almost always preferred because population σ is rarely known. As sample size increases, the t-distribution approaches the standard normal distribution.
What is the difference between one-tailed and two-tailed tests?
A two-tailed test checks for any difference (H₁: μ ≠ μ₀), while a one-tailed test checks for a specific direction (H₁: μ > μ₀ or H₁: μ < μ₀). Two-tailed tests are more conservative: when the observed effect lies in the hypothesized direction, the two-tailed p-value is double the one-tailed p-value. Use one-tailed only when you have a strong prior reason to test in one direction and would not act on a difference in the other direction.
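Because the t-distribution is symmetric, a one-tailed p-value can be obtained from a two-tailed one, but the result depends on whether the observed effect points in the predicted direction. The following sketch uses a hypothetical helper, with the teaching-methods numbers (t = 2.008, two-tailed p = 0.052) as input.

```python
def one_tailed_p(p_two_tailed, t, alternative):
    """Convert a two-tailed p-value to a one-tailed one for a symmetric test.

    alternative is "greater" (H1: effect > 0) or "less" (H1: effect < 0).
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    in_predicted_direction = (t > 0) == (alternative == "greater")
    half = p_two_tailed / 2
    # Effects in the wrong direction get the complementary tail area.
    return half if in_predicted_direction else 1 - half

print(round(one_tailed_p(0.052, 2.008, "greater"), 3))   # 0.026
print(round(one_tailed_p(0.052, 2.008, "less"), 3))      # 0.974
```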
What assumptions does the t-test require?
The t-test assumes: (1) data are continuous, (2) observations are independent, (3) data are approximately normally distributed (less critical for n > 30 due to CLT), and (4) for two-sample tests, groups are independent. Welch's t-test does not require equal variances. Violations of normality can be addressed with non-parametric alternatives like the Mann-Whitney U test or Wilcoxon signed-rank test.
What is Welch's t-test?
Welch's t-test is a modification of the two-sample t-test that does not assume equal variances between groups. It adjusts the degrees of freedom using the Welch-Satterthwaite equation, resulting in a non-integer df. It is generally recommended over the pooled (Student's) t-test because it performs well regardless of whether variances are equal, with minimal loss of power when they are.
How do I interpret the p-value from a t-test?
The p-value is the probability of observing a t-statistic as extreme as (or more extreme than) the one calculated, assuming the null hypothesis is true. If p < α (typically 0.05), reject H₀ and conclude the difference is statistically significant. A small p-value does not indicate a large effect — always report effect size (Cohen's d) and confidence intervals alongside p-values for complete interpretation.