T-Test Calculator

Perform one-sample, two-sample (independent), or paired t-tests. Calculate t-statistics, p-values, degrees of freedom, and confidence intervals. See also our Z-Test Calculator, P-Value Calculator, and Confidence Interval Calculator.

Test Type

Tail Type

Sample Mean

Sample Std Dev

Sample Size (n)

Population Mean (μ₀)

Significance Level (α)

How to Use the T-Test Calculator

The Student's t-test is used to determine whether there is a statistically significant difference between means. Choose one-sample to compare a sample mean against a known population mean, two-sample (independent) to compare means from two separate groups, or paired to compare means from the same group measured twice (before/after studies).

Enter your summary statistics: sample mean, standard deviation, and sample size. For two-sample tests, provide statistics for both groups. For paired tests, enter the mean difference, standard deviation of differences, and number of pairs. Select your significance level and tail type, then click Calculate to get the t-statistic, p-value, and confidence interval.
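
If you start from raw paired measurements rather than summary statistics, the three paired-test inputs can be derived as in the sketch below. The function name `paired_summary` and the readings are hypothetical, invented purely for illustration; the differences are taken as before minus after, so a positive mean is a reduction.

```python
from statistics import mean, stdev

def paired_summary(before, after):
    """Return (mean difference, SD of differences, number of pairs)."""
    if len(before) != len(after):
        raise ValueError("paired data need two lists of equal length")
    # Difference per subject: before minus after
    diffs = [b - a for b, a in zip(before, after)]
    # stdev() is the sample standard deviation (n - 1 in the denominator)
    return mean(diffs), stdev(diffs), len(diffs)

# Hypothetical readings, made up for this example only
before = [142, 150, 138, 147, 155, 149]
after = [135, 141, 136, 140, 143, 144]

d_bar, s_d, n_pairs = paired_summary(before, after)
print(d_bar, round(s_d, 3), n_pairs)
```

These three values are what the paired test expects as mean difference, standard deviation of differences, and number of pairs.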

The t-test assumes approximately normal data and independent observations. By the central limit theorem, the test is robust to moderate departures from normality once n > 30. For the two-sample test, the Welch approximation is used, so equal variances are not required. If your sample size is large (n > 30) and the population standard deviation is known, consider using a z-test instead.

Formula

One-Sample t-test:

t = (x̄ - μ₀) / (s / √n)

df = n - 1

Two-Sample t-test (Welch's):

t = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)

df = (s₁²/n₁ + s₂²/n₂)² / [(s₁²/n₁)²/(n₁-1) + (s₂²/n₂)²/(n₂-1)]

Paired t-test:

t = d̄ / (s_d / √n)

df = n - 1

Confidence Interval:

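CI = (point estimate) ± t*(α/2, df) × SE, where the point estimate is x̄ (one-sample), x̄₁ - x̄₂ (two-sample), or d̄ (paired), and SE is the denominator of the matching t formula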
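
The formulas above translate directly into code. The following is a minimal sketch, not the calculator's actual implementation; the function names are assumptions chosen for readability. Each function returns the t-statistic together with its degrees of freedom.

```python
from math import sqrt

def one_sample_t(mean, sd, n, mu0):
    """t = (mean - mu0) / (sd / sqrt(n)), df = n - 1."""
    se = sd / sqrt(n)
    return (mean - mu0) / se, n - 1

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's two-sample t with Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2  # per-group variance of the mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

def paired_t(mean_diff, sd_diff, n_pairs):
    """t = mean_diff / (sd_diff / sqrt(n)), df = n - 1."""
    return mean_diff / (sd_diff / sqrt(n_pairs)), n_pairs - 1

t, df = one_sample_t(105, 15, 30, 100)
print(round(t, 3), df)
```

Note that the Welch degrees of freedom are generally not an integer.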

Example Calculation

A researcher tests whether a sample of 30 students has a mean IQ different from 100:

Given: x̄ = 105, s = 15, n = 30, μ₀ = 100

SE = 15 / √30 = 15 / 5.477 = 2.739

t = (105 - 100) / 2.739 = 1.826

df = 30 - 1 = 29

p-value (two-tailed) ≈ 0.0783

At α = 0.05: Fail to reject H₀ (p > 0.05)

95% CI for the mean: 105 ± 2.045 × 2.739 = [99.399, 110.601], using t*(0.025, 29) = 2.045

The evidence is insufficient to conclude the mean differs from 100.
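
The p-value and the confidence interval in this example both come from the t-distribution with 29 degrees of freedom. As a rough sketch using only the Python standard library, the CDF can be approximated by integrating the t density with Simpson's rule, and the critical value found by bisection. The helper names `t_pdf`, `t_cdf`, and `t_ppf` are assumptions for this illustration; a statistics library would normally be used instead.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), via composite Simpson's rule on [0, |x|] (steps is even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_ppf(p, df):
    """Quantile for 0.5 < p < 1, found by bisection on t_cdf."""
    if not 0.5 < p < 1:
        raise ValueError("this sketch only handles 0.5 < p < 1")
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Values from the worked example
x_bar, s, n, mu0 = 105, 15, 30, 100
se = s / sqrt(n)
t = (x_bar - mu0) / se
df = n - 1
p_two_tailed = 2 * (1 - t_cdf(abs(t), df))
t_star = t_ppf(0.975, df)  # critical value for a 95% interval
ci_low, ci_high = x_bar - t_star * se, x_bar + t_star * se
print(round(t, 3), round(p_two_tailed, 3), round(ci_low, 2), round(ci_high, 2))
```

The two-tailed p-value comes out near 0.078, and the interval for the mean runs from about 99.4 to 110.6. It contains 100, which agrees with the decision not to reject H₀.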

T Critical Values Reference Table

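Subscripts are the two-tailed α. For a one-tailed test the same columns correspond to α = 0.05, 0.025, 0.01, and 0.005.

df	t₀.₁₀	t₀.₀₅	t₀.₀₂	t₀.₀₁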
1	6.314	12.706	31.821	63.657
2	2.920	4.303	6.965	9.925
5	2.015	2.571	3.365	4.032
10	1.812	2.228	2.764	3.169
15	1.753	2.131	2.602	2.947
20	1.725	2.086	2.528	2.845
25	1.708	2.060	2.485	2.787
30	1.697	2.042	2.457	2.750
60	1.671	2.000	2.390	2.660
∞	1.645	1.960	2.326	2.576
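
When the exact degrees of freedom are not tabulated, a common conservative convention is to use the nearest smaller df row, which gives a slightly larger critical value. The sketch below encodes the table that way. The names `T_TABLE` and `critical_value` are hypothetical, and the columns are treated as two-tailed α = 0.10, 0.05, 0.02, and 0.01.

```python
from math import isclose

# Rows copied from the reference table: df -> critical values for
# two-tailed alpha = 0.10, 0.05, 0.02, 0.01
TWO_TAILED_ALPHAS = (0.10, 0.05, 0.02, 0.01)
T_TABLE = {
    1: (6.314, 12.706, 31.821, 63.657),
    2: (2.920, 4.303, 6.965, 9.925),
    5: (2.015, 2.571, 3.365, 4.032),
    10: (1.812, 2.228, 2.764, 3.169),
    15: (1.753, 2.131, 2.602, 2.947),
    20: (1.725, 2.086, 2.528, 2.845),
    25: (1.708, 2.060, 2.485, 2.787),
    30: (1.697, 2.042, 2.457, 2.750),
    60: (1.671, 2.000, 2.390, 2.660),
    float("inf"): (1.645, 1.960, 2.326, 2.576),
}

def critical_value(df, alpha, tails=2):
    """Look up t* using the largest tabulated df not exceeding df."""
    if df < 1:
        raise ValueError("df must be at least 1")
    # A one-tailed alpha uses the column for twice that alpha
    two_tailed_alpha = alpha if tails == 2 else 2 * alpha
    for col, a in enumerate(TWO_TAILED_ALPHAS):
        if isclose(a, two_tailed_alpha):
            break
    else:
        raise ValueError("alpha is not tabulated")
    row_df = max(d for d in T_TABLE if d <= df)
    return T_TABLE[row_df][col]

print(critical_value(37.3, 0.05))          # falls back to the df = 30 row
print(critical_value(11, 0.05, tails=1))   # falls back to the df = 10 row
```

Rounding df down never understates the critical value, so a result that is significant by table lookup is also significant with the exact df.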

Step-by-Step Decision Process

State hypotheses: H0: There is no difference (means are equal). H1: There is a difference (one-tailed or two-tailed).
Check assumptions: Data are approximately normal (or n > 30), observations are independent, continuous data.
Choose the test type: One-sample (compare to known value), two-sample independent (compare two groups), or paired (before/after on same subjects).
Set significance level: Typically alpha = 0.05. Choose before collecting data.
Calculate test statistic: t = (observed difference - hypothesized difference) / standard error.
Find p-value: Compare t-statistic to the t-distribution with appropriate degrees of freedom.
Make decision: If p < alpha, reject H0. If p ≥ alpha, fail to reject H0.
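Report effect size: Calculate Cohen's d = (mean difference) / SD, using the pooled SD for two independent samples and the SD of the differences for paired data. Conventional benchmarks: small 0.2, medium 0.5, large 0.8.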
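
The last two steps of the process reduce to two small rules, sketched below. The function names are assumptions, and the label "below small" for |d| < 0.2 is a choice made here for completeness rather than a standard term.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p < alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p-value must lie between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"

def effect_label(d):
    """Map Cohen's d to the conventional small/medium/large benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"  # label chosen for this sketch

print(decide(0.0783), effect_label(0.63))
```

A p-value exactly equal to alpha falls on the "fail to reject" side, matching the rule stated in the decision step.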

Additional Solved Examples

Example: Paired T-Test for Blood Pressure Medication

A study measures systolic blood pressure before and after a new medication in 12 patients. The mean reduction is 8.5 mmHg with SD of differences = 6.2. Does the medication significantly lower blood pressure at alpha = 0.05?

H0: mean_d = 0 (no reduction)

H1: mean_d > 0 (reduction exists, one-tailed)

t = d_bar / (s_d / sqrt(n)) = 8.5 / (6.2/sqrt(12))

t = 8.5 / 1.790 = 4.749

df = 12 - 1 = 11

t_critical (0.05, 11, one-tail) = 1.796

Since 4.749 > 1.796, reject H0

p-value < 0.001

Cohen's d = 8.5/6.2 = 1.37 (large effect)

Answer: The medication produces a statistically significant reduction in blood pressure (t(11) = 4.75, p < 0.001). The effect size (d = 1.37) indicates a large standardized effect.

Example: Two-Sample T-Test for Teaching Methods

Method A (n=18): mean = 74.2, SD = 8.5. Method B (n=22): mean = 79.8, SD = 9.1. Do the two methods differ significantly at alpha = 0.05 (two-tailed)?

SE = sqrt(8.5^2/18 + 9.1^2/22) = sqrt(4.014 + 3.764) = sqrt(7.778) = 2.789

t = (79.8 - 74.2) / 2.789 = 5.6 / 2.789 = 2.008

Welch df = (7.778)^2 / (4.014^2/17 + 3.764^2/21) = 60.50/1.622 = 37.3

t_critical (0.05, 37, two-tail) = 2.026

Since 2.008 < 2.026, fail to reject H0 (barely)

p-value = 0.052

Answer: The difference is not statistically significant at alpha = 0.05 (t(37) = 2.01, p = 0.052), though it is borderline. With a larger sample, this difference might reach significance. Report the effect size using the pooled SD: d = 5.6/8.84 = 0.63 (medium).
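
The effect size in this example divides the mean difference by the pooled standard deviation. A minimal sketch of that calculation follows; the function names are illustrative assumptions.

```python
from math import sqrt

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled SD: the two sample variances weighted by their df."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d for two independent samples."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)

# Method B versus Method A, values from the example
sp = pooled_sd(9.1, 22, 8.5, 18)
d = cohens_d(79.8, 9.1, 22, 74.2, 8.5, 18)
print(round(sp, 2), round(d, 2))
```

The pooled SD is used only for the effect size here. The test statistic itself still uses Welch's unpooled standard error.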

Interpreting Results

How to Write Up T-Test Results

A proper report includes: test type, t-statistic, degrees of freedom, p-value, confidence interval, and effect size.

Good example: "An independent samples t-test revealed a significant difference between groups, t(38) = 2.45, p = 0.019, 95% CI [0.82, 8.74], d = 0.77."

Poor example: "The p-value was 0.019 so the result is significant." (Missing context, effect size, and CI.)
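
To keep write-ups consistent, the statistical part of the report can be assembled by a small helper. This is one possible sketch; `format_t_report` is a hypothetical name, and the rounding choices (two decimals for t, three for p) are assumptions rather than a fixed standard.

```python
def format_t_report(t, df, p, ci_low, ci_high, d, level=95):
    """Build the 't(df) = ..., p = ..., CI [...], d = ...' part of a report."""
    # Whole-number df prints without decimals; Welch df keeps one decimal
    df_text = f"{df:.0f}" if float(df).is_integer() else f"{df:.1f}"
    # Very small p-values are reported as an inequality
    p_text = "p < 0.001" if p < 0.001 else f"p = {p:.3f}"
    return (
        f"t({df_text}) = {t:.2f}, {p_text}, "
        f"{level}% CI [{ci_low:.2f}, {ci_high:.2f}], d = {d:.2f}"
    )

print(format_t_report(2.45, 38, 0.019, 0.82, 8.74, 0.77))
```

With the numbers from the good example, this prints `t(38) = 2.45, p = 0.019, 95% CI [0.82, 8.74], d = 0.77`.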

Statistical vs Practical Significance

A result can be statistically significant but practically meaningless. With n = 10,000, even a 0.1-point difference in test scores might be "significant" (p < 0.05) but too small to matter educationally. Always report effect size (Cohen's d) alongside the p-value to assess whether the difference is meaningful in context.
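
A quick calculation makes the point concrete. The sketch below assumes 10,000 students per group and a common standard deviation of 3 points. Both the per-group reading of n and the SD are assumptions added for illustration, since the text gives only n and the 0.1-point difference. With roughly 20,000 degrees of freedom, the normal distribution is used in place of the t-distribution.

```python
from math import sqrt
from statistics import NormalDist

n = 10_000   # per group (assumed reading of "n = 10,000")
diff = 0.1   # difference in mean scores, from the text
sd = 3.0     # ASSUMPTION: common SD in both groups, not given in the text

se = sqrt(sd**2 / n + sd**2 / n)   # standard error of the difference
t = diff / se
# Normal approximation to the t-distribution at very large df
p = 2 * (1 - NormalDist().cdf(abs(t)))
d = diff / sd                      # Cohen's d

print(round(t, 3), round(p, 4), round(d, 3))
```

Under these assumptions the p-value falls below 0.05 while Cohen's d is about 0.03, far below the 0.2 benchmark for a small effect.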

Key Takeaways

The t-test compares means when the population standard deviation is unknown (which is nearly always the case in practice).
Use Welch's t-test for two independent samples - it handles unequal variances and is the safer default.
The t-distribution approaches the normal distribution as degrees of freedom increase: at df = 30 the two-tailed 5% critical value is 2.042 versus 1.960 for the normal, and the gap keeps shrinking as df grows.
Always report confidence intervals and effect sizes alongside p-values for complete interpretation.
A non-significant result does not prove the null hypothesis - it means insufficient evidence to reject it.

Frequently Asked Questions

What is a t-test?

A t-test is a statistical hypothesis test used to determine whether there is a significant difference between the means of one or two groups. It uses the t-distribution, which accounts for the extra uncertainty when the population standard deviation is unknown and must be estimated from the sample. The test produces a t-statistic and a p-value that measure how surprising the observed difference would be if the null hypothesis were true.

When should I use a t-test vs a z-test?

Use a t-test when the population standard deviation is unknown (estimated from sample data) or when sample sizes are small (n < 30). Use a z-test when the population standard deviation is known and the sample size is large. In practice, the t-test is almost always preferred because population σ is rarely known. As sample size increases, the t-distribution approaches the standard normal distribution.

What is the difference between one-tailed and two-tailed tests?

A two-tailed test checks for any difference (H₁: μ ≠ μ₀), while a one-tailed test checks for a specific direction (H₁: μ > μ₀ or H₁: μ < μ₀). Two-tailed tests are more conservative: for the same data, the two-tailed p-value is double the one-tailed p-value when the observed difference lies in the hypothesized direction. Use one-tailed only when you have a strong prior reason to test in one direction and would not act on a difference in the other direction.
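
Because the t-distribution is symmetric, a one-tailed p-value can be recovered from a two-tailed one once the sign of t is known. The helper below is a sketch under that assumption; the name `one_tailed_p` and the `alternative` labels are hypothetical.

```python
def one_tailed_p(p_two_tailed, t, alternative="greater"):
    """Convert a two-tailed p-value to a one-tailed one for a symmetric test.

    alternative="greater" tests H1: mean > mu0; "less" tests H1: mean < mu0.
    """
    half = p_two_tailed / 2
    if alternative == "greater":
        # Half the two-tailed p only if t points in the hypothesized direction
        return half if t > 0 else 1 - half
    if alternative == "less":
        return half if t < 0 else 1 - half
    raise ValueError("alternative must be 'greater' or 'less'")

print(one_tailed_p(0.052, 2.008, "greater"))
print(one_tailed_p(0.052, 2.008, "less"))
```

When the observed difference points away from the hypothesized direction, the one-tailed p-value is large (here about 0.974), which is why the direction must be chosen before looking at the data.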

What assumptions does the t-test require?

The t-test assumes: (1) data are continuous, (2) observations are independent, (3) data are approximately normally distributed (less critical for n > 30 due to CLT), and (4) for two-sample tests, groups are independent. Welch's t-test does not require equal variances. Violations of normality can be addressed with non-parametric alternatives such as the Mann-Whitney U test (two independent samples) or the Wilcoxon signed-rank test (paired or one-sample data).

What is Welch's t-test?

Welch's t-test is a modification of the two-sample t-test that does not assume equal variances between groups. It adjusts the degrees of freedom using the Welch-Satterthwaite equation, resulting in a non-integer df. It is generally recommended over the pooled (Student's) t-test because it performs well regardless of whether variances are equal, with minimal loss of power when they are.

How do I interpret the p-value from a t-test?

The p-value is the probability of observing a t-statistic as extreme as (or more extreme than) the one calculated, assuming the null hypothesis is true. If p < α (typically 0.05), reject H₀ and conclude the difference is statistically significant. A small p-value does not indicate a large effect — always report effect size (Cohen's d) and confidence intervals alongside p-values for complete interpretation.

Related Calculators

Z-Test, P-Value, Confidence Interval, ANOVA, Standard Deviation, Hypothesis Testing