Friedman Test: Full Guide with Example

The Friedman test is a non-parametric statistical test used to detect differences in treatments across multiple test attempts when the data are related (e.g., repeated measures on the same subjects or matched blocks). It is particularly useful when the assumptions for parametric tests like repeated-measures ANOVA (e.g., normality) are not met, or when the data are ordinal.

Assumptions of the Friedman Test:

Paired or Matched Data: The data must be organized in blocks (or subjects) where each block undergoes all treatment conditions.
Ordinal or Continuous Data: The dependent variable should be measured on an ordinal scale or be continuous.
Three or More Groups/Conditions: The test is designed for comparing three or more related groups. For two related groups, the Wilcoxon Signed-Rank Test is more appropriate.
Independent Blocks: The observations within each block are related, but the blocks themselves must be independent of each other.
No Interaction between Blocks and Treatments: This means that the effect of a treatment is consistent across all blocks.

Hypotheses:

Null Hypothesis ($H_0$): The distributions of the dependent variable are the same across all treatment conditions (i.e., there is no difference between the treatment effects).
Alternative Hypothesis ($H_1$): At least one treatment condition has a different distribution from the others (i.e., at least one treatment tends to yield larger or smaller values than another).

Steps of the Friedman Test:

Step 1: Organize Your Data

Arrange your data in a table where rows represent the subjects (or blocks) and columns represent the different treatment conditions.

Step 2: Rank the Data within Each Block

For each subject (row), rank the observations from lowest to highest across the different treatment conditions (columns). Assign a rank of 1 to the smallest value, 2 to the next smallest, and so on. If there are ties within a row, assign the average of the ranks that would have been assigned.
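The within-block ranking rule, including the average-rank treatment of ties, can be sketched in plain Python. The function name `rank_within_block` is illustrative and not part of any library; this is a minimal sketch for a single row of observations.

```python
def rank_within_block(values):
    """Return 1-based ranks for one block (row), averaging ranks over ties."""
    # Indices of the observations, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` over the run of equal values beginning at `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end would receive ranks start+1..end+1;
        # every tied observation gets the mean of those ranks.
        average_rank = (start + 1 + end + 1) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


print(rank_within_block([7, 5, 4]))  # [3.0, 2.0, 1.0]
print(rank_within_block([6, 6, 3]))  # [2.5, 2.5, 1.0]
```

In the second call the two tied values would have occupied ranks 2 and 3, so each receives 2.5.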

Step 3: Calculate the Sum of Ranks for Each Treatment

Sum the ranks for each treatment condition (column). Let $R_j$ be the sum of ranks for the $j$-th treatment.

Step 4: Calculate the Friedman Test Statistic ($\chi^2_F$)

The traditional Friedman test statistic, often denoted as $\chi^2_F$ or $Q$, is calculated using the following formula:

$$\chi^2_F = \frac{12}{nk(k+1)} \sum_{j=1}^{k} R_j^2 - 3n(k+1)$$

where:
$n$ = number of subjects (blocks)
$k$ = number of treatment conditions
$R_j$ = sum of ranks for the $j$-th treatment

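Under the null hypothesis, this statistic approximately follows a chi-square distribution with $df = k-1$ degrees of freedom. The approximation improves as $n$ grows; for very small $n$ and $k$, exact tables are preferable. The formula as written assumes no ties within blocks; when ties occur, a tie-corrected version of the statistic should be used.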
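Steps 3 and 4 can be combined in a few lines. The sketch below assumes the ranks have already been assigned within each block and that there are no ties; the function name `friedman_statistic` and the small rank table are hypothetical, chosen only for illustration.

```python
def friedman_statistic(rank_table):
    """Friedman chi-square from a table of within-block ranks (rows = blocks)."""
    n = len(rank_table)       # number of blocks (subjects)
    k = len(rank_table[0])    # number of treatment conditions
    # Step 3: column-wise rank sums R_j.
    rank_sums = [sum(row[j] for row in rank_table) for j in range(k)]
    # Step 4: 12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1)
    sum_of_squares = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_of_squares - 3.0 * n * (k + 1)


# Hypothetical ranks for 4 blocks and 3 treatments (rank sums 5, 8, 11).
toy_ranks = [
    [1, 2, 3],
    [1, 3, 2],
    [2, 1, 3],
    [1, 2, 3],
]
print(friedman_statistic(toy_ranks))  # 4.5
```

Here $\sum R_j^2 = 25 + 64 + 121 = 210$, so the statistic is $0.25 \times 210 - 48 = 4.5$.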

Step 5: Calculate the F-statistic ($F_F$) (Conover's F-statistic approximation)

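An F-statistic approximation, commonly attributed to Iman and Davenport and generally less conservative than the chi-square approximation, can also be used. Under the null hypothesis it follows an F-distribution with degrees of freedom $d1 = k-1$ and $d2 = (n-1)(k-1)$.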

$$F_F = \frac{(n-1)\chi^2_F}{n(k-1) - \chi^2_F}$$
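A minimal sketch of this conversion is shown below. The function name `friedman_f_approximation` is an assumption for illustration. One practical detail is worth handling: $\chi^2_F$ reaches its maximum value $n(k-1)$ when every block ranks the treatments identically, and the denominator is then zero.

```python
def friedman_f_approximation(chi2_f, n, k):
    """Return (F_F, d1, d2) for a Friedman chi-square with n blocks and k treatments."""
    d1 = k - 1
    d2 = (n - 1) * (k - 1)
    denominator = n * (k - 1) - chi2_f
    if denominator <= 0:
        # chi2_f equals its maximum n(k - 1): all blocks agree perfectly,
        # so the F statistic is unbounded.
        return float("inf"), d1, d2
    return (n - 1) * chi2_f / denominator, d1, d2


# Illustrative inputs: chi-square of 4.5 from 4 blocks and 3 treatments.
f_f, d1, d2 = friedman_f_approximation(4.5, n=4, k=3)
print(round(f_f, 3), d1, d2)  # 3.857 2 6
```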

Step 6: Determine the Critical Value or P-value

Compare your calculated statistic ($\chi^2_F$ or $F_F$) to the appropriate critical value from a chi-square or F-distribution table, or obtain the p-value using statistical software.

Step 7: Make a Decision

If the calculated test statistic is greater than the critical value, or if the p-value is less than the significance level ($\alpha$), reject the null hypothesis.
Otherwise, fail to reject the null hypothesis.
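When no table is at hand, the chi-square p-value for Steps 6 and 7 can be computed with the standard library alone. The sketch below is one way to do it for integer degrees of freedom, using the closed forms for $df = 1$ and $df = 2$ and a recurrence that adds two degrees of freedom at a time. The names `chi2_sf` and `friedman_decision` are illustrative; in practice a statistics library function such as `scipy.stats.chi2.sf` would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        tail, d = math.exp(-half), 2
    else:
        tail, d = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(x; d + 2) = Q(x; d) + (x/2)^(d/2) * exp(-x/2) / Gamma(d/2 + 1)
    while d < df:
        tail += math.exp((d / 2.0) * math.log(half) - half - math.lgamma(d / 2.0 + 1.0))
        d += 2
    return tail


def friedman_decision(chi2_f, k, alpha=0.05):
    """Return (p_value, reject_h0) using the chi-square approximation with k - 1 df."""
    p_value = chi2_sf(chi2_f, k - 1)
    return p_value, p_value < alpha


p_value, reject = friedman_decision(4.5, k=3)
print(round(p_value, 4), reject)  # 0.1054 False
```

With an illustrative statistic of 4.5 and $k = 3$, the p-value exceeds 0.05, so the null hypothesis is not rejected.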

Step 8: Interpret the Results

If you reject the null hypothesis, it means there is a statistically significant difference between at least two of the treatment conditions. Post-hoc tests are needed to identify which specific pairs differ.

Example: Efficacy of Different Pain Relief Methods

A pharmaceutical company wants to compare the effectiveness of three different pain relief methods (A, B, C) for chronic back pain. They recruit 10 patients, and each patient tries all three methods. Patients rate their pain on a scale of 1 to 10 (1 = no pain, 10 = extreme pain).

Significance Level ($\alpha$): 0.05

Original Patient Pain Ratings:

Patient	Method A	Method B	Method C
1	7	5	4
2	8	6	7
3	6	4	3
4	9	7	6
5	5	3	2
6	7	6	5
7	8	5	4
8	6	5	3
9	7	4	3
10	8	6	4

Step 1: Study Information

Number of Subjects (n): 10
Number of Treatment Conditions (k): 3

Step 2: Ranks within Each Patient (Row)

Patient	Method A (Rank)	Method B (Rank)	Method C (Rank)
1	3	2	1
2	3	1	2
3	3	2	1
4	3	2	1
5	3	2	1
6	3	2	1
7	3	2	1
8	3	2	1
9	3	2	1
10	3	2	1

Step 3: Sum of Ranks for Each Treatment ($R_j$)

Treatment	Sum of Ranks ($R_j$)
Method A	30
Method B	19
Method C	11

Check: $30 + 19 + 11 = 60$. Expected sum: $n \times k(k+1)/2 = 10 \times 3(4)/2 = 60$. (Correct)
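The ranks and rank sums above can be reproduced from the original ratings with a short script. This is a sketch that relies on the fact that no patient gave two methods the same rating, so a simple ranking without tie handling is sufficient; the helper name `rank_row` is illustrative.

```python
# Pain ratings per patient, in the order (Method A, Method B, Method C).
ratings = [
    (7, 5, 4), (8, 6, 7), (6, 4, 3), (9, 7, 6), (5, 3, 2),
    (7, 6, 5), (8, 5, 4), (6, 5, 3), (7, 4, 3), (8, 6, 4),
]


def rank_row(row):
    # Valid only when the row has no ties, which holds for every patient here.
    assert len(set(row)) == len(row), "tied values need average ranks"
    ordered = sorted(row)
    return [ordered.index(value) + 1 for value in row]


ranks = [rank_row(row) for row in ratings]
rank_sums = [sum(column) for column in zip(*ranks)]
n, k = len(ratings), len(ratings[0])

print(rank_sums)                               # [30, 19, 11]
print(sum(rank_sums) == n * k * (k + 1) // 2)  # True
```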

Step 4: Calculate $\chi^2_F$

$$\chi^2_F = \frac{12}{10 \times 3 \times (3+1)} (30^2 + 19^2 + 11^2) - 3 \times 10 \times (3+1)$$

$$\chi^2_F = \frac{12}{120} (900 + 361 + 121) - 120$$

$$\chi^2_F = 0.1 \times (1382) - 120 = 138.2 - 120 = \mathbf{18.2}$$

Step 5: Calculate $F_F$

Degrees of Freedom: $d1 = k-1 = 3-1 = 2$, $d2 = (n-1)(k-1) = (10-1)(3-1) = 9 \times 2 = 18$

$$F_F = \frac{(10-1) \times 18.2}{10(3-1) - 18.2}$$

$$F_F = \frac{9 \times 18.2}{20 - 18.2} = \frac{163.8}{1.8} = \mathbf{91.0}$$
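As an arithmetic check on Steps 4 and 5, both statistics can be recomputed from the rank sums using exact fractions, which avoids floating-point rounding. The variable names are illustrative.

```python
from fractions import Fraction

n, k = 10, 3
rank_sums = [30, 19, 11]

# Step 4: chi-square statistic, kept as an exact fraction.
chi2_f = Fraction(12, n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)

# Step 5: F approximation derived from the chi-square statistic.
f_f = (n - 1) * chi2_f / (n * (k - 1) - chi2_f)

print(chi2_f, float(chi2_f))  # 91/5 18.2
print(f_f)                    # 91
```

For raw data, `scipy.stats.friedmanchisquare` computes the chi-square statistic (with a tie correction) and its p-value directly when each treatment's measurements are passed as a separate argument.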

Steps 6 & 7: Determine P-values and Make a Decision

For $\chi^2_F = 18.2$:
- Degrees of Freedom ($df$): 2
- Critical $\chi^2$ value ($\alpha=0.05, df=2$): 5.991
- Approximate p-value: 0.00011
- Decision: Since $18.2 > 5.991$ (or p-value $0.00011 < 0.05$), we reject the null hypothesis.

For $F_F = 91.0$:
- Degrees of Freedom ($d1, d2$): (2, 18)
- Critical F value ($\alpha=0.05, d1=2, d2=18$): 3.55
- Approximate p-value: < 0.001
- Decision: Since $91.0 > 3.55$ (or p-value $\ll 0.001 < 0.05$), we reject the null hypothesis.
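Because this example has $k = 3$, both reference distributions have simple closed-form tails, so the critical values and p-values above can be checked without tables. The sketch below uses $P(X > x) = e^{-x/2}$ for a chi-square with 2 degrees of freedom and $P(F > f) = (1 + 2f/d2)^{-d2/2}$ for an F-distribution with $d1 = 2$. These shortcuts hold only for those degrees of freedom and are not general formulas.

```python
import math

chi2_f, f_f, d2, alpha = 18.2, 91.0, 18, 0.05

# Chi-square with df = 2: tail probability and critical value.
p_chi2 = math.exp(-chi2_f / 2)
crit_chi2 = -2 * math.log(alpha)

# F with d1 = 2 and d2 = 18: tail probability and critical value.
p_f = (1 + 2 * f_f / d2) ** (-d2 / 2)
crit_f = (d2 / 2) * (alpha ** (-2 / d2) - 1)

print(round(crit_chi2, 3), f"{p_chi2:.5f}")  # 5.991 0.00011
print(round(crit_f, 2), p_f < 0.001)         # 3.55 True
```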

Step 8: Interpret the Results

Based on the Friedman test (p-value $\approx 0.0001$), we reject the null hypothesis. There is a statistically significant difference in pain relief among the three methods (A, B, and C).

Post-Hoc Analysis (Nemenyi's Test Example)

Since the overall test was significant, we conduct Nemenyi's post-hoc test to see which pairs differ.

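Critical Difference (CD) for $\alpha=0.05, k=3, n=10$: $CD = q_{\alpha}\sqrt{\frac{k(k+1)}{6n}} = 2.343 \times \sqrt{\frac{3 \times 4}{6 \times 10}} \approx 1.048$, where $q_{0.05} = 2.343$ is the tabulated Nemenyi critical value for $k=3$ (the Studentized range quantile with infinite degrees of freedom, divided by $\sqrt{2}$).
Mean Ranks: Method A = 3.0, Method B = 1.9, Method C = 1.1

Comparison	Absolute Difference in Mean Ranks	Significance (Difference > CD)
Method A vs. Method B	$|3.0 - 1.9| = 1.1$	Yes
Method A vs. Method C	$|3.0 - 1.1| = 1.9$	Yes
Method B vs. Method C	$|1.9 - 1.1| = 0.8$	No

Conclusion from Post-Hoc: Method A differs significantly from both Method B and Method C, although the A vs. B difference (1.1) only narrowly exceeds the critical difference. Methods B and C do not differ significantly from each other. Because lower ratings mean less pain, Methods B and C each provide significantly more pain relief than Method A.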
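The mean ranks and the pairwise absolute differences listed in the comparison table can be reproduced from the rank sums as follows. This sketch covers only the differences themselves; each one is then compared against the critical difference as described above.

```python
from itertools import combinations

n = 10
rank_sums = {"A": 30, "B": 19, "C": 11}

# Mean rank = rank sum divided by the number of patients.
mean_ranks = {method: total / n for method, total in rank_sums.items()}

# Absolute difference in mean ranks for every pair of methods.
differences = {
    (a, b): round(abs(mean_ranks[a] - mean_ranks[b]), 1)
    for a, b in combinations(mean_ranks, 2)
}

print(mean_ranks)   # {'A': 3.0, 'B': 1.9, 'C': 1.1}
print(differences)  # {('A', 'B'): 1.1, ('A', 'C'): 1.9, ('B', 'C'): 0.8}
```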