Q&A 12 How do you statistically compare OTU richness between groups?
12.1 Explanation
Once you’ve calculated alpha diversity (e.g., observed OTUs per sample), it’s common to test whether groups differ significantly. This can help you determine if an experimental condition affects microbial richness.
Typical tests include: - T-test (for two groups, normally distributed data) - Wilcoxon rank-sum test (non-parametric) - ANOVA or Kruskal-Wallis (for 3+ groups)
Here we demonstrate group comparison for richness using appropriate statistical tests.
12.2 Python Code
import pandas as pd
from scipy.stats import ttest_ind, mannwhitneyu
# Load richness and metadata
otu_df = pd.read_csv("data/otu_table_filtered.tsv", sep="\t", index_col=0)
meta_df = pd.read_csv("data/sample_metadata.tsv", sep="\t")
# Compute richness
richness = pd.DataFrame({
"sample_id": otu_df.columns,
"richness": (otu_df > 0).sum(axis=0).values
})
data = pd.merge(richness, meta_df, on="sample_id")
# Split by group
control = data[data["group"] == "Control"]["richness"]
treatment = data[data["group"] == "Treatment"]["richness"]
# T-test (assumes normality)
t_stat, t_pval = ttest_ind(control, treatment)
# Wilcoxon (non-parametric)
w_stat, w_pval = mannwhitneyu(control, treatment)
print(f"T-test p-value: {t_pval:.4f}")
print(f"Wilcoxon test p-value: {w_pval:.4f}")
12.3 R Code
library(tidyverse)
otu_df <- read.delim("data/otu_table_filtered.tsv", row.names = 1)
meta_df <- read.delim("data/sample_metadata.tsv")
# Calculate richness
richness <- colSums(otu_df > 0)
richness_df <- data.frame(sample_id = names(richness), richness = richness)
merged <- left_join(richness_df, meta_df, by = "sample_id")
# T-test
t_test <- t.test(richness ~ group, data = merged)
# Wilcoxon test
wilcox_test <- wilcox.test(richness ~ group, data = merged)
t_test$p.value
[1] 0.2666018
[1] 0.2447901