Q&A 7 How do you visualize total OTU abundance per sample?
7.1 Explanation
Before diving into deeper microbiome comparisons, it's helpful to visualize the sequencing depth: the total number of OTU counts per sample. This allows you to check:
- Sample variability
- Potential outliers
- Overall library size distribution
Plotting libraries such as ggplot2 in R or seaborn in Python make it straightforward to draw these totals as a clear, readable bar chart.
7.2 Python Code
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load OTU table
otu_df = pd.read_csv("data/otu_table_filtered.tsv", sep="\t", index_col=0)
# Prepare data: sum counts down each column
# (assumes OTUs are rows and samples are columns)
total_counts = otu_df.sum(axis=0).reset_index()
total_counts.columns = ["Sample", "Total_OTUs"]
# Plot
plt.figure(figsize=(10, 5))
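# seaborn >= 0.13 deprecates palette without hue, so map hue to Sample and hide the redundant legend
sns.barplot(data=total_counts, x="Sample", y="Total_OTUs", hue="Sample", palette="viridis", legend=False)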
plt.title("Total OTU Abundance Per Sample")
plt.xticks(rotation=45, ha="right")  # right-align rotated labels under their bars
plt.ylabel("Total OTU Counts")
plt.tight_layout()
plt.show()
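The checks listed in the explanation (sample variability, potential outliers, library size distribution) can also be done numerically alongside the plot. The following is a minimal sketch, not part of the original workflow: the helper `flag_low_depth`, the toy counts, and the cutoff of 10% of the median depth are all illustrative assumptions, and a suitable threshold depends on your study.

```python
from statistics import median


def flag_low_depth(totals, fraction=0.1):
    """Return sample names whose total count is below `fraction` of the median depth.

    `totals` maps sample name -> total OTU count. The default fraction
    is an arbitrary illustrative choice, not a recommended standard.
    """
    cutoff = fraction * median(totals.values())
    return sorted(name for name, count in totals.items() if count < cutoff)


# Toy library sizes, invented purely for illustration
toy_totals = {"S1": 12000, "S2": 15000, "S3": 900, "S4": 13500, "S5": 14200}

# Quick summary of the library size distribution
summary = {
    "min": min(toy_totals.values()),
    "median": median(toy_totals.values()),
    "max": max(toy_totals.values()),
}
low_depth = flag_low_depth(toy_totals)

print(summary)
print("Low-depth samples:", low_depth)
```

With the real table, the same helper could be called as `flag_low_depth(otu_df.sum(axis=0).to_dict())`, again assuming samples are the columns of `otu_df`.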
7.3 R Code
library(tidyverse)
# check.names = FALSE keeps sample names exactly as written in the file
otu_df <- read.delim("data/otu_table_filtered.tsv", row.names = 1, check.names = FALSE)
otu_long <- colSums(otu_df) %>%
  enframe(name = "Sample", value = "Total_OTUs")
ggplot(otu_long, aes(x = Sample, y = Total_OTUs)) +
  geom_col(fill = "#0073C2FF") +
  labs(title = "Total OTU Abundance Per Sample", y = "Total OTU Counts") +
  theme_minimal(base_size = 13) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))