Welcome to CDI – Unlocking Microbial Insights
📚 The CDI Learning Path
I DATA EXPLORATION
1 What are the essential tools for microbiome read quality control?
1.1 Explanation
1.2 Shell Code
1.3 R Note
2 How do you obtain example microbiome sequencing data for analysis?
2.1 Explanation
2.2 Shell Code
2.3 Python Note
2.4 R Note
3 How do you process raw sequencing data into a feature table using QIIME2?
3.1 Explanation
3.2 Shell Code (QIIME2 CLI)
3.3 Python Note
4 How do you process raw sequencing data into a feature table using Mothur?
4.1 Explanation
4.2 Shell Code
4.3 R Note
4.4 Python Note
5 How do you explore and summarize a microbiome OTU table?
5.1 Explanation
5.2 Python Code
5.3 R Code
6 How do you filter out low-abundance or low-prevalence OTUs?
6.1 Explanation
6.2 Python Code
6.3 R Code
II DATA VISUALIZATION
7 How do you visualize total OTU abundance per sample?
7.1 Explanation
7.2 Python Code
7.3 R Code
8 How do you create a stacked bar plot of top genera across samples?
8.1 Explanation
8.2 Python Code
8.3 R Code
9 How do you visualize alpha diversity (richness) across groups?
9.1 Explanation
9.2 Python Code
9.3 R Code
10 How do you perform ordination (e.g., PCA) to visualize sample clustering?
10.1 Explanation
10.2 Python Code
10.3 R Code
11 How do you visualize OTU or Genus abundance using a heatmap?
11.1 Explanation
11.2 Python Code
11.3 R Code
III STATISTICAL ANALYSIS
12 How do you statistically compare OTU richness between groups?
12.1 Explanation
12.2 Python Code
12.3 R Code
13 How do you test for correlation between alpha diversity and age?
13.1 Explanation
13.2 Python Code
13.3 R Code
14 How do you compare alpha diversity across 3 or more groups?
14.1 Explanation
14.2 Python Code
14.3 R Code
15 How do you test for differences in community composition using PERMANOVA?
15.1 Explanation
15.2 Python Code
15.3 R Code
16 How do you test for differential abundance of OTUs across groups?
16.1 Explanation
16.2 Python Code
16.3 R Code
IV MACHINE LEARNING
17 How do you prepare microbiome data for machine learning?
17.1 Explanation
17.2 Python Code
17.3 R Note
18 How do you train and evaluate a Random Forest classifier on microbiome data?
18.1 Explanation
18.2 Python Code
18.3 R Code (caret)
19 How do you build a Logistic Regression model for microbiome classification?
19.1 Explanation
19.2 Python Code
19.3 R Code (caret)
20 How do you train a Support Vector Machine (SVM) for microbiome classification?
20.1 Explanation
20.2 Python Code
20.3 R Code (caret)
21 How do you apply Gradient Boosting (XGBoost) for microbiome classification?
21.1 Explanation
21.2 Python Code
21.3 R Code (caret + xgboost)
22 How do you visualize ROC curves to compare classification models?
22.1 Explanation
22.2 Python Code
22.3 R Code (caret + pROC)
23 How do you apply cross-validation strategies to evaluate model reliability?
23.1 Explanation
23.2 Python Code
23.3 R Code (caret with repeated k-fold CV)
24 How do you use mikropml in R for microbiome machine learning?
24.1 Explanation
24.2 R Code
24.3 Notes
APPENDIX
A Microbiome Data Analysis Workflow
Unlocking Microbial Insights
Last updated: June 22, 2025