CLI, Microbial Ecology, and R
Some basic steps in microbial ecology, focusing on the processing of 2ndGen Illumina fastq data into either amplicon (e.g. 16S) or metagenomic (e.g. shotgun) datasets, followed by ecology-based analysis of the communities and patterns we find in that data.
As above, the tutorial covers the following steps:
- bash and friends
- FastQC & MultiQC
- Trimmomatic
- Bowtie2
- Kraken2 & Bracken (or Kaiju if you like)

We also move through importing output from Kaiju or Kraken2+Bracken into R (bare-bones):
- R - generating a count matrix, taxonomic table, and phyloseq object from metagenomic data

This metagenomic workflow is also present in simple, no-nonsense, raw code (note there might be differences from the complete workflow above).
- raw code only of the metagenomic shotgun assembly - as above, less explanation

The amplicon workflow is forthcoming. Its initial steps (setup, get data, QC) are very similar to the metagenomic workflow in most cases (remember to cut off your primers!), but are followed by a denoising step (DADA2) and optionally an attempt to predict the metabolic capabilities of the communities at hand (PICRUSt2).
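The shared initial steps (pairing up the fastq files, then QC and trimming) could be sketched in bash roughly as follows. This is a minimal dry-run sketch, not the tutorial's own code: the `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` file naming, the helper names `run` and `qc_pipeline`, the `trimmomatic` wrapper command (as a conda install typically provides), the thread count, and the trimming settings are all assumptions chosen for illustration.

```shell
# Sketch of the shared QC steps: FastQC and Trimmomatic per sample, then MultiQC.
# ASSUMPTIONS (not from the tutorial): paired files are named
# <sample>_R1.fastq.gz / <sample>_R2.fastq.gz; tools are on PATH;
# thread count and trimming settings are illustrative only.

# run: print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# qc_pipeline RAW_DIR OUT_DIR ADAPTERS_FASTA   (hypothetical helper)
qc_pipeline() {
  local raw="$1" out="$2" adapters="$3"
  local r1 r2 sample
  run mkdir -p "$out/fastqc" "$out/trimmed"
  for r1 in "$raw"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue              # glob matched nothing
    sample="$(basename "$r1" _R1.fastq.gz)"
    r2="$raw/${sample}_R2.fastq.gz"
    if [ ! -e "$r2" ]; then
      echo "skipping $sample: no R2 file" >&2
      continue
    fi
    # Per-sample read quality reports
    run fastqc -t 4 -o "$out/fastqc" "$r1" "$r2"
    # Adapter and quality trimming in paired-end mode
    run trimmomatic PE -threads 4 "$r1" "$r2" \
      "$out/trimmed/${sample}_R1.paired.fastq.gz" \
      "$out/trimmed/${sample}_R1.unpaired.fastq.gz" \
      "$out/trimmed/${sample}_R2.paired.fastq.gz" \
      "$out/trimmed/${sample}_R2.unpaired.fastq.gz" \
      "ILLUMINACLIP:${adapters}:2:30:10" SLIDINGWINDOW:4:20 MINLEN:50
  done
  # Aggregate all FastQC reports into one summary
  run multiqc "$out/fastqc" -o "$out/multiqc"
}
```

With `DRY_RUN=1` (the default) a call such as `qc_pipeline raw_data qc_out adapters.fa` only prints the commands it would run, which makes it easy to check sample pairing before committing compute time. Setting `DRY_RUN=0` executes them, assuming FastQC, MultiQC and Trimmomatic are installed.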
R
Still to be done. Although it’s an enormous topic, it is also the real magic, and we get to make pictures. Until this section is properly fleshed out, consider instead this comprehensive methods (F1000) paper from DADA2’s Callahan et al., this guide from AstroBioMike (Bioinformatics for beginners), and the steady pace of phyloseq, which is an excellent on-ramp.
This guide to metagenomic analysis continues to be updated (April, 3023 April 5^th^ 3,024!). All (+/-)feedback is welcome: simply throw objects/comments directly at me, or drop us a line at the related repo.
all the best!