Jamie.FitzGerald -a- MTU.ie

Big fan of Kraken2 (+ Bracken!), but the existing Silva 138 database creates issues with assignment to species level (Silva gives each species the taxid from its parent genus - so Kraken2 can only go as far as genus). Greengenes2 requires an even larger degree of wriggling in order to parse the database.
Here we work around those issues for anyone working on 16S assignment:
code & walkthrough: approach and code for making species-level Kraken2 databases (v2024.09).
code only: sparse code for making species-level Kraken2 databases (v138.2).
NB: both Silva and Greengenes2 use taxids that do not match NCBI taxids. The taxonomies (names!) are fine, and if you are not using taxids, this should not matter to you at all. For Silva 138.2, we assign new (fake) taxids to all ranks below genus (i.e. species, strains, subspecies, submarines, substrains). With Greengenes2, this code assigns completely new (fake) taxids to all ranks.
See also a list of other possible issues with this approach, and feel free to list your own, like my concern about reference length…
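To make the taxid reassignment in the NB above a bit more concrete, here is a minimal Python sketch of the general idea: each distinct species gets its own new (fake) taxid, with the existing genus taxid kept as its parent. This is not the linked code. The function name `assign_species_taxids`, the input layout, and the toy taxids are all assumptions made up for illustration, and starting the new ids just above the largest existing taxid is only one possible choice.

```python
from itertools import count


def assign_species_taxids(lineages, genus_taxids, start=None):
    """Give every distinct species its own synthetic taxid.

    lineages     : iterable of (genus_name, species_name) pairs
    genus_taxids : dict mapping genus name -> existing taxid
    start        : first synthetic taxid; defaults to one past the
                   largest existing genus taxid (an assumption)
    Returns a dict: species name -> (new_taxid, parent_genus_taxid)
    """
    if start is None:
        start = max(genus_taxids.values()) + 1
    next_id = count(start)
    species_taxids = {}
    for genus, species in lineages:
        if genus not in genus_taxids:
            continue  # no parent genus to attach the species to, so skip it
        if species not in species_taxids:
            species_taxids[species] = (next(next_id), genus_taxids[genus])
    return species_taxids


# Toy, made-up input: these taxids are illustrative, not real Silva values.
genus_taxids = {"Lactobacillus": 100, "Bifidobacterium": 200}
lineages = [
    ("Lactobacillus", "Lactobacillus acidophilus"),
    ("Lactobacillus", "Lactobacillus gasseri"),
    ("Bifidobacterium", "Bifidobacterium longum"),
    ("Lactobacillus", "Lactobacillus acidophilus"),  # repeat keeps its first taxid
]
result = assign_species_taxids(lineages, genus_taxids)
for name, (taxid, parent) in result.items():
    print(taxid, parent, name, sep="\t")
```

In a real build, the new taxid/parent pairs would still need to be written into whatever taxonomy files the database build expects, and strains and other sub-species ranks would need the same treatment one level further down.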
CLI, Microbial Ecology, and R - CLIMBER guide to Microbial Ecology
Some basic steps in microbial ecology, focusing on the processing of 2nd-gen Illumina fastq data into either amplicon (e.g. 16S) or metagenomic (e.g. shotgun) datasets, followed by ecology-based analysis of the communities and patterns we find in that data. Now with fewer emojis.
Check out climber.
You’d think this would be the first thing to go in, but no.
zCompositions::cMultRepl) in the CLR transform
R ecology stuff
CLI: entrez-direct