Day 3 - Ampliseq analysis with QIIME2

Input files

Dataset

Study: Human gut microbiome viewed across age and geography
Library type: Single End
Samples included: 15
Subset: 10,000 reads per sample
Amplicon: V4

Manifest file

A tab-separated file containing details of the raw reads

For Single End reads

sample-id	absolute-filepath
sample_1	/path/to/the/F_read
sample_2	/path/to/the/F_read

For Paired End reads

sample-id	forward-absolute-filepath	reverse-absolute-filepath
sample_1	/path/to/the/F_read	/path/to/the/R_read
sample_2	/path/to/the/F_read	/path/to/the/R_read
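
Writing the manifest by hand is error-prone for more than a few samples. The following is a minimal sketch for the single-end layout shown above, assuming every read file is named <sample-id>.fastq.gz and sits in one directory; the function name make_manifest and the paths in the usage comment are hypothetical.

```shell
# Build a single-end manifest from every *.fastq.gz file in a directory.
# Assumption: files are named <sample-id>.fastq.gz (hypothetical layout).
make_manifest() {
    reads_dir=$(cd "$1" && pwd)       # the manifest needs absolute paths
    out_file=$2
    printf 'sample-id\tabsolute-filepath\n' > "$out_file"
    for fq in "$reads_dir"/*.fastq.gz; do
        [ -e "$fq" ] || continue      # skip if the glob matched nothing
        sample=$(basename "$fq" .fastq.gz)
        printf '%s\t%s\n' "$sample" "$fq" >> "$out_file"
    done
}

# Usage (hypothetical directory name):
#   make_manifest raw_reads manifest_1
```

Check the resulting file before importing, since any file name that does not follow the assumed pattern ends up with the wrong sample-id.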

Metadata file

Additional details about the samples, in TSV format
Sample names must match those in the manifest file
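
Mismatched sample names are easier to catch before importing. The sketch below compares the first column of two tab-separated files, assuming each has one header line and that any other line starting with # is a comment; the function names are hypothetical.

```shell
# Print the sample IDs (first column), skipping the header and comment lines.
sample_ids() {
    awk -F '\t' 'NR > 1 && $1 !~ /^#/ && $1 != "" {print $1}' "$1" | sort -u
}

# Report IDs that appear in only one of the two files.
# Returns 1 when there is a mismatch, 0 otherwise.
check_sample_ids() {
    ids_a=$(mktemp)
    ids_b=$(mktemp)
    sample_ids "$1" > "$ids_a"
    sample_ids "$2" > "$ids_b"
    mismatches=$(comm -3 "$ids_a" "$ids_b")
    rm -f "$ids_a" "$ids_b"
    if [ -n "$mismatches" ]; then
        printf 'Sample IDs found in only one file:\n%s\n' "$mismatches"
        return 1
    fi
    printf 'All sample IDs match\n'
}

# Usage with the file names from this tutorial:
#   check_sample_ids manifest_1 metadata_1.tsv
```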

Taxonomy Classifier

silva_138_V4_classifier: a V4 region-specific classifier trained on SILVA 138 reference data prepared with RESCRIPt

Importing data

Import the read data into QIIME2 (the reads in this dataset use Phred64 quality scores)

$ mamba activate qiime2
(qiime2)$ qiime tools import --type 'SampleData[SequencesWithQuality]' \
                             --input-path manifest_1 \
                             --output-path 1_umv_import.qza \
                             --input-format SingleEndFastqManifestPhred64V2

For Phred33 reads

(qiime2)$ qiime tools import --type 'SampleData[SequencesWithQuality]' \
                             --input-path manifest_file \
                             --output-path samples_import.qza \
                             --input-format SingleEndFastqManifestPhred33V2

For Paired End reads

(qiime2)$ qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' \
                             --input-path manifest_file \
                             --output-path samples_import.qza \
                             --input-format PairedEndFastqManifestPhred33V2

Importing Phred64 reads takes longer because they are converted to Phred33 in the background.
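
If the quality encoding of a file is unknown, the quality characters give a hint. This is a heuristic sketch for Illumina-style data, not a definitive test: characters with ASCII codes 33 to 58 can only occur in Phred33, characters above K (ASCII 75) are assumed to indicate Phred64, and anything in between is reported as ambiguous. The function name is hypothetical.

```shell
# Guess the quality offset of FASTQ data read from standard input.
# Heuristic only: inspects the quality lines of the first 4000 records.
guess_phred_offset() {
    quals=$(awk 'NR % 4 == 0' | head -n 4000)   # every 4th line is a quality string
    if printf '%s\n' "$quals" | LC_ALL=C grep -q '[!-:]'; then
        echo phred33        # ASCII 33-58 never occurs in Phred64 data
    elif printf '%s\n' "$quals" | LC_ALL=C grep -q '[L-~]'; then
        echo phred64        # ASCII 76 and above is unusual for Illumina Phred33
    else
        echo ambiguous
    fi
}

# Usage (replace the path with a real read file):
#   gzip -cd /path/to/the/F_read | guess_phred_offset
```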

De-multiplexing

De-multiplexing pooled data requires the barcode data
The samples used here are already de-multiplexed, so this step is skipped

Summarize

(qiime2)$ qiime demux summarize --i-data 1_umv_import.qza \
                                --o-visualization 1_umv_summary.qzv
# View summary
(qiime2)$ qiime tools view 1_umv_summary.qzv

Generating ASVs

If required, remove low-quality regions of the sequences, guided by the quality plots in the previous visualization (1_umv_summary.qzv).

Parameters for filtering:

--p-trim-left m (trims off the first m bases of each sequence)
--p-trunc-len n (truncates each sequence at position n)

(qiime2)$ qiime dada2 denoise-single --i-demultiplexed-seqs 1_umv_import.qza \
                                     --p-trunc-len 100 --p-n-threads 6 \
                                     --o-representative-sequences 2_umv_rep_seqs.qza \
                                     --o-table 2_umv_table.qza \
                                     --o-denoising-stats 2_umv_stats.qza

FeatureTable[Frequency]: contains the count (frequency) of each unique sequence in each sample in the dataset (the ASV table).
FeatureData[Sequence]: maps the feature identifiers in the FeatureTable to the sequences they represent (the representative sequences file).

For Paired End samples

(qiime2)$ qiime dada2 denoise-paired --i-demultiplexed-seqs samples_import.qza \
                                     --p-trim-left-f 13 \
                                     --p-trim-left-r 13 \
                                     --p-trunc-len-f 200 \
                                     --p-trunc-len-r 150 \
                                     --o-table table.qza \
                                     --o-representative-sequences rep_seqs.qza \
                                     --o-denoising-stats denoising-stats.qza

Make sure the forward and reverse reads still overlap by at least 20 nucleotides after trimming and truncation.
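
The expected overlap can be estimated before running the denoising step. This sketch assumes the reads do not extend past the end of the amplicon, so the overlap is the forward truncation length plus the reverse truncation length minus the amplicon length; trimming from the left shortens the outer ends of the reads and does not change this figure. The amplicon length of 253 nt is an assumed, typical value for a V4 amplicon, so substitute the length for your own primers.

```shell
# Estimate the overlap (in nucleotides) between forward and reverse reads.
# Arguments: forward truncation length, reverse truncation length, amplicon length.
expected_overlap() {
    trunc_f=$1
    trunc_r=$2
    amplicon_len=$3     # assumption: supplied by the user, not measured here
    echo $(( trunc_f + trunc_r - amplicon_len ))
}

# With the truncation lengths used above and an assumed 253 nt amplicon:
#   expected_overlap 200 150 253    # prints 97
```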

Visualizing the stats output

(qiime2)$ qiime metadata tabulate --m-input-file 2_umv_stats.qza \
                                  --o-visualization 2_umv_stats.qzv
# View the output
(qiime2)$ qiime tools view 2_umv_stats.qzv
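
The same stats can be screened on the command line once exported to a TSV file. The sketch below flags samples that retain less than a chosen percentage of their input reads. It assumes the exported table has a header row with columns named input and non-chimeric, and that the export writes a file called stats.tsv; check both against your own export. The function name and output directory are hypothetical.

```shell
# Print samples keeping less than MIN_PCT percent of their input reads.
# Arguments: stats TSV file, minimum percentage.
# Columns are located by header name ("input" and "non-chimeric").
low_retention() {
    awk -F '\t' -v min_pct="$2" '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        /^#/    { next }                      # skip "#q2:types" style lines
        $(col["input"]) > 0 {
            pct = 100 * $(col["non-chimeric"]) / $(col["input"])
            if (pct < min_pct) printf "%s\t%.1f\n", $1, pct
        }
    ' "$1"
}

# Usage (export first; the output directory name is hypothetical):
#   qiime tools export --input-path 2_umv_stats.qza --output-path exported_stats
#   low_retention exported_stats/stats.tsv 50
```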

Merging different runs or studies of same amplicon

Use the same trimming parameters for every run so that the ASVs are comparable across runs

# Merge feature tables
(qiime2)$ qiime feature-table merge --i-tables table-1.qza \
                                    --i-tables table-2.qza \
                                    --o-merged-table table.qza
# Merge rep_seqs
(qiime2)$ qiime feature-table merge-seqs --i-data rep_seqs-1.qza \
                                         --i-data rep_seqs-2.qza \
                                         --o-merged-data rep_seqs.qza

The stats visualization (2_umv_stats.qzv) gives a per-sample summary of the reads that survived each denoising step.

Summarize

Summarizing feature table and representative sequences

# Feature table
(qiime2)$ qiime feature-table summarize --i-table 2_umv_table.qza \
                                        --o-visualization 2_umv_table.qzv \
                                        --m-sample-metadata-file metadata_1.tsv
# Representative sequences
(qiime2)$ qiime feature-table tabulate-seqs --i-data 2_umv_rep_seqs.qza \
                                            --o-visualization 2_umv_rep_seqs.qzv

# View
(qiime2)$ qiime tools view 2_umv_table.qzv
(qiime2)$ qiime tools view 2_umv_rep_seqs.qzv

Taxonomy

Assign taxonomy to the representative sequences

The classifier needs a lot of temporary space; increase the size of /tmp to 50GB before running it
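
A quick way to see how much space /tmp currently has is to read it from df. This is a minimal sketch; the function name is hypothetical, and the value is reported in whole gigabytes (1024-based).

```shell
# Print the free space, in whole GB, of the filesystem holding a directory.
free_gb() {
    # -P gives one line per filesystem, -k reports 1024-byte blocks;
    # column 4 is the available space.
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1048576 }'
}

# Usage:
#   [ "$(free_gb /tmp)" -ge 50 ] || echo "less than 50 GB free in /tmp"
#
# Assumption: on setups where /tmp cannot be enlarged, tools that honor the
# TMPDIR variable can be pointed at a larger disk instead (hypothetical path):
#   export TMPDIR=/path/to/large/disk
```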

(qiime2)$ qiime feature-classifier classify-sklearn --i-classifier silva138_AB_V4_classifier.qza \
                                                    --i-reads 2_umv_rep_seqs.qza \
                                                    --o-classification 3_umv_taxonomy.qza

View taxonomy output

(qiime2)$ qiime metadata tabulate --m-input-file 3_umv_taxonomy.qza \
                                  --o-visualization 3_umv_taxonomy.qzv
# View
(qiime2)$ qiime tools view 3_umv_taxonomy.qzv

The output lists the taxon assigned to each feature (ASV), along with its confidence.

Taxa bar plot

Visualize the taxonomic composition of the samples with interactive bar plots

(qiime2)$ qiime taxa barplot --i-table 2_umv_table.qza \
                             --i-taxonomy 3_umv_taxonomy.qza \
                             --m-metadata-file metadata_1.tsv \
                             --o-visualization 3_umv_taxa_plots.qzv
# View
(qiime2)$ qiime tools view 3_umv_taxa_plots.qzv

Phylogenetic Tree

(qiime2)$ qiime phylogeny align-to-tree-mafft-fasttree --i-sequences 2_umv_rep_seqs.qza \
                                                       --o-alignment 4_umv_aligned_rep_seqs.qza \
                                                       --o-masked-alignment 4_umv_masked_rep_seqs.qza \
                                                       --o-tree 4_umv_unrooted_tree.qza \
                                                       --o-rooted-tree 4_umv_rooted_tree.qza

Alpha and Beta diversity metrics

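Rarefies the FeatureTable[Frequency] to a user-specified sampling depth before computing the metrics
Parameter to check: --p-sampling-depth, the even sampling (i.e. rarefaction) depth
Choose the depth by checking the feature table summary (2_umv_table.qzv); samples with fewer reads than the chosen depth are dropped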
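
To see how a candidate depth affects sample retention, the per-sample read counts can be checked directly. This sketch assumes a two-column, tab-separated file of sample IDs and read counts with no header line (for example, copied out of the feature table summary); the file name in the usage comment and the function name are hypothetical.

```shell
# Count how many samples have at least DEPTH reads.
# Arguments: two-column TSV (sample-id, read count; no header), depth.
samples_retained() {
    awk -F '\t' -v depth="$2" '
        $2 + 0 >= depth { kept++ }
        END { printf "%d of %d samples kept at depth %d\n", kept, NR, depth }
    ' "$1"
}

# Usage (hypothetical file name):
#   samples_retained sample_counts.tsv 6162
```

A higher depth keeps more reads per sample but drops more samples, so try a few values before settling on one.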

(qiime2)$ qiime diversity core-metrics-phylogenetic --i-phylogeny 4_umv_rooted_tree.qza \
                                                    --i-table 2_umv_table.qza \
                                                    --p-sampling-depth 6162 \
                                                    --m-metadata-file metadata_1.tsv \
                                                    --output-dir 5_umv_core_metrics

Faith’s PD

A qualitative measure of community richness that incorporates phylogenetic relationships between the features

(qiime2)$ qiime diversity alpha-group-significance \
                          --i-alpha-diversity 5_umv_core_metrics/faith_pd_vector.qza \
                          --m-metadata-file metadata_1.tsv \
                          --o-visualization 5_umv_core_metrics/faith_pd_group_sig.qzv

Shannon’s diversity index

A quantitative measure of community richness

(qiime2)$ qiime diversity alpha-group-significance \
                          --i-alpha-diversity 5_umv_core_metrics/shannon_vector.qza \
                          --m-metadata-file metadata_1.tsv \
                          --o-visualization 5_umv_core_metrics/shannon_group_sig.qzv

Evenness

Pielou’s evenness, a measure of community evenness

(qiime2)$ qiime diversity alpha-group-significance \
                          --i-alpha-diversity 5_umv_core_metrics/evenness_vector.qza \
                          --m-metadata-file metadata_1.tsv \
                          --o-visualization 5_umv_core_metrics/evenness_group_sig.qzv
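
To make the two indices concrete, the sketch below computes them for a single sample from a list of feature counts, one per line. It uses the textbook definitions, H = -sum(p_i * log2(p_i)) for Shannon and J = H / log2(S) for Pielou's evenness, where S is the number of observed features. The log base 2 is an assumption here, and the function is an illustration rather than QIIME2's own implementation.

```shell
# Compute Shannon diversity (log base 2) and Pielou evenness
# from feature counts read on standard input, one count per line.
shannon_evenness() {
    awk '
        $1 > 0 { n[++s] = $1; total += $1 }     # keep observed features only
        END {
            for (i = 1; i <= s; i++) {
                p = n[i] / total
                h -= p * log(p) / log(2)
            }
            j = (s > 1) ? h / (log(s) / log(2)) : 0
            printf "shannon=%.4f evenness=%.4f\n", h, j
        }
    '
}

# Four equally abundant features give the maximum evenness:
#   printf '10\n10\n10\n10\n' | shannon_evenness
#   # prints shannon=2.0000 evenness=1.0000
```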

Beta diversity associations

Test whether the distances between samples differ between groups; with --p-pairwise, pairwise tests show which specific pairs of groups differ from one another
Country-based pairwise associations

(qiime2)$ qiime diversity beta-group-significance \
                          --i-distance-matrix 5_umv_core_metrics/unweighted_unifrac_distance_matrix.qza \
                          --m-metadata-file metadata_1.tsv \
                          --m-metadata-column country \
                          --o-visualization 5_umv_core_metrics/unweighted_unifrac_country_pair.qzv \
                          --p-pairwise

Visualizations

(qiime2)$ qiime tools view 5_umv_core_metrics/shannon_group_sig.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/evenness_group_sig.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/faith_pd_group_sig.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/unweighted_unifrac_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/weighted_unifrac_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/jaccard_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/bray_curtis_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/unweighted_unifrac_country_pair.qzv

Alpha Rarefaction

Explores alpha diversity as a function of sampling depth
Computes one or more alpha diversity metrics at multiple sampling depths
Generates 10 rarefied tables at each sampling depth step and computes the diversity metrics for all samples in those tables
Plots the average diversity value for each sample at each even sampling depth, with samples grouped by metadata

Parameter to check: --p-max-depth
(should be chosen from the feature table summary, 2_umv_table.qzv)

(qiime2)$ qiime diversity alpha-rarefaction --i-table 2_umv_table.qza \
                                            --i-phylogeny 4_umv_rooted_tree.qza \
                                            --p-max-depth 6000 \
                                            --m-metadata-file metadata_1.tsv \
                                            --o-visualization 5_umv_core_metrics/alpha_rarefaction.qzv

# View
(qiime2)$ qiime tools view 5_umv_core_metrics/alpha_rarefaction.qzv


Drafted by: Aishwarya Barik
Edited by: Anwesh Maile