Day 3 - Ampliseq analysis with QIIME2

Input files

Dataset

Study: Human gut microbiome viewed across age and geography
Library type: Single End
Samples included: 15
Subset: 10,000 reads per sample
Amplicon: V4

Manifest file

A tab-separated file containing details of the raw reads

For Single End reads

sample-id absolute-filepath
sample_1 /path/to/the/F_read
sample_2 /path/to/the/F_read

For Paired End reads

sample-id forward-absolute-filepath reverse-absolute-filepath
sample_1 /path/to/the/F_read /path/to/the/R_read
sample_2 /path/to/the/F_read /path/to/the/R_read
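A manifest like the single-end one above can be generated rather than typed by hand. The following is a minimal sketch, assuming a hypothetical layout in which every sample has one file named <sample-id>.fastq.gz in a single directory; the function name make_manifest and the file naming are assumptions, not part of QIIME2.

```shell
# Sketch: print a single-end manifest (tab-separated) for a directory of reads.
# Assumes files are named <sample-id>.fastq.gz (hypothetical naming).
make_manifest() {
    reads_dir=$1
    printf 'sample-id\tabsolute-filepath\n'
    for f in "$reads_dir"/*.fastq.gz; do
        [ -e "$f" ] || continue                  # skip if the glob matched nothing
        sample=$(basename "$f" .fastq.gz)        # sample id = file name without suffix
        abs_dir=$(cd "$(dirname "$f")" && pwd)   # resolve to an absolute directory
        printf '%s\t%s\n' "$sample" "$abs_dir/$(basename "$f")"
    done
}
```

For example, `make_manifest raw_reads > manifest_file` would write the manifest for a directory called raw_reads.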

Metadata file

A TSV file with additional details about each sample.
Sample names must match those in the manifest file.
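The name matching can be checked before importing. This is a rough sketch, assuming both files are tab-separated with a header line and the sample ID in the first column; the function name missing_in_metadata is hypothetical.

```shell
# Sketch: list sample ids that appear in the manifest but not in the metadata.
# Assumes tab-separated files with one header line; metadata lines starting
# with '#' (e.g. a '#q2:types' row) are ignored.
missing_in_metadata() {
    manifest=$1
    metadata=$2
    awk -F'\t' '
        NR == FNR { if (FNR > 1 && $1 !~ /^#/) seen[$1] = 1; next }  # first file: metadata
        FNR > 1 && !($1 in seen) { print $1 }                        # second file: manifest
    ' "$metadata" "$manifest"
}
```

An empty output from `missing_in_metadata manifest_1 metadata_1.tsv` means every sample in the manifest has a metadata row.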

Taxonomy Classifier

silva138_AB_V4_classifier.qza: a V4-region-specific classifier built from the SILVA 138 database and processed with RESCRIPt

Importing data

Import read data into QIIME2 pipeline

$ mamba activate qiime2
(qiime2)$ qiime tools import --type 'SampleData[SequencesWithQuality]' \
                             --input-path manifest_1 \
                             --output-path 1_umv_import.qza \
                             --input-format SingleEndFastqManifestPhred64V2

For Phred33 reads

(qiime2)$ qiime tools import --type 'SampleData[SequencesWithQuality]' \
                             --input-path manifest_file \
                             --output-path samples_import.qza \
                             --input-format SingleEndFastqManifestPhred33V2

For Paired End reads

(qiime2)$ qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' \
                             --input-path manifest_file \
                             --output-path samples_import.qza \
                             --input-format PairedEndFastqManifestPhred33V2

Importing Phred64 reads takes longer because they are converted to Phred33 in the background.
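If it is unclear which encoding a file uses, the quality lines can be inspected. The sketch below relies on one fact about the encodings: characters below ';' (ASCII 59) can only occur in Phred33 data. The function name guess_phred is hypothetical, and the heuristic is not a substitute for checking the sequencing provider's documentation.

```shell
# Sketch: guess the quality encoding of FASTQ records read from stdin.
# Any quality character in the ASCII range '!'..':' (33-58) implies Phred33;
# otherwise the file may be Phred64 (or Phred33 with uniformly high scores).
guess_phred() {
    if awk 'NR % 4 == 0' | LC_ALL=C grep -q '[!-:]'; then   # every 4th line is quality
        echo "phred33"
    else
        echo "possibly phred64"
    fi
}
```

For compressed reads, `gzip -dc /path/to/the/F_read | guess_phred` feeds the same function.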

De-multiplexing

De-multiplexing pooled data requires barcode data.
The samples used here are already de-multiplexed, so this step is skipped.

Summarize

(qiime2)$ qiime demux summarize --i-data 1_umv_import.qza \
                                --o-visualization 1_umv_summary.qzv
# View summary
(qiime2)$ qiime tools view 1_umv_summary.qzv

Generating ASVs

If required, remove low quality regions of the sequences, based on the previous visualization 1_umv_summary.qzv.

Parameters for filtering:

  1. --p-trim-left m (trims off the first m bases of each sequence)
  2. --p-trunc-len n (truncates each sequence at position n)

(qiime2)$ qiime dada2 denoise-single --i-demultiplexed-seqs 1_umv_import.qza \
                                     --p-trunc-len 100 --p-n-threads 6 \
                                     --o-representative-sequences 2_umv_rep_seqs.qza \
                                     --o-table 2_umv_table.qza \
                                     --o-denoising-stats 2_umv_stats.qza

FeatureTable[Frequency]: contains count or frequencies of each unique sequence in each sample in the dataset (ASV table)
FeatureData[Sequence]: maps feature identifiers in the FeatureTable to the sequences they represent (Representative sequences file).

For Paired End samples

(qiime2)$ qiime dada2 denoise-paired --i-demultiplexed-seqs samples_import.qza \
                                     --p-trim-left-f 13 \
                                     --p-trim-left-r 13 \
                                     --p-trunc-len-f 200 \
                                     --p-trunc-len-r 150 \
                                     --o-table table.qza \
                                     --o-representative-sequences rep_seqs.qza \
                                     --o-denoising-stats denoising-stats.qza

Make sure the forward and reverse reads still overlap by at least 20 nucleotides after truncation; otherwise they may fail to merge.
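The expected overlap follows from simple arithmetic: forward truncation length plus reverse truncation length minus the amplicon length as sequenced. A small sketch follows; the function name expected_overlap is hypothetical, and the amplicon length must be supplied for your own primer set (the 290 in the example is only a placeholder, not a value from this dataset).

```shell
# Sketch: overlap (in nt) left between truncated forward and reverse reads.
# Arguments: forward trunc length, reverse trunc length, amplicon length
# (amplicon length measured over the region the reads cover; user-supplied).
expected_overlap() {
    echo $(( $1 + $2 - $3 ))
}
```

For instance, `expected_overlap 200 150 290` prints 60. Note that --p-trim-left-f and --p-trim-left-r remove bases from the read starts and do not change this overlap.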

Visualizing the stats output

The stats file gives a per-sample summary of the reads that survived each step of denoising.

(qiime2)$ qiime metadata tabulate --m-input-file 2_umv_stats.qza \
                                  --o-visualization 2_umv_stats.qzv
# View the output
(qiime2)$ qiime tools view 2_umv_stats.qzv

Merging different runs or studies of same amplicon

Use the same trimming and truncation parameters for every run so that the resulting ASVs are comparable.

# Merge feature tables
(qiime2)$ qiime feature-table merge --i-tables table-1.qza \
                                    --i-tables table-2.qza \
                                    --o-merged-table table.qza
# Merge rep_seqs
(qiime2)$ qiime feature-table merge-seqs --i-data rep_seqs-1.qza \
                                         --i-data rep_seqs-2.qza \
                                         --o-merged-data rep_seqs.qza

Summarize

Summarizing feature table and representative sequences

# Feature table
(qiime2)$ qiime feature-table summarize --i-table 2_umv_table.qza \
                                        --o-visualization 2_umv_table.qzv \
                                        --m-sample-metadata-file metadata_1.tsv
# representative sequences
(qiime2)$ qiime feature-table tabulate-seqs --i-data 2_umv_rep_seqs.qza \
                                            --o-visualization 2_umv_rep_seqs.qzv

# View
(qiime2)$ qiime tools view 2_umv_table.qzv
(qiime2)$ qiime tools view 2_umv_rep_seqs.qzv

Taxonomy

Assign taxonomy to the representative sequences

The classifier needs a lot of temporary space: make sure about 50 GB is available in /tmp before running it.
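The free space can be checked before starting the classification. A minimal sketch, assuming a POSIX df; the function name tmp_free_gb is hypothetical, and it falls back to $TMPDIR or /tmp when no directory is given.

```shell
# Sketch: print the free space (whole GB) on the filesystem holding a directory.
# Defaults to $TMPDIR, then /tmp, when called without an argument.
tmp_free_gb() {
    dir=${1:-${TMPDIR:-/tmp}}
    # df -P prints one line per filesystem; column 4 is available space in KB
    df -Pk "$dir" | awk 'NR == 2 { printf "%d\n", $4 / 1048576 }'
}
```

If `tmp_free_gb` reports much less than 50, one option is to point TMPDIR at a larger disk (for example `export TMPDIR=/path/to/large/disk`) before running the classifier.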

(qiime2)$ qiime feature-classifier classify-sklearn --i-classifier silva138_AB_V4_classifier.qza \
                                                    --i-reads 2_umv_rep_seqs.qza \
                                                    --o-classification 3_umv_taxonomy.qza

View taxonomy output

(qiime2)$ qiime metadata tabulate --m-input-file 3_umv_taxonomy.qza \
                                  --o-visualization 3_umv_taxonomy.qzv
# View
(qiime2)$ qiime tools view 3_umv_taxonomy.qzv

The output lists the taxonomic assignment, with its confidence, for each feature (ASV).

Taxa bar plot

To visualize taxonomic composition with interactive bar plots

(qiime2)$ qiime taxa barplot --i-table 2_umv_table.qza \
                             --i-taxonomy 3_umv_taxonomy.qza \
                             --m-metadata-file metadata_1.tsv \
                             --o-visualization 3_umv_taxa_plots.qzv
# View
(qiime2)$ qiime tools view 3_umv_taxa_plots.qzv

Phylogenetic Tree

(qiime2)$ qiime phylogeny align-to-tree-mafft-fasttree --i-sequences 2_umv_rep_seqs.qza \
                                                       --o-alignment 4_umv_aligned_rep_seqs.qza \
                                                       --o-masked-alignment 4_umv_masked_rep_seqs.qza \
                                                       --o-tree 4_umv_unrooted_tree.qza \
                                                       --o-rooted-tree 4_umv_rooted_tree.qza

Alpha and Beta diversity metrics

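The core metrics are computed on a FeatureTable[Frequency] rarefied to a user-specified sampling depth.
Parameter to check: --p-sampling-depth, the even sampling (i.e. rarefaction) depth. Samples with fewer reads than this depth are dropped.
Choose the value by checking the feature table summary (2_umv_table.qzv).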
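To see what a candidate depth costs, count how many samples would survive it. The sketch below assumes a hypothetical two-column, tab-separated file of sample IDs and per-sample read counts without a header (for example, copied out of the feature table summary); the function name samples_retained and the file are assumptions.

```shell
# Sketch: report how many samples have at least <depth> reads.
# Input: tab-separated file, column 1 = sample id, column 2 = read count,
# no header line (hypothetical file prepared by the user).
samples_retained() {
    counts=$1
    depth=$2
    awk -F'\t' -v d="$depth" '
        $2 + 0 >= d { kept++ }
        END { printf "%d of %d samples retained at depth %d\n", kept, NR, d }
    ' "$counts"
}
```

Running it for a few candidate depths makes the trade-off between reads per sample and number of samples explicit.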

(qiime2)$ qiime diversity core-metrics-phylogenetic --i-phylogeny 4_umv_rooted_tree.qza \
                                                    --i-table 2_umv_table.qza \
                                                    --p-sampling-depth 6162 \
                                                    --m-metadata-file metadata_1.tsv \
                                                    --output-dir 5_umv_core_metrics

Faith’s PD

A qualitative measure of community richness that incorporates phylogenetic relationships between the features

(qiime2)$ qiime diversity alpha-group-significance \
                          --i-alpha-diversity 5_umv_core_metrics/faith_pd_vector.qza \
                          --m-metadata-file metadata_1.tsv \
                          --o-visualization 5_umv_core_metrics/faith_pd_group_sig.qzv

Shannon’s diversity index

(qiime2)$ qiime diversity alpha-group-significance \
                          --i-alpha-diversity 5_umv_core_metrics/shannon_vector.qza \
                          --m-metadata-file metadata_1.tsv \
                          --o-visualization 5_umv_core_metrics/shannon_group_sig.qzv
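For intuition about the values being compared, Shannon's index for one sample is H = -sum(p_i * log(p_i)), where p_i is the proportion of reads belonging to feature i. The sketch below computes it from a list of counts; the function name shannon_index is hypothetical, and the default log base of 2 is an assumption (the base is a convention and may differ between tools and versions).

```shell
# Sketch: Shannon index for one sample, reading one feature count per line
# from stdin. Optional argument: log base (default 2, an assumed convention).
shannon_index() {
    awk -v base="${1:-2}" '
        $1 > 0 { n[NR] = $1; total += $1 }    # keep non-zero counts only
        END {
            for (i in n) {
                p = n[i] / total              # proportion of feature i
                h -= p * log(p) / log(base)
            }
            printf "%.4f\n", h
        }'
}
```

Four equally abundant features give `printf '1\n1\n1\n1\n' | shannon_index` = 2.0000, the maximum for four features in base 2; uneven counts give lower values.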

Evenness

A measure of community evenness

(qiime2)$ qiime diversity alpha-group-significance \
                          --i-alpha-diversity 5_umv_core_metrics/evenness_vector.qza \
                          --m-metadata-file metadata_1.tsv \
                          --o-visualization 5_umv_core_metrics/evenness_group_sig.qzv

Beta diversity associations

Perform pairwise tests to determine which specific pairs of groups differ from one another.
Country-based pairwise associations:

(qiime2)$ qiime diversity beta-group-significance \
                          --i-distance-matrix 5_umv_core_metrics/unweighted_unifrac_distance_matrix.qza \
                          --m-metadata-file metadata_1.tsv \
                          --m-metadata-column country \
                          --o-visualization 5_umv_core_metrics/unweighted_unifrac_country_pair.qzv \
                          --p-pairwise

Visualizations

(qiime2)$ qiime tools view 5_umv_core_metrics/shannon_group_sig.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/evenness_group_sig.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/faith_pd_group_sig.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/unweighted_unifrac_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/weighted_unifrac_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/jaccard_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/bray_curtis_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/unweighted_unifrac_country_pair.qzv

Alpha Rarefaction

  1. Explore alpha diversity as a function of sampling depth
  2. Computes one or more alpha diversity metrics at multiple sampling depths
  3. Generates 10 rarefied tables at each sampling depth step & computes diversity metrics for all samples in the tables
  4. Plots average diversity values for each sample at each even sampling depth and groups samples based on metadata

Parameter to check: --p-max-depth
(should be chosen from the feature table summary, 2_umv_table.qzv)

(qiime2)$ qiime diversity alpha-rarefaction --i-table 2_umv_table.qza \
                                            --i-phylogeny 4_umv_rooted_tree.qza \
                                            --p-max-depth 6000 \
                                            --m-metadata-file metadata_1.tsv \
                                            --o-visualization 5_umv_core_metrics/alpha_rarefaction.qzv

# View
(qiime2)$ qiime tools view 5_umv_core_metrics/alpha_rarefaction.qzv




Drafted by: Aishwarya Barik
Edited by: Anwesh Maile