Day 3 - Ampliseq analysis with QIIME2
Input files
Dataset
Study: Human gut microbiome viewed across age and geography
Library type: Single End
Samples included: 15
Subset: 10,000 reads per sample
Amplicon: V4
Manifest file
A tab-separated file listing each sample ID and the absolute path to its raw reads
For Single End reads
sample-id | absolute-filepath |
---|---|
sample_1 | /path/to/the/F_read |
sample_2 | /path/to/the/F_read |
For Paired End reads
sample-id | forward-absolute-filepath | reverse-absolute-filepath |
---|---|---|
sample_1 | /path/to/the/F_read | /path/to/the/R_read |
sample_2 | /path/to/the/F_read | /path/to/the/R_read |
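A single-end manifest can also be generated rather than typed by hand. The sketch below assumes every read file sits in one directory and is named <sample-id>.fastq.gz; the function name make_manifest and that naming scheme are illustrative assumptions, not part of the course material.

```shell
# make_manifest DIR: print a single-end manifest (V2 layout) for DIR/*.fastq.gz.
# Assumption: each file is named <sample-id>.fastq.gz; change the suffix if yours differ.
make_manifest() {
    dir="$(cd "$1" && pwd)" || return 1      # resolve DIR to an absolute path
    printf 'sample-id\tabsolute-filepath\n'  # header row expected by the importer
    for f in "$dir"/*.fastq.gz; do
        [ -e "$f" ] || continue              # skip the literal pattern when nothing matches
        printf '%s\t%s\n' "$(basename "$f" .fastq.gz)" "$f"
    done
}
```

For example, make_manifest /path/to/reads > manifest_file writes one row per FASTQ file.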
Metadata file
Additional details about the samples, in TSV format
Sample names must match those in the manifest file
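A quick way to confirm that the two files agree is to compare their first columns. This is a minimal sketch, assuming both files are tab-separated with the sample ID in the first column and a single header row; the helper names ids and check_ids are hypothetical.

```shell
# ids FILE: print the sorted, unique sample IDs from the first tab-separated column.
# The header row and any line starting with '#' are skipped.
ids() {
    awk -F'\t' 'NR > 1 && $1 !~ /^#/ && $1 != "" { print $1 }' "$1" | sort -u
}

# check_ids MANIFEST METADATA: report IDs that appear in only one of the two files.
# Prints nothing when the sample names match.
check_ids() {
    m="$(mktemp)"; d="$(mktemp)"
    ids "$1" > "$m"
    ids "$2" > "$d"
    comm -23 "$m" "$d" | sed 's/^/only in manifest: /'
    comm -13 "$m" "$d" | sed 's/^/only in metadata: /'
    rm -f "$m" "$d"
}
```

Running check_ids manifest_1 metadata_1.tsv should print nothing if every sample is described in both files.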
Taxonomy Classifier
silva_138_V4_classifier: A V4 region specific classifier of Silva DB processed with RESCRIPt
Importing data
Import the read data into the QIIME2 pipeline
$ mamba activate qiime2
(qiime2)$ qiime tools import --type 'SampleData[SequencesWithQuality]' \
--input-path manifest_1 \
--output-path 1_umv_import.qza \
--input-format SingleEndFastqManifestPhred64V2
For Phred33 reads
(qiime2)$ qiime tools import --type 'SampleData[SequencesWithQuality]' \
--input-path manifest_file \
--output-path samples_import.qza \
--input-format SingleEndFastqManifestPhred33V2
For Paired End reads
(qiime2)$ qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path manifest_file \
--output-path samples_import.qza \
--input-format PairedEndFastqManifestPhred33V2
Importing Phred64 reads takes longer because they are converted to Phred33 in the background.
De-multiplexing
De-multiplexing pooled data - Requires barcode data
These samples are already de-multiplexed
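The ASV step refers to a quality summary, 1_umv_summary.qzv, made from the imported reads. One way to produce it is qiime demux summarize, sketched here as a small wrapper so the file names are explicit arguments; the function name summarize_demux is hypothetical.

```shell
# summarize_demux IMPORTED_QZA OUT_QZV: per-sample read counts and quality plots
# for demultiplexed reads. Run inside the activated qiime2 environment.
summarize_demux() {
    qiime demux summarize \
        --i-data "$1" \
        --o-visualization "$2"
}
```

Here summarize_demux 1_umv_import.qza 1_umv_summary.qzv would write the summary, which can then be opened with qiime tools view.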
Generating ASVs
If required, remove low quality regions of the sequences, based on the previous visualization 1_umv_summary.qzv.
Parameters for filtering:
- --p-trim-left m (trims off the first m bases of each sequence)
- --p-trunc-len n (truncates each sequence at position n)
(qiime2)$ qiime dada2 denoise-single --i-demultiplexed-seqs 1_umv_import.qza \
--p-trunc-len 100 --p-n-threads 6 \
--o-representative-sequences 2_umv_rep_seqs.qza \
--o-table 2_umv_table.qza \
--o-denoising-stats 2_umv_stats.qza
FeatureTable[Frequency]: contains the counts (frequencies) of each unique sequence in each sample in the dataset (ASV table).
FeatureData[Sequence]: maps the feature identifiers in the FeatureTable to the sequences they represent (representative sequences file).
For Paired End samples
(qiime2)$ qiime dada2 denoise-paired --i-demultiplexed-seqs samples_import.qza \
--p-trim-left-f 13 \
--p-trim-left-r 13 \
--p-trunc-len-f 200 \
--p-trunc-len-r 150 \
--o-table table.qza \
--o-representative-sequences rep_seqs.qza \
--o-denoising-stats denoising-stats.qza
Make sure at least 20 nucleotides of overlap remain between the forward and reverse reads after trimming.
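The expected overlap can be estimated before denoising. The forward read covers the first trunc-len-f bases of the amplicon and the reverse read covers the last trunc-len-r bases, so the overlap is roughly their sum minus the amplicon length. The sketch below assumes the amplicon length is measured between the two read starts. The 253 nt figure is only an illustrative value, not one given for this dataset.

```shell
# overlap TRUNC_F TRUNC_R AMPLICON_LEN: estimated overlap (nt) of the truncated reads.
# AMPLICON_LEN is an assumption you must supply for your own primers.
overlap() {
    echo $(( $1 + $2 - $3 ))
}

overlap 200 150 253   # prints 97
```

Trimming at the 5' end (--p-trim-left-f, --p-trim-left-r) shortens the reads at their opposite ends, so it should not change this estimate as long as it does not reach into the overlapping region.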
Visualizing the stats output
(qiime2)$ qiime metadata tabulate --m-input-file 2_umv_stats.qza \
--o-visualization 2_umv_stats.qzv
# View the output
(qiime2)$ qiime tools view 2_umv_stats.qzv
The stats visualization gives a per-sample summary of the reads that survived each step of denoising.
Merging different runs or studies of the same amplicon
Use the same trimming parameters for every run
Summarize
Summarizing feature table and representative sequences
# Feature table
(qiime2)$ qiime feature-table summarize --i-table 2_umv_table.qza \
--o-visualization 2_umv_table.qzv \
--m-sample-metadata-file metadata_1.tsv
# Representative sequences
(qiime2)$ qiime feature-table tabulate-seqs --i-data 2_umv_rep_seqs.qza \
--o-visualization 2_umv_rep_seqs.qzv
# View
(qiime2)$ qiime tools view 2_umv_table.qzv
(qiime2)$ qiime tools view 2_umv_rep_seqs.qzv
Taxonomy
Assign taxonomy to the representative sequences
Increase /tmp size to 50GB
(qiime2)$ qiime feature-classifier classify-sklearn --i-classifier silva138_AB_V4_classifier.qza \
--i-reads 2_umv_rep_seqs.qza \
--o-classification 3_umv_taxonomy.qza
Phylogenetic Tree
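The diversity steps that follow read a rooted tree named 4_umv_rooted_tree.qza. One common way to build it from the representative sequences is the align-to-tree-mafft-fasttree pipeline, sketched below as a wrapper function. The function name build_tree and the names of the intermediate outputs (alignment, masked alignment, unrooted tree) are assumptions; only the rooted tree name comes from the later commands.

```shell
# build_tree REP_SEQS_QZA PREFIX: align the sequences with MAFFT, mask the
# alignment, build a tree with FastTree and root it at the midpoint.
# Writes PREFIX_rooted_tree.qza plus three intermediate artifacts.
build_tree() {
    qiime phylogeny align-to-tree-mafft-fasttree \
        --i-sequences "$1" \
        --o-alignment "${2}_aligned_rep_seqs.qza" \
        --o-masked-alignment "${2}_masked_aligned_rep_seqs.qza" \
        --o-tree "${2}_unrooted_tree.qza" \
        --o-rooted-tree "${2}_rooted_tree.qza"
}
```

With these names, build_tree 2_umv_rep_seqs.qza 4_umv would produce 4_umv_rooted_tree.qza.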
Alpha and Beta diversity metrics
Rarefying a FeatureTable[Frequency] to user-specified sampling depth
Parameter to check: --p-sampling-depth: even sampling (i.e. rarefaction) depth
Decide by checking the table.qzv (2_umv_table.qzv) file.
(qiime2)$ qiime diversity core-metrics-phylogenetic --i-phylogeny 4_umv_rooted_tree.qza \
--i-table 2_umv_table.qza \
--p-sampling-depth 6162 \
--m-metadata-file metadata_1.tsv \
--output-dir 5_umv_core_metrics
Faith’s PD
A qualitative measure of community richness that incorporates phylogenetic relationships between the features
Shannon’s diversity index
A quantitative measure of community richness
Evenness
A measure of community evenness
Beta diversity associations
Perform pairwise tests to determine which specific pairs of groups differ from one another
Country based pair-wise associations
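The *_group_sig.qzv and *_country_pair.qzv files opened under Visualizations are not written by core-metrics-phylogenetic itself; they come from separate group-significance commands. A minimal sketch of those commands follows, wrapped in functions so the inputs are explicit. The function names and the metadata column name country are assumptions (the column name is inferred from the heading above); check the column names in metadata_1.tsv.

```shell
# alpha_sig ALPHA_VECTOR_QZA METADATA_TSV OUT_QZV: test whether an alpha
# diversity metric differs between the groups defined in the metadata.
alpha_sig() {
    qiime diversity alpha-group-significance \
        --i-alpha-diversity "$1" \
        --m-metadata-file "$2" \
        --o-visualization "$3"
}

# beta_pairwise DISTANCE_MATRIX_QZA METADATA_TSV COLUMN OUT_QZV: test one
# categorical metadata column, including all pairwise group comparisons.
beta_pairwise() {
    qiime diversity beta-group-significance \
        --i-distance-matrix "$1" \
        --m-metadata-file "$2" \
        --m-metadata-column "$3" \
        --p-pairwise \
        --o-visualization "$4"
}
```

For instance, alpha_sig 5_umv_core_metrics/shannon_vector.qza metadata_1.tsv 5_umv_core_metrics/shannon_group_sig.qzv, and likewise for faith_pd_vector.qza and evenness_vector.qza. The pairwise test would be beta_pairwise 5_umv_core_metrics/unweighted_unifrac_distance_matrix.qza metadata_1.tsv country 5_umv_core_metrics/unweighted_unifrac_country_pair.qzv.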
Visualizations
(qiime2)$ qiime tools view 5_umv_core_metrics/shannon_group_sig.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/evenness_group_sig.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/faith_pd_group_sig.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/unweighted_unifrac_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/weighted_unifrac_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/jaccard_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/bray_curtis_emperor.qzv
(qiime2)$ qiime tools view 5_umv_core_metrics/unweighted_unifrac_country_pair.qzv
Alpha Rarefaction
- Explore alpha diversity as a function of sampling depth
- Computes one or more alpha diversity metrics at multiple sampling depths
- Generates 10 rarefied tables at each sampling depth step & computes diversity metrics for all samples in the tables
- Plots average diversity values for each sample at each even sampling depth and groups samples based on metadata
Parameter to check: --p-max-depth (should be chosen from table.qzv)
(qiime2)$ qiime diversity alpha-rarefaction --i-table 2_umv_table.qza \
--i-phylogeny 4_umv_rooted_tree.qza \
--p-max-depth 6000 \
--m-metadata-file metadata_1.tsv \
--o-visualization 5_umv_core_metrics/alpha_rarefaction.qzv
# View
(qiime2)$ qiime tools view 5_umv_core_metrics/alpha_rarefaction.qzv
Drafted by: Aishwarya Barik
Edited by: Anwesh Maile