Genomics_WS2023
Installing Linux
You may either dual boot your existing system or install Linux in a virtual machine (VirtualBox)
- Dual booting Windows and Linux - Tutorial - A bit complicated for beginners, but worth the hassle
- Installing Ubuntu on VirtualBox - Video - This will be slow. Do not run memory-intensive programs.
If you want to install Anaconda instead of Mambaforge, the above video has instructions.
For Mambaforge installation, proceed with this document.
Mambaforge
An alternative to Anaconda
Download and run the install_mamba.sh script
This will download the Mambaforge installer script, install Mambaforge, and add .condarc to the home directory
Follow the on-screen instructions during installation
Restart the terminal
Check the installation
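A quick way to check the installation from a freshly opened terminal is sketched below. The helper name check_tool is hypothetical (it is not part of install_mamba.sh); it only reports whether a command is on the PATH. The last lines look for the .condarc file that the install script is described as adding.

```shell
# Hypothetical helper (not part of install_mamba.sh): report whether a command is on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: NOT found"
    fi
}

check_tool mamba
check_tool conda

# The install script is described as adding .condarc to the home directory.
if [ -f "$HOME/.condarc" ]; then
    echo ".condarc: present"
else
    echo ".condarc: missing"
fi
```

If mamba is reported as found, running mamba --version prints the installed version.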
Installing Packages
We will be installing 21 packages along with their dependencies in 16 environments
- Download install_chk.txt - A list of environments to check against during installation
- Download install_tools.sh
- Move both files to the home directory and run install_tools.sh
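The move-and-run step can be sketched as follows. This assumes the browser saved the files to ~/Downloads, which may differ on your system; move_to_home is a hypothetical helper, not something provided by the workshop scripts.

```shell
# Hypothetical helper: move a downloaded file into the home directory if it is present.
move_to_home() {
    src="$1"
    if [ -f "$src" ]; then
        mv "$src" "$HOME/" && echo "moved: $(basename "$src")"
    else
        echo "not found: $src"
    fi
}

# Assumption: downloads land in ~/Downloads; override DOWNLOADS if they do not.
DOWNLOADS="${DOWNLOADS:-$HOME/Downloads}"
move_to_home "$DOWNLOADS/install_chk.txt"
move_to_home "$DOWNLOADS/install_tools.sh"

# Run the installer from the home directory only if it is actually there.
if [ -f "$HOME/install_tools.sh" ]; then
    (cd "$HOME" && bash install_tools.sh)
fi
```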
If you want to install the packages manually, open the install_tools.sh file with a text editor (like gedit) and see the corresponding commands for creating each environment and installing the associated tools. For example, if you want to install fastqc and bbduk in the qc environment, you have to run the following command… Or you can create the environment first and install the tools later.
You can search for the available packages at this site
Remember to use mamba instead of conda if you have installed Mambaforge.
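The two manual routes (one step, or create first and install later) might look like the sketch below. These are not the commands from install_tools.sh, which remains the reference. The package name bbmap for bbduk and the channel order are assumptions based on how these tools are usually distributed on bioconda, and the .condarc added during setup may already configure the channels. The run helper is hypothetical and only prints the commands unless DRY_RUN is set to 0.

```shell
# Assumption: DRY_RUN=1 only prints the commands; set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper (not from install_tools.sh): print or execute a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Option 1: create the qc environment and install the tools in one step.
# Assumption: bbduk is installed through the bbmap package on bioconda.
run mamba create -y -n qc -c conda-forge -c bioconda fastqc bbmap

# Option 2 (an alternative to option 1, not a follow-up):
# create an empty environment first, then install the tools into it.
run mamba create -y -n qc
run mamba install -y -n qc -c conda-forge -c bioconda fastqc bbmap
```

Printing the commands first makes it easy to compare them with the ones in install_tools.sh before anything is installed.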
CheckM installation takes more time, so it will be done separately.
- Open another terminal window
- Download install_checkm.sh
- Move install_checkm.sh to the home directory and run it
- Do not run CheckM unless you have at least 40GB of SWAP
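To see whether a machine meets the swap requirement, something like the following sketch can be used. It assumes a Linux system that exposes /proc/meminfo; swap_gb is a hypothetical helper that converts the SwapTotal value (reported in kB) to whole GB.

```shell
# Hypothetical helper: print total swap in whole GB, read from a meminfo-style file.
swap_gb() {
    awk '/^SwapTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

REQUIRED_GB=40   # threshold taken from the note above

if [ -r /proc/meminfo ]; then
    have="$(swap_gb)"
    if [ "${have:-0}" -ge "$REQUIRED_GB" ]; then
        echo "swap: ${have} GB, enough for CheckM"
    else
        echo "swap: ${have:-0} GB, below ${REQUIRED_GB} GB, do not run CheckM"
    fi
else
    echo "/proc/meminfo not readable; check swap another way (e.g. free -h)"
fi
```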
Check the installations
- Download installations_check.sh
- Move installations_check.sh to the home directory and run it
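For a manual cross-check of the environments, a sketch along these lines may help. It is not the content of installations_check.sh. It assumes install_chk.txt holds one environment name per line, which may not match the real file, and missing_envs is a hypothetical helper.

```shell
# Hypothetical helper: print every name in the expected list ($1) that is absent
# from the installed list ($2). Both files hold one environment name per line
# (an assumed format).
missing_envs() {
    grep -Fxv -f "$2" "$1" || true
}

if command -v mamba >/dev/null 2>&1 && [ -f "$HOME/install_chk.txt" ]; then
    installed="$(mktemp)"
    # Keep the first column of each non-comment line of `mamba env list`.
    mamba env list | awk '!/^#/ && NF { print $1 }' > "$installed"
    missing_envs "$HOME/install_chk.txt" "$installed"
    rm -f "$installed"
fi
```

An empty result means every expected environment name was found.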
Datasets
Genomes
Amplicons
- UMV
- Indian
Directory structure
Download ws_jul2023.zip and extract it to your home directory
ws_jul2023
|- ampliseq
    |- raw_reads
    |- manifest_1
    |- manifest_2
    |- metadata_1.tsv
    |- metadata_all.tsv
    |- silva138_AB_V4_classifier.qza
|- bga
    |- raw_reads
    |- ref_genome
    |- bb_adapters.fa
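Extraction and a basic check of the result can be sketched as below. It assumes the zip was saved to the home directory and unpacks into a ws_jul2023 folder; check_ws is a hypothetical helper that only looks for the two top-level folders, ampliseq and bga.

```shell
# Hypothetical helper: check that the top-level workshop folders exist under a directory.
check_ws() {
    base="$1"
    status=0
    for d in ampliseq bga; do
        if [ -d "$base/$d" ]; then
            echo "ok: $d"
        else
            echo "missing: $d"
            status=1
        fi
    done
    return "$status"
}

# Assumption: the archive was saved to the home directory.
ZIP="${ZIP:-$HOME/ws_jul2023.zip}"

# Extract only if the archive is present and unzip is installed.
if [ -f "$ZIP" ] && command -v unzip >/dev/null 2>&1; then
    unzip -q -n "$ZIP" -d "$HOME"    # -n: never overwrite files that already exist
fi

check_ws "$HOME/ws_jul2023" || echo "ws_jul2023 is incomplete or not extracted yet"
```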
Basic Linux Commands
Try this document or search online for Bash Tutorials
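As a starting point, the commands used most often in the workshop-style workflow are shown in the short sketch below. It works inside a throwaway directory created with mktemp, so nothing in the home directory is touched; the file and folder names are only examples.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
workdir="$(mktemp -d)"
cd "$workdir"

pwd                          # print the current directory
mkdir practice               # make a new directory
cd practice                  # move into it
echo "hello" > notes.txt     # write text to a file
cat notes.txt                # show the file contents
cp notes.txt copy.txt        # copy a file
mv copy.txt renamed.txt      # rename (move) a file
ls                           # list files in the current directory
rm renamed.txt               # delete a file
cd ..                        # go up one directory
rm -r practice               # remove the directory and everything in it
```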
Pre-requisites | QC & mapping | de novo | Ampliseq
Scripts made by: Abhishek Khatri
Document prepared by: Anwesh Maile