Requirements¶
Python dependencies¶
Wespipeline depends on an existing installation of the library Luigi. If this package was installed following the recomended steps, this dependency should be fulfilled.
External dependencies¶
Whole exome sequencing variant calling analysis needs for external programs for doing both the processing, and the analysis and summaries. Following is a list of the different dependencies.
Optionally, some type of database is needed for making use of the persistent storage of executions. In the configuration proposed, is used.
Even though pip packages dependencies are resolved upon installation, third party tools are not. These extra dependencies are not compulsary for all executions of the pipeline, but depend on the parameters and tasks selected.
Each of the dependencies correspond to a specific need in one or more of the steps, and thus are organized in that manner bellow.
Secuence retrieval : Sra Toolkit, Fastqc
Reference genome retrieval : No needed dependency
Secuence alignment : Bwa
Alignment processing : Bwa Samtools,
Variant calling : Freebayes, Varscan, Gatk, Deepvariant
Variant calling evaluation : Vcf tools
Installing through Anaconda distributions¶
Even though most of the programs listed can be installed through various different ways, I encourage the use of the Anaconda Distribution, one of the biggest platforms for installing the tools from well trusted sources. Optionally, Miniconda can be used too for a lighter version of the package manager.
Installing miniconda is a simple task. Following an example installation for a x64 linux machine:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.
bash ~/miniconda.sh -b -p $HOME/miniconda && rm ~/miniconda.sh
export PATH="$HOME/miniconda/bin:$PATH"
Beware that, in order for the utilities and installed packages to be accessible the environment must be activated:
source $HOME/miniconda/bin/activate
The package archive is distributed through different channels, two of which are needed for the installation of these packages. Easier than specifying the channel for each command is adding the channels:
conda config --add channels bioconda
conda config --add channels conda-forge
Installing from the repositories is a simple task doable through one-liner commands. Following is an elaborated list of the installation commands for all of the external depenedencies listed above, and a command for instaling them together:
Installing Samtools
conda install -y samtools
Installing Bwa
conda install -y bwa
Installing Picard
conda install -y picard
Installing Platypus
conda install -y platypus-variant
Installing Varscan
conda install -y varscan
Installing Freebayes
conda install -y freebayes
Installing VCFtools
conda install -y vcftools
Installing Fastqc
conda install -y fastqc
Installing Sra Toolkit
conda install -y sra-tools
Installing all dependencies with a single command:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
bash ~/miniconda.sh -b -p $HOME/miniconda
export PATH="$HOME/miniconda/bin:$PATH"
source $HOME/miniconda/bin/activate && \
conda config --add channels bioconda && \
conda config --add channels conda-forge && \
conda install -y samtools && \
conda install -y bwa && \
conda install -y picard && \
conda install -y platypus-variant && \
conda install -y varscan && \
conda install -y freebayes && \
conda install -y vcftools && \
conda install -y gatk && \
conda install -y vt
rm ~/miniconda.sh