Installation
ksrates is available as a Singularity or Docker container, which bundles ksrates and all required external software dependencies, and as a Python package, which requires manual installation of the package and its dependencies but allows more flexibility in integrating it into existing bioinformatics environments and toolchains. In addition to an easy-to-use command-line interface (CLI), we also provide a Nextflow pipeline that runs a complete ksrates analysis fully automatically.
ksrates runs on any Linux or macOS system or on Windows with Windows Subsystem for Linux 2 (WSL2) or a virtual machine installed. However, ksrates analyses are computationally demanding, and therefore we recommend the use of a computer cluster (or cloud platform) for any but the simplest data sets.
Nextflow (recommended)
The ksrates Nextflow pipeline makes it easy to execute the individual steps of a ksrates analysis as a single command. Nextflow also makes it easy to configure the execution of the pipeline on a variety of computer clusters (see the Nextflow configuration file section).
To install Nextflow and its dependencies, run the commands below (or follow the official Nextflow installation instructions).
Make sure you have Bash 3.2 (or later) installed.
If you do not have Java installed, install Java 11 or later; on Debian- or Ubuntu-based Linux systems you can for example use:
sudo apt-get install default-jdk
Note
Nextflow versions before 22.01.x-edge require Java version 8 up to 15 for their execution.
Then install Nextflow using either:
wget -qO- https://get.nextflow.io | bash
or:
curl -fsSL https://get.nextflow.io | bash
This creates the nextflow executable file in the current directory.
Note
ksrates’s Nextflow pipeline is written using the older DSL1 syntax, which will be removed after Nextflow version 22.10.x in favor of DSL2. Using Nextflow version 22.03.0-edge or later will require the use of ksrates version v1.1.3 or later. We will do our best to preserve compatibility with future Nextflow versions.
Optionally make the nextflow executable accessible through your $PATH variable, for example by moving it:
sudo mv nextflow /usr/local/bin
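The prerequisites listed above (Bash 3.2 or later, Java 11 or later, and a reachable nextflow executable) can be checked with a short script. The following is only a minimal sketch: java_major and bash_ok are hypothetical helper names, and the parsing assumes the usual `version "X.Y.Z"` banner printed by `java -version`.

```shell
#!/usr/bin/env bash
# Pre-flight check for the Nextflow prerequisites listed above.
# java_major and bash_ok are hypothetical helper names.

# Print the major Java version for strings such as "11.0.2", "17",
# or the legacy "1.8.0_292" scheme (which denotes Java 8).
java_major() {
    local version="$1"
    local first="${version%%.*}"
    if [ "$first" = "1" ]; then
        local rest="${version#1.}"
        echo "${rest%%.*}"
    else
        echo "${first%%[!0-9]*}"
    fi
}

# Succeed when the running Bash is version 3.2 or later.
bash_ok() {
    local major="${BASH_VERSINFO[0]}" minor="${BASH_VERSINFO[1]}"
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 2 ]; }
}

if bash_ok; then
    echo "Bash $BASH_VERSION is recent enough"
else
    echo "Bash $BASH_VERSION is older than 3.2"
fi

if command -v java >/dev/null 2>&1; then
    # java prints its version banner to stderr, e.g.: openjdk version "11.0.2" ...
    raw="$(java -version 2>&1 | head -n 1 | sed -E 's/.*version "([^"]+)".*/\1/')"
    echo "Java major version: $(java_major "$raw")"
else
    echo "Java not found; install Java 11 or later"
fi

if command -v nextflow >/dev/null 2>&1; then
    echo "nextflow found at: $(command -v nextflow)"
else
    echo "nextflow is not in PATH; run it as ./nextflow from the download directory"
fi
```

The script only reports what it finds and does not install or change anything.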
The ksrates Nextflow pipeline itself does not need to be installed: it is automatically downloaded and set up the first time you launch it. The same applies to the ksrates package and its dependencies when using the ksrates Singularity or Docker container (see below). In other words, if you plan to use only the ksrates Nextflow pipeline with a container, you do not need to manually download or install ksrates itself; the only other software you may need to install is Singularity or Docker (next section). Note, however, that the ksrates Nextflow pipeline code is also included in the ksrates GitHub repository and can thus also be executed from a manually installed ksrates package (see the Manual installation section below).
Singularity and Docker containers
Containers are standalone, portable runtime environments that package everything needed to run a piece of software, including application code, external software dependencies, and operating system libraries and runtime, and are thus executable in any computing environment for which the Singularity or Docker container engine is available. This comes in handy, for example, when local installation of ksrates and its software dependencies is not possible, for instance due to permission issues, or when deploying ksrates to a computer cluster or cloud.
Availability and dependencies
Singularity runs natively only on Linux. On Windows it requires either WSL2 (recommended; see Note below) or a virtual machine (VM). On macOS it is available as a beta version; otherwise it also requires a VM. Singularity has the advantage over Docker of always producing output files with non-root permissions.
Note
WSL2 (Windows Subsystem for Linux 2) is a native Windows 10 feature that allows running a GNU/Linux terminal without the use of a VM. It can be installed following the official documentation.
Docker runs natively on both Linux and Windows, while on macOS it can be installed as an application that makes use of a VM under the hood. When working on Linux machines, Docker produces output files that require root permissions to be handled (e.g. to delete them), which is an issue for users who don’t have root permissions. Running Docker on Windows and macOS does not have such problems because the user has more control over output file permissions.
The table below summarizes relevant differences between Singularity and Docker containers/engines:
Feature | Singularity | Docker |
---|---|---|
Runs on Linux | ✓ | ✓ |
Runs on Windows | ✓ (WSL2 or VM) | ✓ |
Runs on macOS | ✓ (beta or VM) | ✓ (VM) |
Root privilege needed | ✗ | ✓ |
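To see which of the two engines is already available on a given machine, a check along the following lines may help. This is a sketch only: pick_engine is a hypothetical helper name, and preferring Singularity when both engines are present simply mirrors the recommendation in this section.

```shell
#!/usr/bin/env bash
# Report which container engine is available, preferring Singularity.
# pick_engine is a hypothetical helper, not part of ksrates or Nextflow.
pick_engine() {
    if command -v singularity >/dev/null 2>&1; then
        echo "singularity"
    elif command -v docker >/dev/null 2>&1; then
        echo "docker"
    else
        echo "none"
    fi
}

echo "Container engine: $(pick_engine)"
```

The check only looks for the executables in $PATH; it does not verify that the engine is correctly configured or that the Docker daemon is running.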
Singularity (recommended)
When using the ksrates Singularity container, either to run the ksrates CLI or Nextflow pipeline, the machine (i.e. a local computer or a remote computer cluster or cloud node) needs to have Singularity installed (ksrates has been tested with version 3.7). More information can be found on the Singularity Quick Start page. For a Linux installation we suggest following the Install from Source section (Install Dependencies, Install Go, Download Singularity from a release and Compile Singularity). For up-to-date and version-specific instructions, please refer to this page.
Note
To allow users to run the pipeline from any directory in a cluster (i.e. not necessarily from their home directory), the user bind control feature needs to be left active during Singularity installation [Default: “YES”].
When using the ksrates Nextflow pipeline with the ksrates Singularity container, the container will be automatically downloaded from the vibpsb/ksrates repository on Docker Hub on first launch (this may take a while depending on your Internet connection speed, since the container is about 1 GB in size) and will then be stored and reused for successive runs.
Docker
When using the ksrates Docker container, either to run the ksrates CLI or Nextflow pipeline, the machine (i.e. a local computer or a remote computer cluster or cloud node) needs to have Docker installed. More information can be found on the Docker installation page.
When using the ksrates Nextflow pipeline with the ksrates Docker container, the container will be automatically downloaded from the vibpsb/ksrates repository on Docker Hub on first launch (this may take a while depending on your Internet connection speed, since the container is about 1 GB in size) and will then be stored and reused for successive runs.
Manual installation
If you do not use or cannot use one of the ksrates containers, for example because you want to integrate the tool into existing bioinformatics environments and toolchains, the ksrates Python package and its dependencies have to be installed manually. The following commands guide you through the installation on a Linux machine. Windows users can carry out the installation with the same commands by using either WSL2 (recommended; see Note below) or a virtual machine (VM) with Linux installed. macOS users can for example use Homebrew instead of apt-get.
Note
WSL2 (Windows Subsystem for Linux 2) is a native Windows 10 feature that allows running a GNU/Linux terminal without the use of a VM. It can be installed following the official documentation.
Most of the non-Python dependencies can be installed with the following commands:
sudo apt-get update && sudo apt-get -yq install python3-pip default-jdk build-essential ncbi-blast+ muscle fasttree mcl phyml
Install PAML 4.9j from source (for more information see PAML installation page) to avoid compatibility issues:
wget http://abacus.gene.ucl.ac.uk/software/paml4.9j.tgz
tar -xzf paml4.9j.tgz
cd paml4.9j/src && make -f Makefile
Then make the codeml executable available through the $PATH variable (the downloaded PAML directory can be deleted afterwards):
Either move codeml to a directory already present in $PATH, e.g. /usr/local/bin:
sudo mv codeml /usr/local/bin
Or move codeml to another directory (here assumed to be ~/bin) and add this directory to $PATH, for the Bash shell by copying the following line to the shell initialization file (e.g. .bashrc):
export PATH=$PATH:~/bin
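If the shell initialization file is sourced more than once, a plain export line appends the same directory to $PATH repeatedly. A guarded variant could look like the following sketch, where add_to_path is a hypothetical helper name and ~/bin is the example directory assumed above.

```shell
#!/usr/bin/env bash
# Append a directory to PATH only if it is not already listed.
# add_to_path is a hypothetical helper, not part of ksrates or PAML.
add_to_path() {
    local dir="$1"
    case ":$PATH:" in
        *":$dir:"*) ;;               # already present, nothing to do
        *) PATH="$PATH:$dir" ;;      # append at the end, as in the line above
    esac
}

add_to_path "$HOME/bin"
export PATH
echo "PATH is now: $PATH"
```

Surrounding $PATH with colons in the case pattern makes the comparison match whole entries only, so a directory such as ~/bin2 is not mistaken for ~/bin.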
Install i-ADHoRe 3.0 from its GitHub page (required only for collinearity analysis of genome data for the focal species).
Clone the ksrates repository from GitHub and install the package and its Python dependencies:
git clone https://github.com/VIB-PSB/ksrates
cd ksrates
pip3 install .
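After a manual installation, a quick check along these lines can show which external tools are still missing from $PATH. The executable names are assumptions based on what the packages usually provide (for example blastp and makeblastdb from ncbi-blast+, i-adhore from i-ADHoRe, and ksrates for the CLI), and missing_tools is a hypothetical helper. FastTree is left out because its executable name differs between installations.

```shell
#!/usr/bin/env bash
# Print every given command name that cannot be found in PATH.
# missing_tools is a hypothetical helper, not part of ksrates.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Executable names are assumptions; adjust them to your installation.
missing="$(missing_tools blastp makeblastdb muscle mcl phyml codeml i-adhore ksrates)"
if [ -n "$missing" ]; then
    echo "Not found in PATH:"
    echo "$missing"
else
    echo "All listed tools were found in PATH."
fi
```

Since i-ADHoRe is required only for collinearity analysis, a missing i-adhore entry is not necessarily a problem.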
Testing your installation
Note
WSL2 users can enter the Windows file system from the terminal through e.g. cd /mnt/c/Users/your_username.
Clone the ksrates repository from GitHub to get the use case dataset:
git clone https://github.com/VIB-PSB/ksrates
Access the test directory in a terminal:
cd ksrates/test
Launch ksrates (the execution will take a few minutes):
nextflow run VIB-PSB/ksrates --config ./config_elaeis.txt
Nextflow will download ksrates and will by default run the test pipeline on the local executor using the ksrates Singularity container, as configured in the included nextflow.config Nextflow configuration file (automatically detected). If needed, please adapt the configuration to the available resources (e.g. available CPUs/cores or switching to a Docker container or no container at all for a local installation) as described in the Nextflow configuration file section.
Updating your installation
To update the ksrates Nextflow pipeline to the latest release, run the following command:
nextflow pull VIB-PSB/ksrates
To update the Docker container image, run the following command to pull the new image from Docker Hub:
docker pull vibpsb/ksrates:latest
To update the Singularity container image, first remove the old image (when using the ksrates Nextflow pipeline the image is stored in the cacheDir directory set in the nextflow.config or, if not set, by default in work/singularity in the project folder):
rm vibpsb-ksrates-latest.img
The next time the pipeline is launched, Nextflow will automatically pull the new image from Docker Hub.
Alternatively, run the following command in the same directory as the old image to manually pull the new image from Docker Hub:
singularity pull vibpsb-ksrates-latest.img docker://vibpsb/ksrates:latest
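Before removing or replacing the old image, it can help to confirm where it is actually stored. The following is a minimal sketch, assuming the default work/singularity location and the image file name used above; find_ksrates_image is a hypothetical helper, and the search directory should be replaced by your cacheDir if one is set.

```shell
#!/usr/bin/env bash
# List cached ksrates Singularity images below a directory.
# find_ksrates_image is a hypothetical helper, not part of ksrates or Nextflow.
find_ksrates_image() {
    local search_dir="${1:-work/singularity}"
    # A missing directory simply yields no results.
    [ -d "$search_dir" ] || return 0
    find "$search_dir" -type f -name 'vibpsb-ksrates-latest.img'
}

# Run from the project folder; pass your cacheDir instead if it is set.
find_ksrates_image "work/singularity"
```

The function only prints matching paths and leaves deletion to an explicit rm, so nothing is removed by accident.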
To update your manual installation, uninstall the old version of the ksrates package, clone the ksrates repository from GitHub and re-install the package:
pip3 uninstall ksrates
git clone https://github.com/VIB-PSB/ksrates
cd ksrates
pip3 install .