Installation

ksrates is available as a Singularity or Docker container, which bundles ksrates and all required external software dependencies, and as a Python package, which requires manual installation of the package and its dependencies but offers more flexibility when integrating it into existing bioinformatics environments and toolchains. In addition to a simple command-line interface (CLI), we provide a Nextflow pipeline that runs a complete ksrates analysis fully automatically.

ksrates runs on any Linux or macOS system or on Windows with Windows Subsystem for Linux 2 (WSL2) or a virtual machine installed. However, ksrates analyses are computationally demanding, and therefore we recommend the use of a computer cluster (or cloud platform) for any but the simplest data sets.

Singularity and Docker containers

Containers are standalone, portable runtime environments that package everything needed to run a piece of software, including application code, external software dependencies, and operating system libraries and runtime. They can therefore be executed in any computing environment for which the Singularity or Docker container engine is available. This is useful when local installation of ksrates and its software dependencies is not possible, for instance due to permission issues, or when deploying ksrates to a computer cluster or cloud.

Availability and dependencies

Singularity runs natively only on Linux. On Windows it requires either WSL2 (recommended; see Note below) or a virtual machine (VM). On macOS it is available as a beta version; otherwise it requires a VM. Singularity has the advantage over Docker of always producing output files with non-root permissions.

Note

WSL2 (Windows Subsystem for Linux 2) is a native Windows 10 feature that lets you run a GNU/Linux terminal without a VM. It can be installed by following the official documentation.

Docker runs natively on both Linux and Windows, while on macOS it is installed as an application that uses a VM under the hood. On Linux machines, Docker produces output files that require root permissions to handle (e.g. to delete them), which is a problem for users without root permissions. Docker on Windows and macOS does not have this problem because the user has more control over output file permissions.
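
One way to see whether a container run on Linux left behind files you cannot manage is to list everything in an output directory that is not owned by your user. The sketch below is only an illustration: the function name `list_foreign_files` is hypothetical and not part of ksrates, and the directory you pass to it is up to you.

```shell
# List entries under a directory that are NOT owned by the current user.
# Hypothetical helper for illustration; not part of ksrates.
# $1: directory to inspect (defaults to the current directory)
list_foreign_files() {
    dir="${1:-.}"
    # `! -user UID` matches entries owned by anyone other than UID.
    find "$dir" ! -user "$(id -u)" -print
}
```

Calling `list_foreign_files path/to/output` prints nothing when every file belongs to you; any path it prints (for example a root-owned file) is one you may not be able to delete without elevated permissions.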

The table below summarizes relevant differences between Singularity and Docker containers/engines:

Supported (✓) and unsupported (✗) features.

Feature                 Singularity       Docker
Runs on Linux           ✓                 ✓
Runs on Windows         ✓ (WSL2 or VM)    ✓
Runs on macOS           ✓ (beta or VM)    ✓ (VM)
Root privilege needed   ✗                 ✓

Docker

When using the ksrates Docker container, either to run the ksrates CLI or Nextflow pipeline, the machine (i.e. a local computer or a remote computer cluster or cloud node) needs to have Docker installed. More information can be found on the Docker installation page.

When using the ksrates Nextflow pipeline with the ksrates Docker container, the container will be automatically downloaded from the vibpsb/ksrates repository on Docker Hub on first launch (this may take a while depending on your Internet connection speed, since the container is about 1 GB in size) and will then be stored and reused for successive runs.

Manual installation

If you cannot or prefer not to use one of the ksrates containers, for example because you want to integrate the tool into existing bioinformatics environments and toolchains, the ksrates Python package and its dependencies have to be installed manually. The following commands guide you through the installation on a Linux machine. Windows users can carry out the installation with the same commands by using either WSL2 (recommended; see Note below) or a virtual machine (VM) with Linux installed. macOS users can, for example, use Homebrew instead of apt-get.

Note

WSL2 (Windows Subsystem for Linux 2) is a native Windows 10 feature that lets you run a GNU/Linux terminal without a VM. It can be installed by following the official documentation.

  1. Most of the non-Python dependencies can be installed with the following commands:

    sudo apt-get update && sudo apt-get -yq install python3-pip default-jdk build-essential ncbi-blast+ muscle fasttree mcl phyml
    
  2. Install PAML 4.9j from source (for more information see the PAML installation page) to avoid compatibility issues:

    wget http://abacus.gene.ucl.ac.uk/software/paml4.9j.tgz
    tar -xzf paml4.9j.tgz
    cd paml4.9j/src && make -f Makefile
    

    Then make the executable codeml available through the $PATH variable (the downloaded PAML directory can be deleted):

    • Either move codeml to a directory already present in $PATH, e.g. /usr/local/bin:

      sudo mv codeml /usr/local/bin
      
    • Or move codeml to another directory (here assumed to be ~/bin) and add this directory to $PATH, for the Bash shell by copying the following line to the shell initialization file (e.g. .bashrc):

      export PATH=$PATH:~/bin
      
  3. Install i-ADHoRe 3.0 from its GitHub page (required only for collinearity analysis of genome data for the focal species).

  4. Clone the ksrates repository from GitHub and install the package and its Python dependencies:

    git clone https://github.com/VIB-PSB/ksrates
    cd ksrates
    pip3 install .
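
After the steps above, it can help to confirm that the external programs are actually reachable through $PATH. The following is a minimal sketch, not part of ksrates: `check_tools` is a hypothetical helper, and the executable names you pass to it are assumptions that may differ between systems and package versions.

```shell
# Report which of the given executables can be found on $PATH.
# Hypothetical helper for illustration; not part of ksrates.
# Returns 0 if all tools are found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_tools java blastp muscle mcl phyml codeml` prints one line per program and returns a non-zero status if any of them is missing. The names in this call are assumed executable names for the packages installed above; adjust them to match your system.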
    

Testing your installation

Note

WSL2 users can enter the Windows file system from the terminal through e.g. cd /mnt/c/Users/your_username.

  1. Clone the ksrates repository from GitHub to get the use case dataset:

    git clone https://github.com/VIB-PSB/ksrates
    
  2. Access the test directory in a terminal:

    cd ksrates/test
    
  3. Launch ksrates (the execution will take a few minutes):

    nextflow run VIB-PSB/ksrates --config ./config_elaeis.txt
    

    Nextflow will download ksrates and will by default run the test pipeline on the local executor using the ksrates Singularity container, as configured in the included nextflow.config Nextflow configuration file (automatically detected). If needed, please adapt the configuration to the available resources (e.g. available CPUs/cores or switching to a Docker container or no container at all for a local installation) as described in the Nextflow configuration file section.
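
Before adapting the configuration, you may want to know which container engine is available on the machine. The sketch below is an assumption-laden illustration rather than part of ksrates: `pick_engine` is a hypothetical helper that tries Singularity first (the default described above), then Docker, and prints "none" if neither command is on $PATH, in which case a local installation would be needed.

```shell
# Print the first container engine found on $PATH, or "none".
# Hypothetical helper for illustration; not part of ksrates.
pick_engine() {
    for engine in singularity docker; do
        if command -v "$engine" >/dev/null 2>&1; then
            echo "$engine"
            return 0
        fi
    done
    echo "none"
}
```

Running `pick_engine` only checks that the command exists; it does not verify that the engine is correctly configured or that you have permission to use it.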

Updating your installation

  • To update the ksrates Nextflow pipeline to the latest release, run the following command:

    nextflow pull VIB-PSB/ksrates
    
  • To update the Docker container image, run the following command to pull the new image from Docker Hub:

    docker pull vibpsb/ksrates:latest
    
  • To update the Singularity container image, first remove the old image (when using the ksrates Nextflow pipeline the image is stored in the cacheDir directory set in the nextflow.config or, if not set, by default in work/singularity in the project folder):

    rm vibpsb-ksrates-latest.img
    

    The next time the pipeline is launched, Nextflow will automatically pull the new image from Docker Hub.

    Alternatively, run the following command in the same directory as the old image to manually pull the new image from Docker Hub:

    singularity pull vibpsb-ksrates-latest.img docker://vibpsb/ksrates:latest
    
  • To update your manual installation, uninstall the old version of the ksrates package, clone the ksrates repository from GitHub and re-install the package:

    pip3 uninstall ksrates
    git clone https://github.com/VIB-PSB/ksrates
    cd ksrates
    pip3 install .
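
The Singularity image removal described above can be wrapped in a small function so that it works regardless of where the cache directory is. This is a minimal sketch under stated assumptions: `remove_cached_image` is a hypothetical helper, not part of ksrates, and it defaults to the work/singularity directory mentioned above when no directory is given.

```shell
# Remove a cached ksrates Singularity image so that it is pulled again.
# Hypothetical helper for illustration; not part of ksrates.
# $1: cache directory (defaults to work/singularity)
remove_cached_image() {
    cache_dir="${1:-work/singularity}"
    image="$cache_dir/vibpsb-ksrates-latest.img"
    if [ -e "$image" ]; then
        rm -- "$image"
        echo "removed $image"
    else
        echo "no cached image at $image"
    fi
}
```

If your nextflow.config sets cacheDir, pass that directory as the argument, e.g. `remove_cached_image /path/to/cacheDir`.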