Installation

ksrates is available as an Apptainer (formerly Singularity) or Docker container, which bundles ksrates and all required external software dependencies, and as a Python package, which requires manual installation of the package and its dependencies but allows more flexibility when integrating it into existing bioinformatics environments and toolchains. In addition to a simple command-line interface (CLI), we provide a Nextflow pipeline that runs a complete ksrates analysis fully automatically.

ksrates runs on any Linux or macOS system or on Windows with Windows Subsystem for Linux 2 (WSL2) or a virtual machine installed. However, ksrates analyses are computationally demanding, and therefore we recommend the use of a computer cluster (or cloud platform) for any but the simplest data sets.

Apptainer and Docker containers

Containers are standalone, portable runtime environments that package everything needed to run a piece of software, including application code, external software dependencies, and operating system libraries and runtime. They are thus executable in any computing environment for which the Apptainer or Docker container engine is available. This is useful, for example, when local installation of ksrates and its software dependencies is not possible (for instance due to permission issues), or when deploying ksrates to a computer cluster or cloud.

Availability and dependencies

Apptainer runs natively only on Linux. On Windows it requires either WSL2 (recommended; see Note below) or a virtual machine (VM). On macOS it is available as a beta version; otherwise it also requires a VM. Apptainer has the advantage over Docker of always producing output files with non-root permissions.

Note

WSL2 (Windows Subsystem for Linux 2) is a native Windows 10 feature that allows running a GNU/Linux terminal without the use of a VM. It can be installed following the official documentation.

Docker runs natively on both Linux and Windows, while on macOS it can be installed as an application that makes use of a VM under the hood. On Linux machines, Docker produces output files that require root permissions to be handled (e.g. to delete them), which is an issue for users who don’t have root permissions. Running Docker on Windows and macOS does not have this problem because the user has more control over output file permissions.
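To see whether a container run left behind files that the current user does not own, the output directory can be scanned with find. The following is a minimal sketch; the helper name foreign_owned_files is hypothetical and not part of ksrates.

```shell
#!/usr/bin/env bash
# List files under a directory that are NOT owned by the current user.
# `foreign_owned_files` is a hypothetical helper, not part of ksrates.
foreign_owned_files() {
    local dir=${1:-.}
    # `id -u` prints the numeric user ID; `! -user` negates the ownership match.
    find "$dir" ! -user "$(id -u)" -print
}
```

Calling foreign_owned_files with the directory that holds the analysis output prints nothing when every file belongs to the current user.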

The table below summarizes relevant differences between Apptainer and Docker containers/engines:

Supported (✓) and unsupported (✗) features.

Feature                  Apptainer         Docker
Runs on Linux            ✓                 ✓
Runs on Windows          ✓ (WSL2 or VM)    ✓
Runs on macOS            ✓ (beta or VM)    ✓ (VM)
Root privilege needed    ✗                 ✓

Docker

When using the ksrates Docker container, either to run the ksrates CLI or Nextflow pipeline, the machine (i.e. a local computer or a remote computer cluster or cloud node) needs to have Docker installed. More information can be found on the Docker installation page.

When using the ksrates Nextflow pipeline with the ksrates Docker container, the container will be automatically downloaded from the vibpsb/ksrates repository on Docker Hub on first launch (this may take a while depending on your Internet connection speed since the container has a size of about 1 GB) and will then be stored and reused for successive runs.
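Before launching ksrates with a container, it can help to check which container engine is actually available on the machine. The following is a minimal sketch; the helper name engine_available is hypothetical, and the executable names apptainer and docker are assumed to be the ones on your PATH.

```shell
#!/usr/bin/env bash
# Report which container engines are available on this machine.
# `engine_available` is a hypothetical helper, not part of ksrates.
engine_available() {
    command -v "$1" >/dev/null 2>&1
}

for engine in apptainer docker; do
    if engine_available "$engine"; then
        echo "$engine: found at $(command -v "$engine")"
    else
        echo "$engine: not found"
    fi
done
```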

Manual installation

If you cannot or prefer not to use one of the ksrates containers, for example because you want to integrate the tool into an existing bioinformatics environment or toolchain, the ksrates Python package and its dependencies must be installed manually. The following commands guide you through the installation on a Linux machine. Windows users can carry out the installation with the same commands by using either WSL2 (recommended; see Note below) or a virtual machine (VM) with Linux installed. macOS users can for example use Homebrew instead of apt-get.

Note

WSL2 (Windows Subsystem for Linux 2) is a native Windows 10 feature that allows running a GNU/Linux terminal without the use of a VM. It can be installed following the official documentation.

  1. Most of the non-Python dependencies can be installed with the following commands:

    sudo apt-get update && sudo apt-get -yq install python3-pip default-jdk build-essential ncbi-blast+ muscle fasttree mcl phyml
    

    Note

    The workflow was developed using BLAST 2.6.0+ and might raise errors with much newer versions such as 2.12.0+ (see GitHub issue #48). Moreover, it is only compatible with MUSCLE v3 (v3.8.31) and not with v5 (see GitHub issue #41).

  2. Install PAML 4.9j from source (for more information see PAML installation page) to avoid compatibility issues:

    wget http://abacus.gene.ucl.ac.uk/software/paml4.9j.tgz
    tar -xzf paml4.9j.tgz
    cd paml4.9j/src && make -f Makefile
    

    Then make the executable codeml available through the $PATH variable (the downloaded PAML directory can be deleted):

    • Either move codeml to a directory already present in $PATH, e.g. /usr/local/bin:

      sudo mv codeml /usr/local/bin
      
    • Or move codeml to another directory (here assumed to be ~/bin) and add this directory to $PATH, for the Bash shell by copying the following line to the shell initialization file (e.g. .bashrc):

      export PATH=$PATH:~/bin
      
  3. Install i-ADHoRe 3.0 from its GitHub page (required only for collinearity analysis of genome data for the focal species).

  4. Clone the ksrates repository from GitHub and install the package and its Python dependencies:

    git clone https://github.com/VIB-PSB/ksrates
    # Starting from v2.0.0, when running outside a container, download also this compressed file:
    wget https://zenodo.org/records/15225340/files/original_angiosperm_sequences.tar.gz -P ksrates/ksrates/reciprocal_retention
    # Move to the ksrates subdirectory and install the package with pip
    cd ksrates
    pip3 install .
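
After the manual installation steps above, a quick check that the expected executables are reachable through $PATH can catch problems early. The following is a minimal sketch; the helper name missing_tools is hypothetical, and the executable names in the example call are assumptions based on common package naming that may differ on your system.

```shell
#!/usr/bin/env bash
# Print each named executable that is not found on PATH and
# return non-zero if any is missing.
# `missing_tools` is a hypothetical helper, not part of ksrates.
missing_tools() {
    local rc=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# Executable names below are assumptions; adjust them to your system.
if missing_tools blastp makeblastdb muscle mcl codeml ksrates; then
    echo "all listed tools found"
fi
```

This only confirms that the executables exist; it does not check the BLAST, MUSCLE or PAML versions mentioned above.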
    

Testing your installation

Note

WSL2 users can enter the Windows file system from the terminal through e.g. cd /mnt/c/Users/your_username.

  1. Clone the ksrates repository from GitHub in order to access the test dataset:

    git clone https://github.com/VIB-PSB/ksrates
    cd ksrates/test
    
  2. Launch the Nextflow ksrates pipeline with Apptainer (the execution will take a few minutes):

    nextflow run VIB-PSB/ksrates --test -profile apptainer --config config_files/config_elaeis.txt --expert config_files/config_expert.txt
    

    The first time the command is executed, Nextflow downloads a local copy of the ksrates Nextflow pipeline from the VIB-PSB/ksrates GitHub repository and stores it in the $HOME/.nextflow directory. Parameter --test is mandatory when running the test dataset. Parameter -profile pulls the Apptainer (or Docker) container from Docker Hub.

    Note

    Since the Apptainer image is by default stored in the launching folder under work/singularity, it is recommended to specify a “centralized” destination path through apptainer.cacheDir in the Nextflow configuration file located in the test directory (nextflow.config, automatically detected). See Nextflow configuration file section.

    As an alternative to the Nextflow pipeline, test the installation by executing the individual steps of the manual pipeline (with or without a container):

    apptainer exec docker://vibpsb/ksrates ksrates init config_files/config_elaeis.txt --expert config_files/config_expert.txt
    apptainer exec docker://vibpsb/ksrates ksrates paralogs-ks --test config_files/config_elaeis.txt --expert config_files/config_expert.txt --n-threads 4
    ...
    

    The --test argument is mandatory for the paralogs-ks and paralogs_ks_multi ksrates commands. See the Run example case as a manual pipeline section for more details.

  3. Remove the cloned repository once the test is successful, or keep it if you want to run the example pipeline as well (see Run example case as a Nextflow pipeline (recommended)).
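
The centralized Apptainer image location recommended in step 2 can be prepared with a few shell commands. The following is a minimal sketch that creates the directory and prints the corresponding apptainer.cacheDir line for nextflow.config; the helper name apptainer_cache_line and any path passed to it are hypothetical.

```shell
#!/usr/bin/env bash
# Create a shared directory for Apptainer images and print the
# nextflow.config line that points apptainer.cacheDir at it.
# `apptainer_cache_line` is a hypothetical helper, not part of ksrates.
apptainer_cache_line() {
    local cache_dir=$1
    mkdir -p "$cache_dir" || return 1
    printf "apptainer.cacheDir = '%s'\n" "$cache_dir"
}
```

For example, apptainer_cache_line "$HOME/apptainer_cache" prints a line that can be pasted into the nextflow.config file in the test directory, provided that file does not already set apptainer.cacheDir.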

Updating your installation

  • To update the ksrates Nextflow pipeline to the latest release, run the following command:

    nextflow pull VIB-PSB/ksrates
    
  • To update the Docker container image, run the following command to pull the new image from Docker Hub:

    docker pull vibpsb/ksrates:latest
    
  • To update the Apptainer container image, first remove the old image (when using the ksrates Nextflow pipeline the image is stored in the cacheDir directory set in the nextflow.config or, if not set, by default in work/singularity in the project folder):

    rm vibpsb-ksrates-latest.img
    

    The next time the pipeline is launched, Nextflow will automatically pull the new image from Docker Hub.

    Alternatively, run the following command in the same directory as the old image to manually pull the new image from Docker Hub:

    apptainer pull vibpsb-ksrates-latest.img docker://vibpsb/ksrates:latest
    
  • To update your manual installation, uninstall the old version of the ksrates package, clone the ksrates repository from GitHub and re-install the package:

    pip3 uninstall ksrates
    git clone https://github.com/VIB-PSB/ksrates
    # Starting from v2.0.0, when running outside a container, download also this compressed file:
    wget https://zenodo.org/records/15225340/files/original_angiosperm_sequences.tar.gz -P ksrates/ksrates/reciprocal_retention
    # Move to the ksrates subdirectory and install the package with pip
    cd ksrates
    pip3 install .
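
The removal of the old Apptainer image described above can be made a little safer by checking that the file exists before deleting it. The following is a minimal sketch; the helper name remove_old_image is hypothetical, and the default directory work/singularity is the default location mentioned above, to be replaced by your cacheDir if one is set.

```shell
#!/usr/bin/env bash
# Remove the cached ksrates Apptainer image from a given directory, if present.
# `remove_old_image` is a hypothetical helper, not part of ksrates.
remove_old_image() {
    local cache_dir=${1:-work/singularity}
    local image="$cache_dir/vibpsb-ksrates-latest.img"
    if [ -f "$image" ]; then
        rm -- "$image" && echo "removed $image"
    else
        echo "no image at $image"
    fi
}
```

Running remove_old_image without arguments from the project folder targets the default location; passing a path targets a custom cache directory.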