
Connecting Galaxy to a compute cluster




Published: Jan 7, 2018
Last Updated: Apr 20, 2023

Galaxy Job Configuration


Why cluster?

Running jobs on the Galaxy server negatively impacts Galaxy UI performance

Even adding one other host helps

Can restart Galaxy without interrupting jobs



Job Runners

Correspond to job runner plugins in lib/galaxy/jobs/runners

Plugins for: local execution, DRMAA, Slurm, HTCondor, Kubernetes, Pulsar, CLI (submission over a shell), and more
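For example, the Slurm runner is enabled by loading its plugin in job_conf.yml (the load path points at the plugin class under lib/galaxy/jobs/runners):

```yaml
runners:
  slurm:
    load: galaxy.jobs.runners.slurm:SlurmJobRunner
```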


Cluster library stack (DRMAA)



Handlers

Control how jobs are assigned to handlers (use the db-skip-locked assignment method)

Can statically define handler configuration (uncommon)
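A minimal handling section using db-skip-locked, which lets any running handler process safely claim jobs from the database:

```yaml
handling:
  assign:
  - db-skip-locked
```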



Job Config - Execution Environments

Formerly “Destinations”

Define how jobs should be run


The default job configuration

```yaml
runners:
  local:
    load: galaxy.jobs.runners.local:LocalJobRunner
    workers: 4

execution:
  default: local
  environments:
    local:
      runner: local
```


Job Config - Tags

Both environments and handlers can be grouped by tags
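A sketch of two tagged environments (the environment IDs are illustrative, and this assumes the YAML tags key mirrors the older XML tags attribute); a job sent to the tag may run in either environment:

```yaml
execution:
  default: cluster
  environments:
    slurm_normal:        # illustrative environment ID
      runner: slurm
      tags:
      - cluster
    slurm_development:   # illustrative environment ID
      runner: slurm
      tags:
      - cluster
```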


Job Environment

The env key in environments configures the job execution environment:

| syntax | function |
| ------ | -------- |
| `- {name: NAME, value: VALUE}` | Set `$NAME` to `VALUE` |
| `- {file: /path/to/file}` | Source shell file at `/path/to/file` |
| `- {execute: CMD}` | Execute `CMD` |

File sourcing and command execution are performed on the remote destination, so they do not need to work on the Galaxy server.
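Put together, a sketch of an environment using all three forms (the environment ID and the specific values are illustrative):

```yaml
execution:
  environments:
    slurm_with_env:      # illustrative environment ID
      runner: slurm
      env:
      - name: LC_ALL     # set a variable
        value: C
      - file: /etc/profile.d/modules.sh   # source a shell file
      - execute: module load bwa          # run a command
```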



Job Limits

Available limits include walltime, output size, and job concurrency


Concurrency Limits

Available limits: concurrent jobs per registered or anonymous user, overall and per environment
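A sketch of the limits section in job_conf.yml (the values are illustrative):

```yaml
limits:
- type: registered_user_concurrent_jobs
  value: 4
- type: anonymous_user_concurrent_jobs
  value: 1
```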


Shared Filesystem

Most job plugins require a shared filesystem between the Galaxy server and compute.

The exception is Pulsar. More on this in Running Jobs on Remote Resources with Pulsar.


Shared Filesystem

Our simple example works because of two important principles:

1. Some things are located at the same path on the Galaxy server and node(s)
   - Galaxy application (/srv/galaxy/server)
   - Tool dependencies
2. Some things are the same on the Galaxy server and node(s)
   - Job working directory
   - Input and output datasets

The first can be worked around with symlinks, copies, or Pulsar in embedded mode

The second can be worked around with Pulsar REST/MQ (with a performance/throughput penalty)



Multicore Jobs

Some tools can greatly improve performance by using multiple cores

Galaxy automatically sets $GALAXY_SLOTS to the CPU/core count you specify when submitting
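For example, requesting 4 tasks from Slurm (a sketch reusing the multicore_slurm environment defined later in these slides):

```yaml
execution:
  environments:
    multicore_slurm:
      runner: slurm
      native_specification: '--ntasks=4'
```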

Tool configs: Consume `${GALAXY_SLOTS:-4}` (here 4 is the fallback used when $GALAXY_SLOTS is unset)


Memory requirements

For Slurm and Grid Engine only, Galaxy will set $GALAXY_MEMORY_MB and $GALAXY_MEMORY_MB_PER_SLOT as integers.

Other DRMs: Please PR the appropriate code.

For Java tools, be sure to set -Xmx, e.g.:

```yaml
execution:
  environments:
    java_big_mem:        # illustrative environment ID
      runner: drmaa
      env:
      - name: '_JAVA_OPTIONS'
        value: '-Xmx6G'
```


Run jobs as the “real” user

If your Galaxy users == system users, Galaxy can be configured to submit cluster jobs as the actual user rather than as the Galaxy service account

See: Cluster documentation


Job Config - Mapping Tools to Environments

Problem: Tool A uses a single core, Tool B uses multiple cores


Job Config - Mapping Tools to Environments


```yaml
execution:
  default: singlecore_slurm
  environments:
    singlecore_slurm:
      runner: slurm
    multicore_slurm:
      runner: slurm
      native_specification: '--ntasks=4'

tools:
- id: hisat2
  environment: multicore_slurm
```


The Dynamic Job Runner

For when basic tool-to-environment mapping isn’t enough


The Dynamic Job Runner

A special built-in job runner plugin

Map jobs to destinations based on more than just the tool ID

Two types: Total Perspective Vortex (TPV) and arbitrary Python functions, both covered below

See: Dynamic Destination Mapping
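A hedged sketch of pointing an environment at the dynamic runner with a Python rule function (my_rule is a hypothetical function name; rule functions typically live in a file under lib/galaxy/jobs/rules):

```yaml
execution:
  environments:
    dynamic_env:         # illustrative environment ID
      runner: dynamic
      type: python
      function: my_rule  # hypothetical rule function name
```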


Total Perspective Vortex (TPV)

Powerful, fully dynamic tool-to-environment mapping based on tool, user, resource requirements, tags, and more.

Discussed in detail in its own tutorial.

See also: TPV Documentation.
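As a taste, a minimal TPV tool entry (a sketch using TPV's cores/mem keys; the tool regex and values are illustrative):

```yaml
tools:
  .*hisat2.*:
    cores: 4
    mem: cores * 4
```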

Arbitrary Python Functions

Programmable mappings: a Python function receives details about the job (tool, user, inputs, ...) and returns the destination it should run in


Key Points

Thank you!

This material is the result of a collaborative work. Thanks to the Galaxy Training Network and all the contributors! Tutorial content is licensed under the Creative Commons Attribution 4.0 International License.