Applying single-cell RNA-seq analysis

PURL: https://gxy.io/GTN:P00020
Comment: What is a Learning Pathway?
We recommend you follow the tutorials in the order presented on this page. They have been selected to fit together and build up your knowledge step by step. If a lesson has both slides and a tutorial, we recommend you start with the slides, then proceed with the tutorial.

Gone is the pre-annotated, high-quality tutorial data - now you have real, messy data to deal with. You have decisions to make and parameters to set. This learning pathway challenges you to replicate a published analysis as if this were your own dataset. You will be introduced to a few more tools available for scRNA-seq in Galaxy. Finally, if our tool offerings are not enough for you, you will be shown how to use coding notebooks within Galaxy, setting you up to analyse scRNA-seq data in R or Python notebooks.

The data is messy. The decisions are tough. The interpretation is meaningful. Come here to advance your single cell skills! Note that you get two options for inferring trajectories.

For support throughout these tutorials, join our Galaxy single cell chat group on Matrix to ask questions!

New to Galaxy and/or the field of scRNA-seq? Follow this learning path to get familiar with the basics!

Module 1: Preparing the dataset

This tutorial takes you from the large files containing raw scRNA sequencing reads to a smaller, combined cell matrix.

Time estimation: 3 hours

Learning Objectives
  • Generate a cellxgene matrix for droplet-based single cell sequencing data
  • Interpret quality control (QC) plots to make informed decisions on cell thresholds
  • Find the information in GTF files that is relevant to your study, and include it in the data matrix metadata
  • Combine data matrices from different samples in the same experiment
  • Label the metadata for downstream processing
Lessons (available as slides, hands-on tutorials and/or recordings)
  • Generating a single cell matrix using Alevin
  • Combining single cell datasets after pre-processing
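
To make "combining data matrices" and "labelling the metadata" a little more concrete, here is a minimal pure-Python sketch of the idea. It is not how the Galaxy tools in these tutorials are implemented. The sample names, barcodes, gene names and the `combine_samples` helper are all hypothetical and chosen only for illustration.

```python
def combine_samples(samples):
    """Merge per-sample count tables into one cell-by-gene table.

    `samples` maps a sample name to {barcode: {gene: count}}.
    Barcodes are prefixed with the sample name so that identical
    barcodes from different samples stay distinct after merging.
    """
    # Union of all genes seen in any sample, in a stable order.
    genes = sorted({gene
                    for cells in samples.values()
                    for counts in cells.values()
                    for gene in counts})
    matrix = {}    # cell id -> list of counts, ordered like `genes`
    metadata = {}  # cell id -> per-cell annotations
    for sample, cells in samples.items():
        for barcode, counts in cells.items():
            cell_id = f"{sample}_{barcode}"
            # Genes not detected in this cell get a count of 0.
            matrix[cell_id] = [counts.get(gene, 0) for gene in genes]
            metadata[cell_id] = {"sample": sample}
    return genes, matrix, metadata


# Hypothetical toy input: two samples sharing the barcode "AAAC".
samples = {
    "N701": {"AAAC": {"Gapdh": 5, "Actb": 2}, "TTTG": {"Actb": 1}},
    "N702": {"AAAC": {"Gapdh": 3, "Cd4": 4}},
}
genes, matrix, metadata = combine_samples(samples)
print(genes)
for cell_id, row in matrix.items():
    print(cell_id, row, metadata[cell_id])
```

Keeping the sample name in the per-cell metadata is what later lets you colour plots by sample or compare conditions.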

Module 2: Generating cluster plots

These tutorials take you from the pre-processed matrix to cluster plots and gene expression values. You can pick whether to follow the Scanpy or Seurat tutorials - they will accomplish the same thing and generate the same results, so follow whichever you prefer!

Time estimation: 6 hours

Learning Objectives
  • Interpret quality control plots to direct parameter decisions
  • Repeat the analysis from matrix to clustering and cluster labelling
  • Identify decision-making points
  • Appraise data outputs and decisions
  • Explain why single cell analysis is an iterative process (i.e. the first plots you generate are not final, but rather you go back and re-analyse your data repeatedly)
Lessons (available as slides, hands-on tutorials and/or recordings)
  • Filter, plot and explore single-cell RNA-seq data with Scanpy
  • Filter, plot, and explore single cell RNA-seq data with Seurat
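
The "iterative process" mentioned in the objectives can be sketched in a few lines of plain Python: you pick thresholds, look at what survives, then adjust and repeat. This is only an illustration, not the Scanpy or Seurat workflow itself. The metrics used here (genes detected and mitochondrial fraction per cell), the threshold values and the `filter_cells` helper are assumptions made for the example.

```python
def filter_cells(cells, min_genes, max_mito_fraction):
    """Keep cells with enough detected genes and a low mitochondrial fraction."""
    return [cell for cell in cells
            if cell["n_genes"] >= min_genes
            and cell["mito_fraction"] <= max_mito_fraction]


# Hypothetical per-cell QC metrics.
cells = [
    {"id": "c1", "n_genes": 1200, "mito_fraction": 0.03},
    {"id": "c2", "n_genes": 150, "mito_fraction": 0.02},
    {"id": "c3", "n_genes": 900, "mito_fraction": 0.25},
    {"id": "c4", "n_genes": 450, "mito_fraction": 0.08},
]

# Try a lenient setting first, then a stricter one, and compare the outcome.
for min_genes, max_mito in [(200, 0.10), (500, 0.05)]:
    kept = filter_cells(cells, min_genes, max_mito)
    print(f"min_genes={min_genes}, max_mito={max_mito}: "
          f"kept {len(kept)} of {len(cells)} cells "
          f"({[cell['id'] for cell in kept]})")
```

In a real analysis you would re-plot after each round and judge whether the thresholds removed low-quality cells without discarding genuine biology.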

Module 3: Inferring trajectories

This isn’t strictly necessary, but if you want to infer trajectories - pseudotime relationships between cells - you can try out these tutorials with the same dataset. Again, you get two options for inferring trajectories, and you can choose either.

Time estimation: 5 hours

Learning Objectives
  • Execute multiple plotting methods designed to identify lineage relationships between cells
  • Interpret these plots
  • Identify which operations to perform on an AnnData object to obtain the files needed for Monocle
  • Follow the Monocle3 workflow and choose the right parameter values
  • Compare the outputs from Scanpy and Monocle
  • Interpret trajectory analysis results
Lessons (available as slides, hands-on tutorials and/or recordings)
  • Inferring single cell trajectories with Scanpy
  • Inferring single cell trajectories with Monocle3

Module 4: Moving into coding environments

Did you know Galaxy can host coding environments? They don’t have the same level of computational power as the easy-to-use Galaxy tools, but they give you full freedom in your data analysis. You can install your favourite single-cell tool suite that is not available on Galaxy, export your data into these coding environments and run your analysis there. If you want your favourite tool suite as a Galaxy tool, you can always request it. Let’s start with the basics of running these environments in Galaxy.

Time estimation: 4 hours 30 minutes

Learning Objectives
  • Launch JupyterLab in Galaxy
  • Start a notebook
  • Import libraries
  • Use get() to import datasets from your history to the notebook
  • Use put() to export datasets from the notebook to your history
  • Save your notebook into your history
  • Learn about the Jupyter Interactive Environment
  • Launch RStudio in Galaxy
Lessons (available as slides, hands-on tutorials and/or recordings)
  • JupyterLab in Galaxy
  • Use Jupyter notebooks in Galaxy
  • RStudio in Galaxy

The End!

And now you’re done! If you are interested in trying out the case study analyses in a coding environment, try out our “Case study: Reloaded” series next! Otherwise, you will find more features, tips and tricks in our general Galaxy Single-cell Training page.


Editorial Board

This material is reviewed by our Editorial Board:

Wendi Bacon, Pavankumar Videm, Pablo Moreno