
Submitting SARS-CoV-2 sequences to ENA



PURL: gxy.io/GTN:S00064



Requirements

Before diving into this slide deck, we recommend that you have a look at:


Objectives

  • Introduce the European Nucleotide Archive (ENA)

  • Learn the requirements to submit raw SARS-CoV-2 sequences to ENA in Galaxy

  • Get an overview of ENA's metadata model and how metadata objects are linked


The European Nucleotide Archive

ENA is:

  • a FAIR and Open repository for sequence data (reads, assemblies, annotations)
  • part of the International Nucleotide Sequence Database Collaboration (INSDC), together with NCBI and DDBJ
  • the COVID-19 data portal repository for SARS-CoV-2 sequences

ENA-FAIR

The European Nucleotide Archive and INSDC


SARS-CoV-2 sequences

Why is raw SARS-CoV-2 sequence data important?

  • Allows reuse of data and reproducibility of analysis
  • Enables discovery of minor allelic variants and intrahost variation

Intrahost variation

Minor allelic variants can be used to detect intrahost variation. From Maier et al., 2021, doi.org/10.1101/2021.03.25.437046


Submitting reads with Galaxy

Why use Galaxy to submit to ENA?

  • intuitive graphical user interface (GUI)
  • simple metadata input via a template spreadsheet or interactively
  • no bioinformatics skills needed

upload-tool


Submission overview

reads-submission


What you need

Data:

  • compressed fastq format (.fastq.gz, .fastq.bz2)
  • human traces removed (tutorial)
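A quick pre-check of file names can catch format problems before upload. The following is a minimal sketch, assuming only the two extensions listed above are acceptable; the function name `is_accepted_read_file` is hypothetical and not part of Galaxy or ENA tooling.

```python
# Extensions taken from the list above; nothing else is assumed to be accepted.
ACCEPTED_SUFFIXES = (".fastq.gz", ".fastq.bz2")


def is_accepted_read_file(filename: str) -> bool:
    """Return True if the file name ends with one of the accepted suffixes."""
    # str.endswith accepts a tuple and matches any of its members.
    return filename.endswith(ACCEPTED_SUFFIXES)


if __name__ == "__main__":
    for name in ["sample1_R1.fastq.gz", "sample1_R1.fastq", "sample2.fastq.bz2"]:
        print(name, is_accepted_read_file(name))
```

This only inspects the name; it does not verify that the file is really compressed FASTQ or that human reads were removed.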

Metadata:

  • interactive metadata input (for a few submissions), or
  • metadata template spreadsheet (for bulk submissions)

Credentials:

ENA-credentials


Metadata

For the submission of SARS-CoV-2 reads, ENA's metadata model requires:

  • study, sample, experiment and run information
  • additional information for viral samples (viral checklist)
metadata-model
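
The four required object types can be checked for completeness before submitting. This is a minimal sketch, assuming the metadata is held in a plain dictionary keyed by object type; the dictionary layout and the helper `missing_objects` are illustrative assumptions, not the format used by Galaxy's ENA upload tool.

```python
# The four object types named on the slide.
REQUIRED_OBJECTS = ("study", "sample", "experiment", "run")


def missing_objects(metadata: dict) -> list:
    """List required object types that are absent or have no entries."""
    return [name for name in REQUIRED_OBJECTS if not metadata.get(name)]


if __name__ == "__main__":
    # Hypothetical, incomplete metadata: only a study has been described so far.
    draft = {"study": [{"alias": "study_1"}], "sample": []}
    print(missing_objects(draft))
```

The viral checklist fields for samples are not modelled here; they would be additional attributes on each sample entry.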

Metadata

Interactive metadata input in Galaxy:

interactive metadata

Metadata

Metadata template spreadsheet:

  • one sheet each for study, sample, experiment and run
  • built-in controlled vocabulary
metadata_template

Metadata

  • Different metadata objects are linked using Aliases
  • Aliases must be unique
metadata-model
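
Because aliases must be unique, it can help to look for repeats before submitting. The sketch below assumes uniqueness is checked within one list of aliases of the same object type (for example, all sample aliases); `duplicate_aliases` is a hypothetical helper.

```python
from collections import Counter


def duplicate_aliases(aliases):
    """Return the aliases that occur more than once, sorted alphabetically."""
    counts = Counter(aliases)
    return sorted(alias for alias, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical sample aliases with one accidental repeat.
    print(duplicate_aliases(["sample_1", "sample_2", "sample_1"]))
```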

Aliases

Aliases link metadata objects:

  • Experiments are linked to Study and Samples
  • Runs are linked to Experiments
study-sample

Aliases

Aliases link metadata objects:

  • Experiments are linked to Study and Samples
  • Runs are linked to Experiments
exp-run
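
The links described above can be sketched as a small consistency check: every experiment should point at an existing study and sample, and every run at an existing experiment. The field names (`alias`, `study_alias`, `sample_alias`, `experiment_alias`) and the function `broken_links` are assumptions chosen for illustration.

```python
def broken_links(studies, samples, experiments, runs):
    """Return (object alias, field, missing target) for every dangling link."""
    study_aliases = {s["alias"] for s in studies}
    sample_aliases = {s["alias"] for s in samples}
    experiment_aliases = {e["alias"] for e in experiments}

    problems = []
    # Experiments are linked to a study and a sample.
    for exp in experiments:
        if exp["study_alias"] not in study_aliases:
            problems.append((exp["alias"], "study_alias", exp["study_alias"]))
        if exp["sample_alias"] not in sample_aliases:
            problems.append((exp["alias"], "sample_alias", exp["sample_alias"]))
    # Runs are linked to an experiment.
    for run in runs:
        if run["experiment_alias"] not in experiment_aliases:
            problems.append((run["alias"], "experiment_alias", run["experiment_alias"]))
    return problems


if __name__ == "__main__":
    # Hypothetical metadata in which run_1 points at an experiment that does not exist.
    print(broken_links(
        [{"alias": "study_1"}],
        [{"alias": "sample_1"}],
        [{"alias": "exp_1", "study_alias": "study_1", "sample_alias": "sample_1"}],
        [{"alias": "run_1", "experiment_alias": "exp_2"}],
    ))
```

An empty result means every experiment and run resolves to an object defined in the same submission.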

Aliases

Aliases link metadata to data:

  • Data (filename.fastq.gz) is linked to Run Alias
data-metadata
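
The link between data files and runs can be checked in both directions. This is a minimal sketch, assuming each run entry records its alias and a `file_name`; both field names and the helper `check_file_links` are hypothetical.

```python
def check_file_links(run_rows, uploaded_files):
    """Compare the files named in run metadata with the files actually provided.

    Returns two sorted lists: files with no run entry, and
    files named in the run metadata that were not provided.
    """
    referenced = {row["file_name"] for row in run_rows}
    uploaded = set(uploaded_files)
    return sorted(uploaded - referenced), sorted(referenced - uploaded)


if __name__ == "__main__":
    # Hypothetical run metadata and file list.
    runs = [
        {"alias": "run_1", "file_name": "sample_1.fastq.gz"},
        {"alias": "run_2", "file_name": "sample_2.fastq.gz"},
    ]
    files = ["sample_1.fastq.gz", "sample_3.fastq.gz"]
    unlinked, missing = check_file_links(runs, files)
    print("No run entry:", unlinked)
    print("Not provided:", missing)
```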

Key points

  • ENA is a FAIR data repository for SARS-CoV-2 raw and assembled nucleotide data

  • You can easily submit reads to ENA using Galaxy's ENA upload tool (GUI, no bioinformatics skills needed)


Thank You!

This material is the result of collaborative work. Thanks to the Galaxy Training Network and all the contributors!

Author(s): Miguel Roncoroni
Reviewers: Björn Grüning, Beatriz Serrano-Solano
Galaxy Training Network

Tutorial content is licensed under the Creative Commons Attribution 4.0 International License.
