Dr Nadia Davidson

Detecting fusion genes in cancer with long read transcriptome sequencing: opportunities and challenges

Genomic rearrangements can result in the fusion of sequence from two or more genes, creating a new and potentially oncogenic gene. In cancer, many fusion genes are also important diagnostic markers and targets for therapy. Therefore, diagnosing fusions can inform clinical care and reveal the biology that drives the disease. Massively parallel short read RNA sequencing has greatly expanded our knowledge of fusions, and long read sequencing promises to profile them with even greater resolution, as the full length of fusion transcripts can be discovered. However, the data generated by long read sequencing platforms, such as Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio), have a high error rate with many insertions and deletions. While numerous fusion finding algorithms have been developed for short read RNA sequencing data, few can be applied to long read data.

Here, I will examine the opportunities and challenges of using long read transcriptome sequencing for fusion detection. I will discuss how the current long read fusion finding methods overcome errors in the data. Using our own method, JAFFAL, I will demonstrate the resolution that can be achieved with long read transcriptome sequencing, namely the detection of complex multi-fusion transcripts in individual cells.

WORKSHOP: Detecting fusion genes in cancer with long read transcriptome sequencing and JAFFAL

Genomic rearrangements can result in fusion genes which drive cancer. Diagnosing fusions in patients can inform clinical care and improve our understanding of cancer biology. Long read transcriptome sequencing platforms, such as Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio), enable full length fusion transcripts to be sequenced. The generated data can then be analysed with JAFFAL to identify fusion genes and transcripts.

In this workshop we will go through the process of calling fusion genes in cancer from long read sequencing data using JAFFAL. We will walk through the steps of installing JAFFAL and applying it to a demonstration ONT dataset. We will discuss the different output files that are generated and how the fusion results can be interpreted. In particular, we will look at different types of background events and how these can be distinguished from true fusions. We will also show how three-gene fusions can be identified and how fusions can be called in long read single cell data.

Keywords: Fusions, translocations, rearrangements, cancer, RNA-Seq, long reads, ONT, PacBio, transcriptome

Requirements: Access to a Linux server or cluster where JAFFAL can be run is highly recommended; otherwise a Linux or Mac laptop or desktop will be required. Internet access will also be needed. Familiarity with the Linux command line (e.g. bash) is assumed.

Relevance: The workshop is likely to be of interest to those working with long read transcriptome sequencing of cancer samples. Much of the workshop will also be relevant to those wanting to learn how to call fusions in short read data with JAFFA.

Dr Nadia Davidson

Senior Postdoctoral Researcher, The Peter MacCallum Cancer Centre

Dr Nadia Davidson is a bioinformatician and senior postdoctoral researcher at the Peter MacCallum Cancer Centre in Melbourne. In 2022 she will move to the Walter and Eliza Hall Institute to establish a new research group within the Blood Cell and Blood Cancer Division. She received a PhD in Particle Physics from the University of Melbourne in 2011 and went on to retrain in bioinformatics under the supervision of Prof. Alicia Oshlack. Nadia’s science is centred on developing methods for RNA sequencing data analysis. She has built novel methods for cancer research, such as fusion and transcribed structural variant finders, for short and long read transcriptome sequencing data. Her past research has also informed practices for transcriptome analysis in non-model organisms. She received the Australian Bioinformatics and Computational Society Early Career Researcher award in 2019.