Steven Jones | Genome Sciences Centre

Dr. Jones’ research program is firmly entrenched in genome science to better understand the complete mutational landscape of cancers. His primary aim is to help uncover the diversity of genetic and genomic events that accrue to give rise to cancers, and which also encourage their evolution and maintain their progression. His laboratory extensively analyzes next-generation genome and transcriptome data to achieve these goals. Dr. Jones has developed a number of novel computational approaches and methodologies to this end and has provided numerous insights into cancer dynamics, potential biomarkers and therapeutic targets. A significant part of Dr. Jones’ research program relates to developing more precise cancer treatments by exploiting an individual’s specific cancer genome profile. His research has identified numerous epigenetic targets that have the potential to be modulated in such a way as to reverse the effects of mutations within a cancer genome. Using computational approaches, his research team has identified and refined compounds that modify epigenetic programs in cancer. His laboratory also acts as a data analysis centre for the Canadian Epigenetics, Environment and Health Research Consortium (CEEHRC).

In 2005, Dr. Jones was identified as one of Canada's top 40 professionals under 40 by Caldwell Partners International as well as by Business in Vancouver. He has received the Spencer Award for IT innovation as well as the 2007 Medical Genetics teaching award from UBC. He is a founding director of the CIHR/MSFHR Bioinformatics Training Program as well as director of the UBC Bioinformatics Graduate Program. In 2011, he was inducted as a Fellow of the Royal Society of Canada for his contributions to Genomics and Bioinformatics, and in 2012 he was a recipient of the prestigious UBC Killam teaching prize in recognition of his contributions to graduate bioinformatics education. In May 2014, Dr. Jones was awarded the Distinguished Achievement Award by the Faculty of Medicine at UBC, and in June 2014 he became a Fellow of the Canadian Academy of Health Sciences. He was recognized by Clarivate Analytics in 2016 and 2018 as among the world’s most highly cited researchers in his field. Dr. Jones was born in Wales and trained at Bristol University, Simon Fraser University and the Sanger Institute.

Dr. Jones is a founder of Ifowonco Bioinformatics Inc., Caldey Informatics Inc., Evident Genomics Inc. and Alamya Health, PBC. He is an advisor to Everyone.bio Inc., Outpost Biosciences Inc. and OncoInnovations Inc. He has received travel funding for speaking engagements from Illumina Inc. and Oxford Nanopore Technologies PLC.

Dr. Jones' complete CV is available here.

Affiliations

Canada Research Chair in Computational Genomics, University of British Columbia
Professor, Medical Genetics, University of British Columbia
Director Bioinformatics, Genome BC Bioinformatics Platform, Genome BC
Founding Director, CIHR/MSFHR Bioinformatics Training Program
Director, Bioinformatics Graduate Program, University of British Columbia
Professor, Genetics Graduate Program, University of British Columbia
Associate Member, Peter Wall Institute for Advanced Studies
Associate Member, Michael Smith Laboratories, University of British Columbia

Credentials

BSc. (Hons), Biochemistry, Bristol University, UK 1990
MSc., Genetics, Simon Fraser University, BC 1994
PhD, Bioinformatics, Sanger Institute, Cambridge, UK 1999

Projects

Canada BioGenome Project

The Canada BioGenome Project is Canada's contribution to the Earth BioGenome Project, a global effort to "sequence life for the future of life". Through the Canada BioGenome Project, the GSC joins a collaborative effort to better understand and conserve our natural heritage through open-access reference genomes for 400 iconic Canadian species.

Learn more about Canada BioGenome Project

Cancer Bioinformatics and Genomics

Dr. Jones' team uses bioinformatics to investigate the landscape of mutations present in cancer genomes and the early genomic events that give rise to and promote the progression of cancer.

Learn more about Cancer Bioinformatics and Genomics

Selected Publications

Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data.

Bioinformatics (Oxford, England), 2013

Birol, Inanc, Raymond, Anthony, Jackman, Shaun D, Pleasance, Stephen, Coope, Robin, Taylor, Greg A, Yuen, Macaire Man Saint, Keeling, Christopher I, Brand, Dana, Vandervalk, Benjamin P, Kirk, Heather, Pandoh, Pawan, Moore, Richard A, Zhao, Yongjun, Mungall, Andrew J, Jaquish, Barry, Yanchuk, Alvin, Ritland, Carol, Boyle, Brian, Bousquet, Jean, Ritland, Kermit, Mackay, John, Bohlmann, Jörg, Jones, Steven J M

White spruce (Picea glauca) is a dominant conifer of the boreal forests of North America, and providing genomics resources for this commercially valuable tree will help improve forest management and conservation efforts. Sequencing and assembling the large and highly repetitive spruce genome, though, pushes the boundaries of current technology. Here, we describe a whole-genome shotgun sequencing strategy using two Illumina sequencing platforms and an assembly approach using the ABySS software. We report a 20.8 giga base pair draft genome in 4.9 million scaffolds, with a scaffold N50 of 20,356 bp. We demonstrate how recent improvements in sequencing technology, especially increasing read lengths and paired-end reads from longer fragments, have a major impact on assembly contiguity. We also note that scalable bioinformatics tools are instrumental in providing rapid draft assemblies.
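
For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name and the toy lengths are illustrative assumptions and are not taken from the paper or from ABySS.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in bp), not real spruce data.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

A higher N50 indicates that more of the assembly sits in long scaffolds, which is why the statistic is used as a shorthand for contiguity.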

The Genome of the Beluga Whale (Delphinapterus leucas).

Genes, 2017

Jones, Steven J M, Taylor, Gregory A, Chan, Simon, Warren, René L, Hammond, S Austin, Bilobram, Steven, Mordecai, Gideon, Suttle, Curtis A, Miller, Kristina M, Schulze, Angela, Chan, Amy M, Jones, Samantha J, Tse, Kane, Li, Irene, Cheung, Dorothy, Mungall, Karen L, Choo, Caleb, Ally, Adrian, Dhalla, Noreen, Tam, Angela K Y, Troussard, Armelle, Kirk, Heather, Pandoh, Pawan, Paulino, Daniel, Coope, Robin J N, Mungall, Andrew J, Moore, Richard, Zhao, Yongjun, Birol, Inanc, Ma, Yussanne, Marra, Marco, Haulena, Martin

The beluga whale is a cetacean that inhabits arctic and subarctic regions, and is the only living member of the genus Delphinapterus. The genome of the beluga whale was determined using DNA sequencing approaches that employed both microfluidic partitioning library and non-partitioned library construction. The former allowed for the construction of a highly contiguous assembly with a scaffold N50 length of over 19 Mbp and total reconstruction of 2.32 Gbp. To aid our understanding of the functional elements, transcriptome data was also derived from brain, duodenum, heart, lung, spleen, and liver tissue. Assembled sequence and all of the underlying sequence data are available at the National Center for Biotechnology Information (NCBI) under the Bioproject accession number PRJNA360851A.

Evolution of an adenocarcinoma in response to selection by targeted kinase inhibitors.

Genome biology, 2010

Jones, Steven Jm, Laskin, Janessa, Li, Yvonne Y, Griffith, Obi L, An, Jianghong, Bilenky, Mikhail, Butterfield, Yaron S, Cezard, Timothee, Chuah, Eric, Corbett, Richard, Fejes, Anthony P, Griffith, Malachi, Yee, John, Martin, Montgomery, Mayo, Michael, Melnyk, Nataliya, Morin, Ryan D, Pugh, Trevor J, Severson, Tesa, Shah, Sohrab P, Sutcliffe, Margaret, Tam, Angela, Terry, Jefferson, Thiessen, Nina, Thomson, Thomas, Varhol, Richard, Zeng, Thomas, Zhao, Yongjun, Moore, Richard A, Huntsman, David G, Birol, Inanc, Hirst, Martin, Holt, Robert A, Marra, Marco A

Adenocarcinomas of the tongue are rare and represent the minority (20 to 25%) of salivary gland tumors affecting the tongue. We investigated the utility of massively parallel sequencing to characterize an adenocarcinoma of the tongue, before and after treatment.

Circos: an information aesthetic for comparative genomics.

Genome research, 2009

Krzywinski, Martin, Schein, Jacqueline, Birol, Inanç, Connors, Joseph, Gascoyne, Randy, Horsman, Doug, Jones, Steven J, Marra, Marco A

We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.

Sources of erroneous sequences and artifact chimeric reads in next generation sequencing of genomic DNA from formalin-fixed paraffin-embedded samples.

Nucleic acids research, 2019

Haile, Simon, Corbett, Richard D, Bilobram, Steve, Bye, Morgan H, Kirk, Heather, Pandoh, Pawan, Trinh, Eva, MacLeod, Tina, McDonald, Helen, Bala, Miruna, Miller, Diane, Novik, Karen, Coope, Robin J, Moore, Richard A, Zhao, Yongjun, Mungall, Andrew J, Ma, Yussanne, Holt, Rob A, Jones, Steven J, Marra, Marco A

Tissues used in pathology laboratories are typically stored in the form of formalin-fixed, paraffin-embedded (FFPE) samples. One important consideration in repurposing FFPE material for next generation sequencing (NGS) analysis is the sequencing artifacts that can arise from the significant damage to nucleic acids due to treatment with formalin, storage at room temperature and extraction. One such class of artifacts consists of chimeric reads that appear to be derived from non-contiguous portions of the genome. Here, we show that a major proportion of such chimeric reads align to both the 'Watson' and 'Crick' strands of the reference genome. We refer to these as strand-split artifact reads (SSARs). This study provides a conceptual framework for the mechanistic basis of the genesis of SSARs and other chimeric artifacts along with supporting experimental evidence, which have led to approaches to reduce the levels of such artifacts. We demonstrate that one of these approaches, involving S1 nuclease-mediated removal of single-stranded fragments and overhangs, also reduces sequence bias, base error rates, and false positive detection of copy number and single nucleotide variants. Finally, we describe an analytical approach for quantifying SSARs from NGS data.

Comprehensive genomic profiling of glioblastoma tumors, BTICs, and xenografts reveals stability and adaptation to growth environments.

Proceedings of the National Academy of Sciences of the United States of America, 2019

Shen, Yaoqing, Grisdale, Cameron J, Islam, Sumaiya A, Bose, Pinaki, Lever, Jake, Zhao, Eric Y, Grinshtein, Natalie, Ma, Yussanne, Mungall, Andrew J, Moore, Richard A, Lun, Xueqing, Senger, Donna L, Robbins, Stephen M, Wang, Alice Yijun, MacIsaac, Julia L, Kobor, Michael S, Luchman, H Artee, Weiss, Samuel, Chan, Jennifer A, Blough, Michael D, Kaplan, David R, Cairncross, J Gregory, Marra, Marco A, Jones, Steven J M

Glioblastoma multiforme (GBM) is the most deadly brain tumor, and currently lacks effective treatment options. Brain tumor-initiating cells (BTICs) and orthotopic xenografts are widely used in investigating GBM biology and new therapies for this aggressive disease. However, the genomic characteristics and molecular resemblance of these models to GBM tumors remain undetermined. We used massively parallel sequencing technology to decode the genomes and transcriptomes of BTICs and xenografts and their matched tumors in order to delineate the potential impacts of the distinct growth environments. Using data generated from whole-genome sequencing of 201 samples and RNA sequencing of 118 samples, we show that BTICs and xenografts resemble their parental tumor at the genomic level but differ at the mRNA expression and epigenomic levels, likely due to the different growth environment for each sample type. These findings suggest that a comprehensive genomic understanding of in vitro and in vivo GBM model systems is crucial for interpreting data from drug screens, and can help control for biases introduced by cell-culture conditions and the microenvironment in mouse models. We also found that lack of expression in pretreated GBM is linked to hypermutation, which in turn contributes to increased genomic heterogeneity and requires new strategies for GBM treatment.

Genomic characterization of a well-differentiated grade 3 pancreatic neuroendocrine tumor.

Cold Spring Harbor molecular case studies, 2019

Williamson, Laura M, Steel, Michael, Grewal, Jasleen K, Thibodeau, My Lihn, Zhao, Eric Y, Loree, Jonathan M, Yang, Kevin C, Gorski, Sharon M, Mungall, Andrew J, Mungall, Karen L, Moore, Richard A, Marra, Marco A, Laskin, Janessa, Renouf, Daniel J, Schaeffer, David F, Jones, Steven J M

Pancreatic neuroendocrine neoplasms (PanNENs) represent a minority of pancreatic neoplasms that exhibit variability in prognosis. Ongoing mutational analyses of PanNENs have found recurrent abnormalities in chromatin remodeling genes (e.g., and ), and mTOR pathway genes (e.g., , , and ), some of which have relevance to patients with related familial syndromes. Most recently, grade 3 PanNENs have been divided into two groups based on differentiation, creating a new group of well-differentiated grade 3 neuroendocrine tumors (PanNETs) that have had a limited whole-genome level characterization to date. In a patient with a metastatic well-differentiated grade 3 PanNET, our study utilized whole-genome sequencing of liver metastases for the comparative analysis and detection of single-nucleotide variants, insertions and deletions, structural variants, and copy-number variants, with their biologic relevance confirmed by RNA sequencing. We found that this tumor most notably exhibited a -disrupting fusion, showed a novel fusion, and lacked any somatic variants in , , and .

Gene Fusions Are Recurrent, Clinically Actionable Gene Rearrangements in Wild-Type Pancreatic Ductal Adenocarcinoma.

Clinical cancer research : an official journal of the American Association for Cancer Research, 2019

Jones, Martin R, Williamson, Laura M, Topham, James T, Lee, Michael K C, Goytain, Angela, Ho, Julie, Denroche, Robert E, Jang, GunHo, Pleasance, Erin, Shen, Yaoquing, Karasinska, Joanna M, McGhie, John P, Gill, Sharlene, Lim, Howard J, Moore, Malcolm J, Wong, Hui-Li, Ng, Tony, Yip, Stephen, Zhang, Wei, Sadeghi, Sara, Reisle, Carolyn, Mungall, Andrew J, Mungall, Karen L, Moore, Richard A, Ma, Yussanne, Knox, Jennifer J, Gallinger, Steven, Laskin, Janessa, Marra, Marco A, Schaeffer, David F, Jones, Steven J M, Renouf, Daniel J

Gene fusions involving neuregulin 1 (NRG1) have been noted in multiple cancer types and have potential therapeutic implications. Although varying results have been reported in other cancer types, the efficacy of the HER-family kinase inhibitor afatinib in the treatment of NRG1 fusion-positive pancreatic ductal adenocarcinoma is not fully understood.

Application of a Neural Network Whole Transcriptome-Based Pan-Cancer Method for Diagnosis of Primary and Metastatic Cancers.

JAMA network open, 2019

Grewal, Jasleen K, Tessier-Cloutier, Basile, Jones, Martin, Gakkhar, Sitanshu, Ma, Yussanne, Moore, Richard, Mungall, Andrew J, Zhao, Yongjun, Taylor, Michael D, Gelmon, Karen, Lim, Howard, Renouf, Daniel, Laskin, Janessa, Marra, Marco, Yip, Stephen, Jones, Steven J M

A molecular diagnostic method that incorporates information about the transcriptional status of all genes across multiple tissue types can strengthen confidence in cancer diagnosis.

Long-read sequencing for detection and subtyping of Prader-Willi and Angelman syndromes.

Journal of medical genetics, 2024

Akbari, Vahid, Dada, Sarah, Shen, Yaoqing, Dixon, Katherine, Hejla, Duha, Galbraith, Andrew, Choufani, Sanaa, Weksberg, Rosanna, Boerkoel, Cornelius F, Stewart, Laura, Gibson, William T, Jones, Steven J M

Prader-Willi syndrome (PWS) and Angelman syndrome (AS) are imprinting disorders caused by genetic or epigenetic aberrations of 15q11.2-q13. Their clinical testing is often multitiered; diagnostic testing begins with methylation-specific multiplex ligation-dependent probe amplification or methylation-sensitive PCR and then proceeds to molecular subtyping to determine the mechanism and recurrence risk. Currently, correct classification of a proband's PWS/AS subtype often requires parental samples, a costly process for families and health systems. The use of nanopore sequencing for molecular diagnosis of PWS and AS has been explored by Yamada et al.; however, to confirm heterodisomy, parental data were still required. Here, we investigate genome-wide nanopore sequencing in a larger cohort of PWS (18) and AS (6) as a singular test to detect the molecular subtype, without parental data. We accurately subtyped these cases including uniparental heterodisomy, mixed iso-/heterodisomy, type 1 and 2 deletions, microdeletion and indels. One PWS case with a previously unresolved diagnosis was subtyped as maternal isodisomy. This work highlights the application of long-read sequencing and other imprinted regions outside of the PWS/AS critical region to resolve the molecular diagnosis and subtyping of PWS and AS without parental data. The work also outlines an approach to generically detect heterodisomy through the interrogation of distant imprinted regions.

Long-read sequencing of an advanced cancer cohort resolves rearrangements, unravels haplotypes, and reveals methylation landscapes.

Cell genomics, 2024

O'Neill, Kieran, Pleasance, Erin, Fan, Jeremy, Akbari, Vahid, Chang, Glenn, Dixon, Katherine, Csizmok, Veronika, MacLennan, Signe, Porter, Vanessa, Galbraith, Andrew, Grisdale, Cameron J, Culibrk, Luka, Dupuis, John H, Corbett, Richard, Hopkins, James, Bowlby, Reanne, Pandoh, Pawan, Smailus, Duane E, Cheng, Dean, Wong, Tina, Frey, Connor, Shen, Yaoqing, Lewis, Eleanor, Paulin, Luis F, Sedlazeck, Fritz J, Nelson, Jessica M T, Chuah, Eric, Mungall, Karen L, Moore, Richard A, Coope, Robin, Mungall, Andrew J, McConechy, Melissa K, Williamson, Laura M, Schrader, Kasmintan A, Yip, Stephen, Marra, Marco A, Laskin, Janessa, Jones, Steven J M

The Long-Read Personalized OncoGenomics (POG) dataset comprises a cohort of 189 patient tumors and 41 matched normal samples sequenced using the Oxford Nanopore Technologies PromethION platform. This dataset from the POG program and the Marathon of Hope Cancer Centres Network includes DNA and RNA short-read sequence data, analytics, and clinical information. We show the potential of long-read sequencing for resolving complex cancer-related structural variants, viral integrations, and extrachromosomal circular DNA. Long-range phasing facilitates the discovery of allelically differentially methylated regions (aDMRs) and allele-specific expression, including recurrent aDMRs in the cancer genes RET and CDKN2A. Germline promoter methylation in MLH1 can be directly observed in Lynch syndrome. Promoter methylation in BRCA1 and RAD51C is a likely driver behind homologous recombination deficiency where no coding driver mutation was found. This dataset demonstrates applications for long-read sequencing in precision medicine and is available as a resource for developing analytical approaches using this technology.

A second update on mapping the human genetic architecture of COVID-19.

Nature, 2023

The genome sequence of the Loggerhead sea turtle, Caretta caretta Linnaeus 1758.

F1000Research, 2023

Chang, Glenn, Jones, Samantha, Leelakumari, Sreeja, Ashkani, Jahanshah, Culibrk, Luka, O'Neill, Kieran, Tse, Kane, Cheng, Dean, Chuah, Eric, McDonald, Helen, Kirk, Heather, Pandoh, Pawan, Pari, Sauro, Angelini, Valeria, Kyle, Christopher, Bertorelle, Giorgio, Zhao, Yongjun, Mungall, Andrew, Moore, Richard, Vilaça, Sibelle, Jones, Steven

We present a genome assembly of Caretta caretta (the Loggerhead sea turtle; Chordata, Testudines, Cheloniidae), generated from genomic data from two unrelated females. The genome sequence is 2.13 gigabases in size. The assembly has a BUSCO completion score of 96.1% and N50 of 130.95 Mb. The majority of the assembly is scaffolded into 28 chromosomal representations with a remaining 2% of the assembly being excluded from these.

Defining the heterogeneity of unbalanced structural variation underlying breast cancer susceptibility by nanopore genome sequencing.

European journal of human genetics : EJHG, 2023

Dixon, Katherine, Shen, Yaoqing, O'Neill, Kieran, Mungall, Karen L, Chan, Simon, Bilobram, Steve, Zhang, Wei, Bezeau, Marjorie, Sharma, Alshanee, Fok, Alexandra, Mungall, Andrew J, Moore, Richard, Bosdet, Ian, Thibodeau, My Linh, Sun, Sophie, Yip, Stephen, Schrader, Kasmintan A, Jones, Steven J M

Germline structural variants (SVs) are challenging to resolve by conventional genetic testing assays. Long-read sequencing has improved the global characterization of SVs, but its sensitivity at cancer susceptibility loci has not been reported. Nanopore long-read genome sequencing was performed for nineteen individuals with pathogenic copy number alterations in BRCA1, BRCA2, CHEK2 and PALB2 identified by prior clinical testing. Fourteen variants, which spanned single exons to whole genes and included a tandem duplication, were accurately represented. Defining the precise breakpoints of SVs in BRCA1 and CHEK2 revealed unforeseen allelic heterogeneity and informed the mechanisms underlying the formation of recurrent deletions. Integrating read-based and statistical phasing further helped define extended haplotypes associated with founder alleles. Long-read sequencing is a sensitive method for characterizing private, recurrent and founder SVs underlying breast cancer susceptibility. Our findings demonstrate the potential for nanopore sequencing as a powerful genetic testing assay in the hereditary cancer setting.

Parent-of-origin detection and chromosome-scale haplotyping using long-read DNA methylation sequencing and Strand-seq.

Cell genomics, 2023

Akbari, Vahid, Hanlon, Vincent C T, O'Neill, Kieran, Lefebvre, Louis, Schrader, Kasmintan A, Lansdorp, Peter M, Jones, Steven J M

Hundreds of loci in human genomes have alleles that are methylated differentially according to their parent of origin. These imprinted loci generally show little variation across tissues, individuals, and populations. We show that such loci can be used to distinguish the maternal and paternal homologs for all human autosomes without the need for the parental DNA. We integrate methylation-detecting nanopore sequencing with the long-range phase information in Strand-seq data to determine the parent of origin of chromosome-length haplotypes for both DNA sequence and DNA methylation in five trios with diverse genetic backgrounds. The parent of origin was correctly inferred for all autosomes with an average mismatch error rate of 0.31% for SNVs and 1.89% for insertions or deletions (indels). Because our method can determine whether an inherited disease allele originated from the mother or the father, we predict that it will improve the diagnosis and management of many genetic diseases.