Dr. Jones' team uses bioinformatics to investigate the landscape of mutations present in cancer genomes and the early genomic events that give rise to and promote the progression of cancer. To achieve these goals, his laboratory analyzes next-generation sequencing data and develops novel computational approaches and methodologies. A significant aim of Dr. Jones' research program is to find innovative ways to exploit specific genomic profiles within an individual cancer for therapeutic purposes. For example, his team has identified a number of epigenetic modifications that could be targeted to reverse the effects of cancer-initiating mutations. His lab is using computational approaches, such as molecular docking and molecular dynamics, to identify and refine compounds that can modify the cancer epigenome.



The Jones Lab is located at Canada's Michael Smith Genome Sciences Centre, Echelon Technology Platform.
570 West 7th Avenue 
Vancouver, British Columbia 
V5Z 4S6 


Selected Publications

Genome-wide detection of imprinted differentially methylated regions using nanopore sequencing

Vahid Akbari, Jean-Michel Garant, Kieran O'Neill, Pawan Pandoh, Richard Moore, Marco A Marra, Martin Hirst, Steven JM Jones.

Imprinting is a critical part of normal embryonic development in mammals, controlled by defined parent-of-origin (PofO) differentially methylated regions (DMRs) known as imprinting control regions. Direct nanopore sequencing of DNA provides a means to detect allelic methylation and to overcome the drawbacks of methylation array and short-read technologies. Here, we used publicly available nanopore sequencing data for 12 standard B-lymphocyte cell lines to acquire the genome-wide mapping of imprinted intervals in humans. Using the sequencing data, we were able to phase 95% of the human methylome and detect 94% of the previously well-characterized, imprinted DMRs. In addition, we found 42 novel imprinted DMRs (16 germline and 26 somatic), which were confirmed using whole-genome bisulfite sequencing (WGBS) data. Analysis of WGBS data in mouse (Mus musculus), rhesus monkey (Macaca mulatta), and chimpanzee (Pan troglodytes) suggested that 17 of these imprinted DMRs are conserved. Some of the novel imprinted intervals are within or close to imprinted genes without a known DMR. We also detected subtle parental methylation bias, spanning several kilobases at seven known imprinted clusters. At these blocks, hypermethylation occurs at the gene body of expressed allele(s) with mutually exclusive H3K36me3 and H3K27me3 allelic histone marks. These results expand upon our current knowledge of imprinting and the potential of nanopore sequencing to identify imprinting regions using only parent-offspring trios, as opposed to the large multi-generational pedigrees that have previously been required.

A community approach to the cancer-variant-interpretation bottleneck

Nature Cancer
Krysiak K, Danos AM, Kiwala S, McMichael JF, Coffman AC, Barnell EK, Sheta L, Saliba J, Grisdale CJ, Kujan L, Pema S, Lever J, Spies NC, Chiorean A, Rieke DT, Clark KA, Jani P, Takahashi H, Horak P, Ritter DI, Zhou X, Ainscough BJ, Delong S, Lamping M, Marr AR, Li BV, Lin WH, Terraf P, Salama Y, Campbell KM, Farncombe KM, Ji J, Zhao X, Xu X, Kanagal-Shamanna R, Cotto KC, Skidmore ZL, Walker JR, Zhang J, Milosavljevic A, Patel RY, Giles RH, Kim RH, Schriml LM, Mardis ER, Jones SJM, Raca G, Rao S, Madhavan S, Wagner AH, Griffith OL, Griffith M.

As guidelines, therapies and literature on cancer variants expand, the lack of consensus variant interpretations impedes clinical applications. CIViC is a public-domain, crowd-sourced and adaptable knowledgebase of evidence for the clinical interpretation of variants in cancer, designed to reduce barriers to knowledge sharing and alleviate the variant-interpretation bottleneck.

Whole-genome and transcriptome analysis of advanced adrenocortical cancer highlights multiple alterations affecting epigenome and DNA repair pathways

Cold Spring Harbor Molecular Case Studies
Jean-Michel Lavoie, Veronika Csizmok, Laura M Williamson, Luka Culibrk, Gang Wang, Marco A Marra, Janessa Laskin, Steven JM Jones, Daniel J Renouf, Christian K Kollmannsberger.

Adrenocortical cancer (ACC) is a rare cancer of the adrenal gland. Several driver mutations have been identified in both primary and metastatic ACCs, but the therapeutic options are still limited. We performed whole-genome and transcriptome sequencing on seven patients with metastatic ACC. Integrative analysis of mutations, RNA expression changes, mutation signature, and homologous recombination deficiency (HRD) analysis was performed. Mutations affecting CTNNB1 and TP53 and frequent loss of heterozygosity (LOH) events were observed in our cohort. Alterations affecting genes involved in cell cycle (RB1, CDKN2A, CDKN2B), DNA repair pathways (MUTYH, BRCA2, ATM, RAD52, MLH1, MSH6), and telomere maintenance (TERF2 and TERT) consisting of somatic and germline mutations, structural variants, and expression outliers were also observed. HRDetect, which aggregates six HRD-associated mutation signatures, identified a subset of cases as HRD. Genomic alterations affecting genes involved in epigenetic regulation were also identified, including structural variants (SWI/SNF genes and histone methyltransferases), and copy gains and concurrent high expression of KDM5A, which may contribute to epigenomic deregulation. Findings from this study highlight HRD and epigenomic pathways as potential therapeutic targets and suggest a subgroup of patients may benefit from a diverse array of molecularly targeted therapies in ACC, a rare disease in urgent need of therapeutic strategies.

cSurvival: a web resource for biomarker interactions in cancer outcomes and in cell lines

Briefings in Bioinformatics
Xuanjin Cheng, Yongxing Liu, Jiahe Wang, Yujie Chen, Andrew Gordon Robertson, Xuekui Zhang, Steven JM Jones, Stefan Taubert

Survival analysis is a technique for identifying prognostic biomarkers and genetic vulnerabilities in cancer studies. Large-scale consortium-based projects have profiled >11 000 adult and >4000 pediatric tumor cases with clinical outcomes and multiomics approaches. This provides a resource for investigating molecular-level cancer etiologies using clinical correlations. Although cancers often arise from multiple genetic vulnerabilities and have deregulated gene sets (GSs), existing survival analysis protocols can report only on individual genes. Additionally, there is no systematic method to connect clinical outcomes with experimental (cell line) data. To address these gaps, we developed cSurvival. cSurvival provides a user-adjustable analytical pipeline with a curated, integrated database and offers three main advances: (i) joint analysis with two genomic predictors to identify interacting biomarkers, including new algorithms to identify optimal cutoffs for two continuous predictors; (ii) survival analysis not only at the gene, but also the GS level; and (iii) integration of clinical and experimental cell line studies to generate synergistic biological insights. To demonstrate these advances, we report three case studies. We confirmed findings of autophagy-dependent survival in colorectal cancers and of synergistic negative effects between high expression of SLC7A11 and SLC2A1 on outcomes in several cancers. We further used cSurvival to identify high expression of the Nrf2-antioxidant response element pathway as a main indicator for lung cancer prognosis and for cellular resistance to oxidative stress-inducing drugs. Altogether, these analyses demonstrate cSurvival's ability to support biomarker prognosis and interaction analysis via gene- and GS-level approaches and to integrate clinical and experimental biomedical studies.

The impact of whole genome and transcriptome analysis (WGTA) on predictive biomarker discovery and diagnostic accuracy of advanced malignancies

The Journal of Pathology Clinical Research
Basile Tessier-Cloutier, Jasleen K Grewal, Martin R Jones, Erin Pleasance, Yaoqing Shen, Ellen Cai, Chris Dunham, Lynn Hoang, Basil Horst, David G Huntsman, Diana Ionescu, Anthony N Karnezis, Anna F Lee, Cheng Han Lee, Tae Hoon Lee, David Dw Twa, Andrew J Mungall, Karen Mungall, Julia R Naso, Tony Ng, David F Schaeffer, Brandon S Sheffield, Brian Skinnider, Tyler Smith, Laura Williamson, Ellia Zhong, Dean A Regier, Janessa Laskin, Marco A Marra, C Blake Gilks, Steven JM Jones, Stephen Yip

In this study, we evaluate the impact of whole genome and transcriptome analysis (WGTA) on predictive molecular profiling and histologic diagnosis in a cohort of advanced malignancies. WGTA was used to generate reports including molecular alterations and site/tissue of origin prediction. Two reviewers analyzed genomic reports, clinical history, and tumor pathology. We used National Comprehensive Cancer Network (NCCN) consensus guidelines, Food and Drug Administration (FDA) approvals, and provincially reimbursed treatments to define genomic biomarkers associated with approved targeted therapeutic options (TTOs). Tumor tissue/site of origin was reassessed for most cases using genomic analysis, including a machine learning algorithm (Supervised Cancer Origin Prediction Using Expression [SCOPE]) trained on The Cancer Genome Atlas data. WGTA was performed on 652 cases, including a range of primary tumor types/tumor sites and 15 malignant tumors of uncertain histogenesis (MTUH). At the time WGTA was performed, alterations associated with an approved TTO were identified in 39 (6%) cases; 3 of these were not identified through routine pathology workup. In seven (1%) cases, the pathology workup either failed, was not performed, or gave a different result from the WGTA. Approved TTOs identified by WGTA increased to 103 (16%) when applying 2021 guidelines. The histopathologic diagnosis was reviewed in 389 cases and agreed with the diagnostic consensus after WGTA in 94% of non-MTUH cases (n = 374). The remainder included situations where the morphologic diagnosis was changed based on WGTA and clinical data (0.5%), or where the WGTA was non-contributory (5%). The 15 MTUH were all diagnosed as specific tumor types by WGTA. Tumor board reviews including WGTA agreed with almost all initial predictive molecular profile and histopathologic diagnoses. WGTA was a powerful tool to assign site/tissue of origin in MTUH. 
Current efforts focus on improving therapeutic predictive power and decreasing cost to enhance use of WGTA data as a routine clinical test.

Complex Autism Spectrum Disorder with Epilepsy, Strabismus and Self-Injurious Behaviors in a Patient with a De Novo Heterozygous POLR2A Variant

Daniel R Evans, Ying Qiao, Brett Trost, Kristina Calli, Sally Martell, Steven JM Jones, Stephen W Scherer, ME Suzanne Lewis

Autism spectrum disorder (ASD) describes a complex and heterogeneous group of neurodevelopmental disorders. Whole genome sequencing continues to shed light on the multifactorial etiology of ASD. Dysregulated transcriptional pathways have been implicated in neurodevelopmental disorders. Emerging evidence suggests that de novo POLR2A variants cause a newly described phenotype called 'Neurodevelopmental Disorder with Hypotonia and Variable Intellectual and Behavioral Abnormalities' (NEDHIB). The variable phenotype manifests with a spectrum of features, primarily early-onset hypotonia and delay in developmental milestones. In this study, we investigate a patient with complex ASD involving epilepsy and strabismus. Whole genome sequencing of the proband-parent trio uncovered a novel de novo POLR2A variant (c.1367T>C, p.Val456Ala) in the proband. The variant appears deleterious according to in silico tools. We describe the phenotype in our patient, who is now 31 years old, draw connections between the previously reported phenotypes and further delineate this emerging neurodevelopmental phenotype. This study offers new insights into this neurodevelopmental disorder and, more broadly, the genetic etiology of ASD.

Long-read genome sequencing resolves a complex 13q structural variant associated with syndromic anophthalmia

American Journal of Medical Genetics Part A
Pierre K Boerkoel, Katherine Dixon, Carrie Fitzsimons, Yaoqing Shen, Stephanie Huynh, Kamilla Schlade-Bartusiak, Luka Culibrk, Simon Chan, Cornelius F Boerkoel, Steven JM Jones, Hui-Lin Chin

Microphthalmia, anophthalmia, and coloboma (MAC) are a heterogeneous spectrum of anomalous eye development and degeneration with genetic and environmental etiologies. Structural and copy number variants of chromosome 13 have been implicated in MAC; however, the specific loci involved in disease pathogenesis have not been well-defined. Herein we report a newborn with syndromic degenerative anophthalmia and a complex de novo rearrangement of chromosome 13q. Long-read genome sequencing improved the resolution and clinical interpretation of a duplication-triplication/inversion-duplication (DUP-TRP/INV-DUP) and terminal deletion. Sequence features at the breakpoint junctions suggested microhomology-mediated break-induced replication (MMBIR) of the maternal chromosome as the origin. Comparing this rearrangement to previously reported copy number alterations in 13q, we refine a putative dosage-sensitive critical region for MAC that might provide new insights into its molecular etiology.

The pink salmon genome: Uncovering the genomic consequences of a two-year life cycle

PLoS One
Kris A Christensen, Eric B Rondeau, Dionne Sakhrani, Carlo A Biagi, Hollie Johnson, Jay Joshi, Anne-Marie Flores, Sreeja Leelakumari, Richard Moore, Pawan K Pandoh, Ruth E Withler, Terry D Beacham, Rosalind A Leggatt, Carolyn M Tarpey, Lisa W Seeb, James E Seeb, Steven JM Jones, Robert H Devlin, Ben F Koop

Pink salmon (Oncorhynchus gorbuscha) adults are the smallest of the five Pacific salmon native to the western Pacific Ocean. Pink salmon are also the most abundant of these species and account for a large proportion of the commercial value of the salmon fishery worldwide. A two-year life history of pink salmon generates temporally isolated populations that spawn either in even-years or odd-years. To uncover the influence of this genetic isolation, reference genome assemblies were generated for each year-class and whole genome re-sequencing data was collected from salmon of both year-classes. The salmon were sampled from six Canadian rivers and one Japanese river. At multiple centromeres we identified peaks of Fst between year-classes that were millions of base-pairs long. The largest Fst peak was also associated with a million base-pair chromosomal polymorphism found in the odd-year genome near a centromere. These Fst peaks may be the result of a centromere drive or a combination of reduced recombination and genetic drift, and they could influence speciation. Other regions of the genome influenced by odd-year and even-year temporal isolation and tentatively under selection were mostly associated with genes related to immune function, organ development/maintenance, and behaviour.

Early-stage economic analysis of research-based comprehensive genomic sequencing for advanced cancer care

Journal of Community Genetics
Deirdre Weymann, Janessa Laskin, Steven JM Jones, Robyn Roscoe, Howard J Lim, Daniel J Renouf, Kasmintan A Schrader, Sophie Sun, Stephen Yip, Marco A Marra, Dean A Regier

Genomic research is driving discovery for future population benefit. Limited evidence exists on immediate patient and health system impacts of research participation. This study uses real-world data and quasi-experimental matching to examine early-stage cost and health impacts of research-based genomic sequencing. British Columbia’s Personalized OncoGenomics (POG) single-arm program applies whole genome and transcriptome analysis (WGTA) to characterize genomic landscapes in advanced cancers. Our cohort includes POG patients enrolled between 2014 and 2015 and 1:1 genetic algorithm–matched usual care controls. We undertake a cost consequence analysis and estimate 1-year effects of WGTA on patient management, patient survival, and health system costs reported in 2015 Canadian dollars. WGTA costs are imputed and forecast using system of equations modeling. We use Kaplan-Meier survival analysis to explore survival differences and inverse probability of censoring weighted linear regression to estimate mean 1-year survival times and costs. Non-parametric bootstrapping simulates sampling distributions and enables scenario analysis, revealing drivers of incremental costs, survival, and net monetary benefit for assumed willingness to pay thresholds. We identified 230 POG patients and 230 matched controls for cohort inclusion. The mean period cost of research-funded WGTA was $26,211 (SD: $14,191). Sequencing costs declined rapidly, with WGTA forecasts hitting $13,741 in 2021. The incremental healthcare system effect (non-research expenditures) was $5203 (95% CI: 75, 10,424) compared to usual care. No overall survival differences were observed, but outcome heterogeneity was present. POG patients receiving WGTA-informed treatment experienced incremental survival gains of 2.49 months (95% CI: 1.32, 3.64). Future cost consequences became favorable as WGTA cost drivers declined and WGTA-informed treatment rates improved to 60%.
Our study demonstrates the ability of real-world data to support evaluations of only-in-research health technologies. We identify situations where precision oncology research initiatives may produce survival benefit at a cost that is within healthcare systems’ willingness to pay. This economic evidence informs the early-stage healthcare impacts of precision oncology research.

An infant with congenital respiratory insufficiency and diaphragmatic paralysis: A novel BICD2 phenotype?

American Journal of Medical Genetics Part A
Hui-Lin Chin, Stephanie Huynh, Jahanshah Ashkani, Michael Castaldo, Katherine Dixon, Kathryn Selby, Yaoqing Shen, Marie Wright, Cornelius F Boerkoel, Glenda Hendson, Steven JM Jones

Monoallelic pathogenic variants in BICD2 are associated with autosomal dominant Spinal Muscular Atrophy Lower Extremity Predominant 2A and 2B (SMALED2A, SMALED2B). As part of the cellular vesicular transport complex, BICD2 facilitates the flow of constitutive secretory cargoes from the trans-Golgi network, and its dysfunction results in motor neuron loss. The reported phenotypes among patients with SMALED2A and SMALED2B range from a congenital onset disorder of respiratory insufficiency, arthrogryposis, and proximal or distal limb weakness to an adult-onset disorder of limb weakness and contractures. We report an infant with congenital respiratory insufficiency requiring mechanical ventilation, congenital diaphragmatic paralysis, decreased lung volume, and single finger camptodactyly. The infant displayed appropriate antigravity limb movements but had radiological, electrophysiological, and histopathological evidence of myopathy. Exome sequencing and long-read whole-genome sequencing detected a novel de novo BICD2 variant (NM_001003800.1:c.[1543G>A];[=]). This is predicted to encode p.(Glu515Lys); p.Glu515 is located in the coiled-coil 2 mutation hotspot. We hypothesize that this novel phenotype of diaphragmatic paralysis without clear appendicular muscle weakness and contractures of large joints is a presentation of BICD2-related disease.

GA4GH: International policies and standards for data sharing across genomic research and healthcare

Cell Genomics
Heidi L Rehm, et al. (including Steven JM Jones).

The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits.

Draft genome sequence of the lichenized fungus Bacidia gigantensis

Microbiology Resource Announcements
Jessica L Allen, Steven JM Jones, R Troy McMullin

The draft genome sequence of Bacidia gigantensis, a lichenized fungus in the order Lecanorales, was sequenced directly from a herbarium specimen collected from the type locality at Sleeping Giant Provincial Park in Ontario, Canada. Using long-read sequencing on the Oxford Nanopore PromethION platform, we assembled a nearly complete genome sequence.

Optimization of magnetic bead-based nucleic acid extraction for SARS-CoV-2 testing using readily available reagents

Journal of Virological Methods
Simon Haile, Aidan M Nikiforuk, Pawan K Pandoh, David D W Twa, Duane E Smailus, Jason Nguyen, Stephen Pleasance, Angus Wong, Yongjun Zhao, Diane Eisler, Michelle Moksa, Qi Cao, Marcus Wong, Edmund Su, Martin Krzywinski, Jessica Nelson, Andrew J Mungall, Frankie Tsang, Leah M Prentice, Agatha Jassem, Amee R Manges, Steven J M Jones, Robin J Coope, Natalie Prystajecky, Marco A Marra, Mel Krajden, Martin Hirst

The COVID-19 pandemic has highlighted the need for generic reagents and flexible systems in diagnostic testing. Magnetic bead-based nucleic acid extraction protocols using 96-well plates on open liquid handlers are readily amenable to meet this need. Here, one such approach is rigorously optimized to minimize cross-well contamination while maintaining sensitivity.

Contribution of Multiple Inherited Variants to Autism Spectrum Disorder (ASD) in a Family with 3 Affected Siblings

Jasleen Dhaliwal, Ying Qiao, Kristina Calli, Sally Martell, Simone Race, Chieko Chijiwa, Armansa Glodjo, Steven Jones, Evica Rajcan-Separovic, Stephen W Scherer, Suzanne Lewis

Autism Spectrum Disorder (ASD) is the most common neurodevelopmental disorder in children and shows high heritability. However, how inherited variants contribute to ASD in multiplex families remains unclear. Using whole-genome sequencing (WGS) in a family with three affected children, we identified multiple inherited DNA variants in ASD-associated genes and pathways (RELN, SHANK2, DLG1, SCN10A, KMT2C and ASH1L). All are shared among the three children, except ASH1L, which is only present in the most severely affected child. The compound heterozygous variants in RELN, and the maternally inherited variant in SHANK2, are considered to be major risk factors for ASD in this family. Both genes are involved in neuron activities, including synaptic functions and the GABAergic neurotransmission system, which are highly associated with ASD pathogenesis. DLG1 is also involved in synapse functions, and KMT2C and ASH1L are involved in chromatin organization. Our data suggest that multiple inherited rare variants, each with a subthreshold and/or variable effect, may converge to certain pathways and contribute quantitatively and additively, or alternatively act via a 2nd-hit or multiple-hits to render pathogenicity of ASD in this family. Additionally, this multiple-hits model further supports the quantitative trait hypothesis of a complex genetic, multifactorial etiology for the development of ASDs.

An approach to rapid characterization of DMD copy number variants for prenatal risk assessment

American Journal of Medical Genetics, 2021
Hui-Lin Chin, Kieran O'Neill, Kristal Louie, Lindsay Brown, Kamilla Schlade-Bartusiak, Patrice Eydoux, Rosemarie Rupps, Ali Farahani, Cornelius F Boerkoel, Steven J M Jones

Prenatal detection of structural variants of uncertain significance, including copy number variants (CNV), challenges genetic counseling, and creates ambiguity for expectant parents. In Duchenne muscular dystrophy, variant classification and phenotypic severity of CNVs are currently assessed by familial segregation, prediction of the effect on the reading frame, and precedent data. Delineation of pathogenicity by familial segregation is limited by time and suitable family members, whereas analytical tools can rapidly delineate potential consequences of variants. We identified a duplication of uncertain significance encompassing a portion of the dystrophin gene (DMD) in an unaffected mother and her male fetus. Using long-read whole genome sequencing and alignment of short reads, we rapidly defined the precise breakpoints of this variant in DMD and could provide timely counseling. The benign nature of the variant was substantiated, more slowly, by familial segregation to a healthy maternal uncle. We find long-read whole genome sequencing of clinical utility in a prenatal setting for accurate and rapid characterization of structural variants, specifically a duplication involving DMD.

Rare loss-of-function variants in type I IFN immunity genes are not associated with severe COVID-19

The Journal of Clinical Investigation
Gundula Povysil, Guillaume Butler-Laporte, Ning Shang, Chen Wang, Atlas Khan, Manal Alaamery, Tomoko Nakanishi, Sirui Zhou, Vincenzo Forgetta, Robert JM Eveleigh, Mathieu Bourgey, Naveed Aziz, Steven JM Jones, Bartha Knoppers, Stephen W Scherer, Lisa J Strug, Pierre Lepage, Jiannis Ragoussis, Guillaume Bourque, Jahad Alghamdi, Nora Aljawini, Nour Albes, Hani M Al-Afghani, Bader Alghamdi, Mansour S Almutairi, Ebrahim Sabri Mahmoud, Leen Abu-Safieh, Hadeel El Bardisy, Fawz S Al Harthi, Abdulraheem Alshareef, Bandar Ali Suliman, Saleh A Alqahtani, Abdulaziz Almalik, May M Alrashed, Salam Massadeh, Vincent Mooser, Mark Lathrop, Mohamed Fawzy, Yaseen M Arabi, Hamdi Mbarek, Chadi Saad, Wadha Al-Muftah, Junghyun Jung, Serghei Mangul, Radja Badji, Asma Al Thani, Said I Ismail, Ali G Gharavi, Malak S Abedalthagafi, J Brent Richards, David B Goldstein, Krzysztof Kiryluk

A recent report found that rare predicted loss-of-function (pLOF) variants across 13 candidate genes in TLR3- and IRF7-dependent type I IFN pathways explain up to 3.5% of severe COVID-19 cases. We performed whole-exome or whole-genome sequencing of 1,864 COVID-19 cases (713 with severe and 1,151 with mild disease) and 15,033 ancestry-matched population controls across 4 independent COVID-19 biobanks. We tested whether rare pLOF variants in these 13 genes were associated with severe COVID-19. We identified only 1 rare pLOF mutation across these genes among 713 cases with severe COVID-19 and observed no enrichment of pLOFs in severe cases compared to population controls or mild COVID-19 cases. We found no evidence of association of rare LOF variants in the 13 candidate genes with severe COVID-19 outcomes.

In vitro modeling of glioblastoma initiation using PDGF-AA and p53-null neural progenitors

Alexandra K Bohm, Jessica DePetro, Carmen E Binding, Amanda Gerber, Nicholas Chahley, N Dan Berger, Mathaeus Ware, Kaitlin Thomas, U Senapathi, Shazreh Bukhari, Cindy Chen, Erin Chahley, Cameron Grisdale, Sam Lawn, Yaping Yu, Raymond Wong, Yaoqing Shen, Hiba Omairi, Reza Mirzaei, Nourah Alshatti, Haley Pedersen, Wee Yong, Samuel Weiss, Jennifer Chan, P J Cimino, John Kelly, Steve Jones, Eric Holland, Michael Blough, Gregory Cairncross


Imagining ways to prevent or treat glioblastoma (GBM) has been hindered by a lack of understanding of its pathogenesis. Although overexpression of platelet derived growth factor with two A-chains (PDGF-AA) may be an early event, critical details of the core biology of GBM are lacking. For example, existing PDGF-driven models replicate its microscopic appearance, but not its genomic architecture. Here we report a model that overcomes this barrier to authenticity.


Using a method developed to establish neural stem cell cultures, we investigated the effects of PDGF-AA on subventricular zone (SVZ) cells, one of the putative cells of origin of GBM. We microdissected SVZ tissue from p53-null and wild-type adult mice, cultured cells in media supplemented with PDGF-AA, and assessed cell viability, proliferation, genome stability, and tumorigenicity.


Counterintuitive to its canonical role as a growth factor, we observed abrupt and massive cell death in PDGF-AA: wild-type cells did not survive, whereas a small fraction of null cells evaded apoptosis. Surviving null cells displayed attenuated proliferation accompanied by whole chromosome gains and losses. After approximately 100 days in PDGF-AA, cells suddenly proliferated rapidly, acquired growth factor independence, and became tumorigenic in immune-competent mice. Transformed cells had an oligodendrocyte precursor-like lineage marker profile, were resistant to platelet derived growth factor receptor alpha inhibition, and harbored highly abnormal karyotypes similar to human GBM.


This model associates genome instability in neural progenitor cells with chronic exposure to PDGF-AA and is the first to approximate the genomic landscape of human GBM and the first in which the earliest phases of the disease can be studied directly.

NRG1 Gene Fusions Are Recurrent, Clinically Actionable Gene Rearrangements in KRAS Wild-Type Pancreatic Ductal Adenocarcinoma.

Clinical cancer research : an official journal of the American Association for Cancer Research, 2019
Jones, Martin R, Williamson, Laura M, Topham, James T, Lee, Michael K C, Goytain, Angela, Ho, Julie, Denroche, Robert E, Jang, GunHo, Pleasance, Erin, Shen, Yaoquing, Karasinska, Joanna M, McGhie, John P, Gill, Sharlene, Lim, Howard J, Moore, Malcolm J, Wong, Hui-Li, Ng, Tony, Yip, Stephen, Zhang, Wei, Sadeghi, Sara, Reisle, Carolyn, Mungall, Andrew J, Mungall, Karen L, Moore, Richard A, Ma, Yussanne, Knox, Jennifer J, Gallinger, Steven, Laskin, Janessa, Marra, Marco A, Schaeffer, David F, Jones, Steven J M, Renouf, Daniel J
Gene fusions involving neuregulin 1 (NRG1) have been noted in multiple cancer types and have potential therapeutic implications. Although varying results have been reported in other cancer types, the efficacy of the HER-family kinase inhibitor afatinib in the treatment of NRG1 fusion-positive pancreatic ductal adenocarcinoma is not fully understood.

Genomic characterization of a well-differentiated grade 3 pancreatic neuroendocrine tumor.

Cold Spring Harbor molecular case studies, 2019
Williamson, Laura M, Steel, Michael, Grewal, Jasleen K, Thibodeau, My Lihn, Zhao, Eric Y, Loree, Jonathan M, Yang, Kevin C, Gorski, Sharon M, Mungall, Andrew J, Mungall, Karen L, Moore, Richard A, Marra, Marco A, Laskin, Janessa, Renouf, Daniel J, Schaeffer, David F, Jones, Steven J M
Pancreatic neuroendocrine neoplasms (PanNENs) represent a minority of pancreatic neoplasms that exhibit variability in prognosis. Ongoing mutational analyses of PanNENs have found recurrent abnormalities in chromatin remodeling genes (e.g., and ), and mTOR pathway genes (e.g., , , and ), some of which have relevance to patients with related familial syndromes. Most recently, grade 3 PanNENs have been divided into two groups based on differentiation, creating a new group of well-differentiated grade 3 neuroendocrine tumors (PanNETs) that have had a limited whole-genome level characterization to date. In a patient with a metastatic well-differentiated grade 3 PanNET, our study utilized whole-genome sequencing of liver metastases for the comparative analysis and detection of single-nucleotide variants, insertions and deletions, structural variants, and copy-number variants, with their biologic relevance confirmed by RNA sequencing. We found that this tumor most notably exhibited a -disrupting fusion, showed a novel fusion, and lacked any somatic variants in , , and .

Application of a Neural Network Whole Transcriptome-Based Pan-Cancer Method for Diagnosis of Primary and Metastatic Cancers.

JAMA network open, 2019
Grewal, Jasleen K, Tessier-Cloutier, Basile, Jones, Martin, Gakkhar, Sitanshu, Ma, Yussanne, Moore, Richard, Mungall, Andrew J, Zhao, Yongjun, Taylor, Michael D, Gelmon, Karen, Lim, Howard, Renouf, Daniel, Laskin, Janessa, Marra, Marco, Yip, Stephen, Jones, Steven J M
A molecular diagnostic method that incorporates information about the transcriptional status of all genes across multiple tissue types can strengthen confidence in cancer diagnosis.

Sources of erroneous sequences and artifact chimeric reads in next generation sequencing of genomic DNA from formalin-fixed paraffin-embedded samples.

Nucleic acids research, 2019
Haile, Simon, Corbett, Richard D, Bilobram, Steve, Bye, Morgan H, Kirk, Heather, Pandoh, Pawan, Trinh, Eva, MacLeod, Tina, McDonald, Helen, Bala, Miruna, Miller, Diane, Novik, Karen, Coope, Robin J, Moore, Richard A, Zhao, Yongjun, Mungall, Andrew J, Ma, Yussanne, Holt, Rob A, Jones, Steven J, Marra, Marco A
Tissues used in pathology laboratories are typically stored in the form of formalin-fixed, paraffin-embedded (FFPE) samples. One important consideration in repurposing FFPE material for next generation sequencing (NGS) analysis is the sequencing artifacts that can arise from the significant damage to nucleic acids due to treatment with formalin, storage at room temperature and extraction. One such class of artifacts consists of chimeric reads that appear to be derived from non-contiguous portions of the genome. Here, we show that a major proportion of such chimeric reads align to both the 'Watson' and 'Crick' strands of the reference genome. We refer to these as strand-split artifact reads (SSARs). This study provides a conceptual framework for the mechanistic basis of the genesis of SSARs and other chimeric artifacts along with supporting experimental evidence, which have led to approaches to reduce the levels of such artifacts. We demonstrate that one of these approaches, involving S1 nuclease-mediated removal of single-stranded fragments and overhangs, also reduces sequence bias, base error rates, and false positive detection of copy number and single nucleotide variants. Finally, we describe an analytical approach for quantifying SSARs from NGS data.

The Genome of the Beluga Whale (Delphinapterus leucas).

Genes, 2017
Jones, Steven J M, Taylor, Gregory A, Chan, Simon, Warren, René L, Hammond, S Austin, Bilobram, Steven, Mordecai, Gideon, Suttle, Curtis A, Miller, Kristina M, Schulze, Angela, Chan, Amy M, Jones, Samantha J, Tse, Kane, Li, Irene, Cheung, Dorothy, Mungall, Karen L, Choo, Caleb, Ally, Adrian, Dhalla, Noreen, Tam, Angela K Y, Troussard, Armelle, Kirk, Heather, Pandoh, Pawan, Paulino, Daniel, Coope, Robin J N, Mungall, Andrew J, Moore, Richard, Zhao, Yongjun, Birol, Inanc, Ma, Yussanne, Marra, Marco, Haulena, Martin
The beluga whale is a cetacean that inhabits arctic and subarctic regions, and is the only living member of the genus Delphinapterus. The genome of the beluga whale was determined using DNA sequencing approaches that employed both microfluidic partitioning library and non-partitioned library construction. The former allowed for the construction of a highly contiguous assembly with a scaffold N50 length of over 19 Mbp and total reconstruction of 2.32 Gbp. To aid our understanding of the functional elements, transcriptome data was also derived from brain, duodenum, heart, lung, spleen, and liver tissue. Assembled sequence and all of the underlying sequence data are available at the National Center for Biotechnology Information (NCBI) under the Bioproject accession number PRJNA360851A.

Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data.

Bioinformatics (Oxford, England), 2013
Birol, Inanc, Raymond, Anthony, Jackman, Shaun D, Pleasance, Stephen, Coope, Robin, Taylor, Greg A, Yuen, Macaire Man Saint, Keeling, Christopher I, Brand, Dana, Vandervalk, Benjamin P, Kirk, Heather, Pandoh, Pawan, Moore, Richard A, Zhao, Yongjun, Mungall, Andrew J, Jaquish, Barry, Yanchuk, Alvin, Ritland, Carol, Boyle, Brian, Bousquet, Jean, Ritland, Kermit, Mackay, John, Bohlmann, Jörg, Jones, Steven J M
White spruce (Picea glauca) is a dominant conifer of the boreal forests of North America, and providing genomics resources for this commercially valuable tree will help improve forest management and conservation efforts. Sequencing and assembling the large and highly repetitive spruce genome though pushes the boundaries of the current technology. Here, we describe a whole-genome shotgun sequencing strategy using two Illumina sequencing platforms and an assembly approach using the ABySS software. We report a 20.8 giga base pairs draft genome in 4.9 million scaffolds, with a scaffold N50 of 20,356 bp. We demonstrate how recent improvements in the sequencing technology, especially increasing read lengths and paired end reads from longer fragments have a major impact on the assembly contiguity. We also note that scalable bioinformatics tools are instrumental in providing rapid draft assemblies.

Circos: an information aesthetic for comparative genomics.

Genome research, 2009
Krzywinski, Martin, Schein, Jacqueline, Birol, Inanç, Connors, Joseph, Gascoyne, Randy, Horsman, Doug, Jones, Steven J, Marra, Marco A
We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.


Dr. Jianhong An

Staff Scientist

Dr. Mikhail (Misha) Bilenky

Staff Scientist

Dr. Rohan Abraham

Research Associate

Dr. Sreeja Leelakumari

Research Associate

Dr. Kieran O'Neill

Process Development Coordinator

Sharon Ruschkowski

Bioinformatics Coordinator

Samantha Jones

Research Associate

Postdoctoral Fellows

Dr. Katherine Dixon

Postdoctoral Fellow

Dr. Jasleen Grewal

Postdoctoral Fellow

Graduate Students


Vahid Akbari

Graduate Student

Sarah Dada

Graduate Student

Luka Culibrk

Graduate Student

Faeze Keshavarz

Graduate Student

Caralyn Reisle

Graduate Student

Jeremy Fan

Graduate Student

Glenn Chang

Graduate Student