Jones Lab | Genome Sciences Centre

Dr. Jones' team uses bioinformatics to investigate the landscape of mutations present in cancer genomes and the early genomic events that give rise to and promote the progression of cancer. To achieve these goals, his laboratory analyzes next-generation sequencing data and develops novel computational approaches and methodologies. A significant aim of Dr. Jones' research program is to find innovative ways to exploit specific genomic profiles within an individual cancer for therapeutic purposes. For example, his team has identified a number of epigenetic modifications that could be targeted to reverse the effects of cancer-initiating mutations. His lab is using computational approaches, such as molecular docking and molecular dynamics, to identify and refine compounds that can modify the cancer epigenome.

Location

The Jones Lab is located at Canada's Michael Smith Genome Sciences Centre, Echelon Technology Platform.

570 West 7th Avenue
Vancouver, British Columbia
V5Z 1B3

Projects

Canada BioGenome Project

The Canada BioGenome Project is Canada's contribution to the Earth BioGenome Project, a global effort to "sequence life for the future of life". Through the Canada BioGenome Project, the GSC joins a collaborative effort to better understand and conserve our natural heritage through open access reference genomes for 400 iconic Canadian species.

Learn more about Canada BioGenome Project

Cancer Bioinformatics and Genomics

Dr. Jones' team uses bioinformatics to investigate the landscape of mutations present in cancer genomes and the early genomic events that give rise to and promote the progression of cancer.

Learn more about Cancer Bioinformatics and Genomics

Selected Publications

Long-read sequencing for detection and subtyping of Prader-Willi and Angelman syndromes.

Journal of Medical Genetics, 2024

Akbari, Vahid, Dada, Sarah, Shen, Yaoqing, Dixon, Katherine, Hejla, Duha, Galbraith, Andrew, Choufani, Sanaa, Weksberg, Rosanna, Boerkoel, Cornelius F, Stewart, Laura, Gibson, William T, Jones, Steven J M

Prader-Willi syndrome (PWS) and Angelman syndrome (AS) are imprinting disorders caused by genetic or epigenetic aberrations of 15q11.2-q13. Their clinical testing is often multitiered; diagnostic testing begins with methylation-specific multiplex ligation-dependent probe amplification or methylation-sensitive PCR and then proceeds to molecular subtyping to determine the mechanism and recurrence risk. Currently, correct classification of a proband's PWS/AS subtype often requires parental samples, a costly process for families and health systems. The use of nanopore sequencing for molecular diagnosis of PWS and AS has been explored by Yamada et al.; however, to confirm heterodisomy, parental data were still required. Here, we investigate genome-wide nanopore sequencing in a larger cohort of PWS (18) and AS (6) as a singular test to detect the molecular subtype, without parental data. We accurately subtyped these cases including uniparental heterodisomy, mixed iso-/heterodisomy, type 1 and 2 deletions, microdeletion and indels. One PWS case with a previously unresolved diagnosis was subtyped as maternal isodisomy. This work highlights the application of long-read sequencing and other imprinted regions outside of the PWS/AS critical region to resolve the molecular diagnosis and subtyping of PWS and AS without parental data. The work also outlines an approach to generically detect heterodisomy through the interrogation of distant imprinted regions.

Long-read sequencing of an advanced cancer cohort resolves rearrangements, unravels haplotypes, and reveals methylation landscapes.

Cell Genomics, 2024

O'Neill, Kieran, Pleasance, Erin, Fan, Jeremy, Akbari, Vahid, Chang, Glenn, Dixon, Katherine, Csizmok, Veronika, MacLennan, Signe, Porter, Vanessa, Galbraith, Andrew, Grisdale, Cameron J, Culibrk, Luka, Dupuis, John H, Corbett, Richard, Hopkins, James, Bowlby, Reanne, Pandoh, Pawan, Smailus, Duane E, Cheng, Dean, Wong, Tina, Frey, Connor, Shen, Yaoqing, Lewis, Eleanor, Paulin, Luis F, Sedlazeck, Fritz J, Nelson, Jessica M T, Chuah, Eric, Mungall, Karen L, Moore, Richard A, Coope, Robin, Mungall, Andrew J, McConechy, Melissa K, Williamson, Laura M, Schrader, Kasmintan A, Yip, Stephen, Marra, Marco A, Laskin, Janessa, Jones, Steven J M

The Long-Read Personalized OncoGenomics (POG) dataset comprises a cohort of 189 patient tumors and 41 matched normal samples sequenced using the Oxford Nanopore Technologies PromethION platform. This dataset from the POG program and the Marathon of Hope Cancer Centres Network includes DNA and RNA short-read sequence data, analytics, and clinical information. We show the potential of long-read sequencing for resolving complex cancer-related structural variants, viral integrations, and extrachromosomal circular DNA. Long-range phasing facilitates the discovery of allelically differentially methylated regions (aDMRs) and allele-specific expression, including recurrent aDMRs in the cancer genes RET and CDKN2A. Germline promoter methylation in MLH1 can be directly observed in Lynch syndrome. Promoter methylation in BRCA1 and RAD51C is a likely driver behind homologous recombination deficiency where no coding driver mutation was found. This dataset demonstrates applications for long-read sequencing in precision medicine and is available as a resource for developing analytical approaches using this technology.

A second update on mapping the human genetic architecture of COVID-19.

Nature, 2023

Defining the heterogeneity of unbalanced structural variation underlying breast cancer susceptibility by nanopore genome sequencing.

European Journal of Human Genetics, 2023

Dixon, Katherine, Shen, Yaoqing, O'Neill, Kieran, Mungall, Karen L, Chan, Simon, Bilobram, Steve, Zhang, Wei, Bezeau, Marjorie, Sharma, Alshanee, Fok, Alexandra, Mungall, Andrew J, Moore, Richard, Bosdet, Ian, Thibodeau, My Linh, Sun, Sophie, Yip, Stephen, Schrader, Kasmintan A, Jones, Steven J M

Germline structural variants (SVs) are challenging to resolve by conventional genetic testing assays. Long-read sequencing has improved the global characterization of SVs, but its sensitivity at cancer susceptibility loci has not been reported. Nanopore long-read genome sequencing was performed for nineteen individuals with pathogenic copy number alterations in BRCA1, BRCA2, CHEK2 and PALB2 identified by prior clinical testing. Fourteen variants, which spanned single exons to whole genes and included a tandem duplication, were accurately represented. Defining the precise breakpoints of SVs in BRCA1 and CHEK2 revealed unforeseen allelic heterogeneity and informed the mechanisms underlying the formation of recurrent deletions. Integrating read-based and statistical phasing further helped define extended haplotypes associated with founder alleles. Long-read sequencing is a sensitive method for characterizing private, recurrent and founder SVs underlying breast cancer susceptibility. Our findings demonstrate the potential for nanopore sequencing as a powerful genetic testing assay in the hereditary cancer setting.

The genome sequence of the Loggerhead sea turtle, Caretta caretta Linnaeus 1758.

F1000Research, 2023

Chang, Glenn, Jones, Samantha, Leelakumari, Sreeja, Ashkani, Jahanshah, Culibrk, Luka, O'Neill, Kieran, Tse, Kane, Cheng, Dean, Chuah, Eric, McDonald, Helen, Kirk, Heather, Pandoh, Pawan, Pari, Sauro, Angelini, Valeria, Kyle, Christopher, Bertorelle, Giorgio, Zhao, Yongjun, Mungall, Andrew, Moore, Richard, Vilaça, Sibelle, Jones, Steven

We present a genome assembly of Caretta caretta (the Loggerhead sea turtle; Chordata, Testudines, Cheloniidae), generated from genomic data from two unrelated females. The genome sequence is 2.13 gigabases in size. The assembly has a BUSCO completion score of 96.1% and N50 of 130.95 Mb. The majority of the assembly is scaffolded into 28 chromosomal representations with a remaining 2% of the assembly being excluded from these.

Parent-of-origin detection and chromosome-scale haplotyping using long-read DNA methylation sequencing and Strand-seq.

Cell Genomics, 2023

Akbari, Vahid, Hanlon, Vincent C T, O'Neill, Kieran, Lefebvre, Louis, Schrader, Kasmintan A, Lansdorp, Peter M, Jones, Steven J M

Hundreds of loci in human genomes have alleles that are methylated differentially according to their parent of origin. These imprinted loci generally show little variation across tissues, individuals, and populations. We show that such loci can be used to distinguish the maternal and paternal homologs for all human autosomes without the need for the parental DNA. We integrate methylation-detecting nanopore sequencing with the long-range phase information in Strand-seq data to determine the parent of origin of chromosome-length haplotypes for both DNA sequence and DNA methylation in five trios with diverse genetic backgrounds. The parent of origin was correctly inferred for all autosomes with an average mismatch error rate of 0.31% for SNVs and 1.89% for insertions or deletions (indels). Because our method can determine whether an inherited disease allele originated from the mother or the father, we predict that it will improve the diagnosis and management of many genetic diseases.

Genome-wide detection of imprinted differentially methylated regions using nanopore sequencing

eLife

Vahid Akbari, Jean-Michel Garant, Kieran O'Neill, Pawan Pandoh, Richard Moore, Marco A Marra, Martin Hirst, Steven JM Jones.

Imprinting is a critical part of normal embryonic development in mammals, controlled by defined parent-of-origin (PofO) differentially methylated regions (DMRs) known as imprinting control regions. Direct nanopore sequencing of DNA provides a means to detect allelic methylation and to overcome the drawbacks of methylation array and short-read technologies. Here, we used publicly available nanopore sequencing data for 12 standard B-lymphocyte cell lines to acquire the genome-wide mapping of imprinted intervals in humans. Using the sequencing data, we were able to phase 95% of the human methylome and detect 94% of the previously well-characterized, imprinted DMRs. In addition, we found 42 novel imprinted DMRs (16 germline and 26 somatic), which were confirmed using whole-genome bisulfite sequencing (WGBS) data. Analysis of WGBS data in mouse (Mus musculus), rhesus monkey (Macaca mulatta), and chimpanzee (Pan troglodytes) suggested that 17 of these imprinted DMRs are conserved. Some of the novel imprinted intervals are within or close to imprinted genes without a known DMR. We also detected subtle parental methylation bias, spanning several kilobases at seven known imprinted clusters. At these blocks, hypermethylation occurs at the gene body of expressed allele(s) with mutually exclusive H3K36me3 and H3K27me3 allelic histone marks. These results expand upon our current knowledge of imprinting and the potential of nanopore sequencing to identify imprinting regions using only parent-offspring trios, as opposed to the large multi-generational pedigrees that have previously been required.

A community approach to the cancer-variant-interpretation bottleneck

Nature Cancer

Krysiak K, Danos AM, Kiwala S, McMichael JF, Coffman AC, Barnell EK, Sheta L, Saliba J, Grisdale CJ, Kujan L, Pema S, Lever J, Spies NC, Chiorean A, Rieke DT, Clark KA, Jani P, Takahashi H, Horak P, Ritter DI, Zhou X, Ainscough BJ, Delong S, Lamping M, Marr AR, Li BV, Lin WH, Terraf P, Salama Y, Campbell KM, Farncombe KM, Ji J, Zhao X, Xu X, Kanagal-Shamanna R, Cotto KC, Skidmore ZL, Walker JR, Zhang J, Milosavljevic A, Patel RY, Giles RH, Kim RH, Schriml LM, Mardis ER, Jones SJM, Raca G, Rao S, Madhavan S, Wagner AH, Griffith OL, Griffith M.

As guidelines, therapies and literature on cancer variants expand, the lack of consensus variant interpretations impedes clinical applications. CIViC is a public-domain, crowd-sourced and adaptable knowledgebase of evidence for the clinical interpretation of variants in cancer, designed to reduce barriers to knowledge sharing and alleviate the variant-interpretation bottleneck.

Whole-genome and transcriptome analysis of advanced adrenocortical cancer highlights multiple alterations affecting epigenome and DNA repair pathways

Cold Spring Harbor Molecular Case Studies

Jean-Michel Lavoie, Veronika Csizmok, Laura M Williamson, Luka Culibrk, Gang Wang, Marco A Marra, Janessa Laskin, Steven JM Jones, Daniel J Renouf, Christian K Kollmannsberger.

Adrenocortical cancer (ACC) is a rare cancer of the adrenal gland. Several driver mutations have been identified in both primary and metastatic ACCs, but the therapeutic options are still limited. We performed whole-genome and transcriptome sequencing on seven patients with metastatic ACC. Integrative analysis of mutations, RNA expression changes, mutation signature, and homologous recombination deficiency (HRD) analysis was performed. Mutations affecting CTNNB1 and TP53 and frequent loss of heterozygosity (LOH) events were observed in our cohort. Alterations affecting genes involved in cell cycle (RB1, CDKN2A, CDKN2B), DNA repair pathways (MUTYH, BRCA2, ATM, RAD52, MLH1, MSH6), and telomere maintenance (TERF2 and TERT) consisting of somatic and germline mutations, structural variants, and expression outliers were also observed. HRDetect, which aggregates six HRD-associated mutation signatures, identified a subset of cases as HRD. Genomic alterations affecting genes involved in epigenetic regulation were also identified, including structural variants (SWI/SNF genes and histone methyltransferases), and copy gains and concurrent high expression of KDM5A, which may contribute to epigenomic deregulation. Findings from this study highlight HRD and epigenomic pathways as potential therapeutic targets and suggest a subgroup of patients may benefit from a diverse array of molecularly targeted therapies in ACC, a rare disease in urgent need of therapeutic strategies.

cSurvival: a web resource for biomarker interactions in cancer outcomes and in cell lines

Briefings in Bioinformatics

Xuanjin Cheng, Yongxing Liu, Jiahe Wang, Yujie Chen, Andrew Gordon Robertson, Xuekui Zhang, Steven JM Jones, Stefan Taubert

Survival analysis is a technique for identifying prognostic biomarkers and genetic vulnerabilities in cancer studies. Large-scale consortium-based projects have profiled >11 000 adult and >4000 pediatric tumor cases with clinical outcomes and multiomics approaches. This provides a resource for investigating molecular-level cancer etiologies using clinical correlations. Although cancers often arise from multiple genetic vulnerabilities and have deregulated gene sets (GSs), existing survival analysis protocols can report only on individual genes. Additionally, there is no systematic method to connect clinical outcomes with experimental (cell line) data. To address these gaps, we developed cSurvival (https://tau.cmmt.ubc.ca/cSurvival). cSurvival provides a user-adjustable analytical pipeline with a curated, integrated database and offers three main advances: (i) joint analysis with two genomic predictors to identify interacting biomarkers, including new algorithms to identify optimal cutoffs for two continuous predictors; (ii) survival analysis not only at the gene, but also the GS level; and (iii) integration of clinical and experimental cell line studies to generate synergistic biological insights. To demonstrate these advances, we report three case studies. We confirmed findings of autophagy-dependent survival in colorectal cancers and of synergistic negative effects between high expression of SLC7A11 and SLC2A1 on outcomes in several cancers. We further used cSurvival to identify high expression of the Nrf2-antioxidant response element pathway as a main indicator for lung cancer prognosis and for cellular resistance to oxidative stress-inducing drugs. Altogether, these analyses demonstrate cSurvival's ability to support biomarker prognosis and interaction analysis via gene- and GS-level approaches and to integrate clinical and experimental biomedical studies.

The impact of whole genome and transcriptome analysis (WGTA) on predictive biomarker discovery and diagnostic accuracy of advanced malignancies

The Journal of Pathology Clinical Research

Basile Tessier-Cloutier, Jasleen K Grewal, Martin R Jones, Erin Pleasance, Yaoqing Shen, Ellen Cai, Chris Dunham, Lynn Hoang, Basil Horst, David G Huntsman, Diana Ionescu, Anthony N Karnezis, Anna F Lee, Cheng Han Lee, Tae Hoon Lee, David Dw Twa, Andrew J Mungall, Karen Mungall, Julia R Naso, Tony Ng, David F Schaeffer, Brandon S Sheffield, Brian Skinnider, Tyler Smith, Laura Williamson, Ellia Zhong, Dean A Regier, Janessa Laskin, Marco A Marra, C Blake Gilks, Steven JM Jones, Stephen Yip

In this study, we evaluate the impact of whole genome and transcriptome analysis (WGTA) on predictive molecular profiling and histologic diagnosis in a cohort of advanced malignancies. WGTA was used to generate reports including molecular alterations and site/tissue of origin prediction. Two reviewers analyzed genomic reports, clinical history, and tumor pathology. We used National Comprehensive Cancer Network (NCCN) consensus guidelines, Food and Drug Administration (FDA) approvals, and provincially reimbursed treatments to define genomic biomarkers associated with approved targeted therapeutic options (TTOs). Tumor tissue/site of origin was reassessed for most cases using genomic analysis, including a machine learning algorithm (Supervised Cancer Origin Prediction Using Expression [SCOPE]) trained on The Cancer Genome Atlas data. WGTA was performed on 652 cases, including a range of primary tumor types/tumor sites and 15 malignant tumors of uncertain histogenesis (MTUH). At the time WGTA was performed, alterations associated with an approved TTO were identified in 39 (6%) cases; 3 of these were not identified through routine pathology workup. In seven (1%) cases, the pathology workup either failed, was not performed, or gave a different result from the WGTA. Approved TTOs identified by WGTA increased to 103 (16%) when applying 2021 guidelines. The histopathologic diagnosis was reviewed in 389 cases and agreed with the diagnostic consensus after WGTA in 94% of non-MTUH cases (n = 374). The remainder included situations where the morphologic diagnosis was changed based on WGTA and clinical data (0.5%), or where the WGTA was non-contributory (5%). The 15 MTUH were all diagnosed as specific tumor types by WGTA. Tumor board reviews including WGTA agreed with almost all initial predictive molecular profile and histopathologic diagnoses. WGTA was a powerful tool to assign site/tissue of origin in MTUH. 
Current efforts focus on improving therapeutic predictive power and decreasing cost to enhance use of WGTA data as a routine clinical test.

Complex Autism Spectrum Disorder with Epilepsy, Strabismus and Self-Injurious Behaviors in a Patient with a De Novo Heterozygous POLR2A Variant

Genes

Daniel R Evans, Ying Qiao, Brett Trost, Kristina Calli, Sally Martell, Steven JM Jones, Stephen W Scherer, ME Suzanne Lewis

Autism spectrum disorder (ASD) describes a complex and heterogeneous group of neurodevelopmental disorders. Whole genome sequencing continues to shed light on the multifactorial etiology of ASD. Dysregulated transcriptional pathways have been implicated in neurodevelopmental disorders. Emerging evidence suggests that de novo POLR2A variants cause a newly described phenotype called 'Neurodevelopmental Disorder with Hypotonia and Variable Intellectual and Behavioral Abnormalities' (NEDHIB). The variable phenotype manifests with a spectrum of features, primarily early onset hypotonia and delay in developmental milestones. In this study, we investigate a patient with complex ASD involving epilepsy and strabismus. Whole genome sequencing of the proband-parent trio uncovered a novel de novo POLR2A variant (c.1367T>C, p.Val456Ala) in the proband. The variant appears deleterious according to in silico tools. We describe the phenotype in our patient, who is now 31 years old, draw connections between the previously reported phenotypes and further delineate this emerging neurodevelopmental phenotype. This study offers new insights into this neurodevelopmental disorder and, more broadly, the genetic etiology of ASD.

Long-read genome sequencing resolves a complex 13q structural variant associated with syndromic anophthalmia

American Journal of Medical Genetics Part A

Pierre K Boerkoel, Katherine Dixon, Carrie Fitzsimons, Yaoqing Shen, Stephanie Huynh, Kamilla Schlade-Bartusiak, Luka Culibrk, Simon Chan, Cornelius F Boerkoel, Steven JM Jones, Hui-Lin Chin

Microphthalmia, anophthalmia, and coloboma (MAC) are a heterogeneous spectrum of anomalous eye development and degeneration with genetic and environmental etiologies. Structural and copy number variants of chromosome 13 have been implicated in MAC; however, the specific loci involved in disease pathogenesis have not been well-defined. Herein we report a newborn with syndromic degenerative anophthalmia and a complex de novo rearrangement of chromosome 13q. Long-read genome sequencing improved the resolution and clinical interpretation of a duplication-triplication/inversion-duplication (DUP-TRP/INV-DUP) and terminal deletion. Sequence features at the breakpoint junctions suggested microhomology-mediated break-induced replication (MMBIR) of the maternal chromosome as the origin. Comparing this rearrangement to previously reported copy number alterations in 13q, we refine a putative dosage-sensitive critical region for MAC that might provide new insights into its molecular etiology.

The Earth BioGenome Project 2020: Starting the clock

Proceedings of the National Academy of Sciences of the United States of America

Harris A Lewin, et al. (including Steven JM Jones)

The pink salmon genome: Uncovering the genomic consequences of a two-year life cycle

PLoS One

Kris A Christensen, Eric B Rondeau, Dionne Sakhrani, Carlo A Biagi, Hollie Johnson, Jay Joshi, Anne-Marie Flores, Sreeja Leelakumari, Richard Moore, Pawan K Pandoh, Ruth E Withler, Terry D Beacham, Rosalind A Leggatt, Carolyn M Tarpey, Lisa W Seeb, James E Seeb, Steven JM Jones, Robert H Devlin, Ben F Koop

Pink salmon (Oncorhynchus gorbuscha) adults are the smallest of the five Pacific salmon native to the western Pacific Ocean. Pink salmon are also the most abundant of these species and account for a large proportion of the commercial value of the salmon fishery worldwide. A two-year life history of pink salmon generates temporally isolated populations that spawn either in even-years or odd-years. To uncover the influence of this genetic isolation, reference genome assemblies were generated for each year-class and whole genome re-sequencing data was collected from salmon of both year-classes. The salmon were sampled from six Canadian rivers and one Japanese river. At multiple centromeres we identified peaks of Fst between year-classes that were millions of base-pairs long. The largest Fst peak was also associated with a million base-pair chromosomal polymorphism found in the odd-year genome near a centromere. These Fst peaks may be the result of a centromere drive or a combination of reduced recombination and genetic drift, and they could influence speciation. Other regions of the genome influenced by odd-year and even-year temporal isolation and tentatively under selection were mostly associated with genes related to immune function, organ development/maintenance, and behaviour.

Early-stage economic analysis of research-based comprehensive genomic sequencing for advanced cancer care

Journal of Community Genetics

Deirdre Weymann, Janessa Laskin, Steven JM Jones, Robyn Roscoe, Howard J Lim, Daniel J Renouf, Kasmintan A Schrader, Sophie Sun, Stephen Yip, Marco A Marra, Dean A Regier

Genomic research is driving discovery for future population benefit. Limited evidence exists on immediate patient and health system impacts of research participation. This study uses real-world data and quasi-experimental matching to examine early-stage cost and health impacts of research-based genomic sequencing. British Columbia’s Personalized OncoGenomics (POG) single-arm program applies whole genome and transcriptome analysis (WGTA) to characterize genomic landscapes in advanced cancers. Our cohort includes POG patients enrolled between 2014 and 2015 and 1:1 genetic algorithm–matched usual care controls. We undertake a cost consequence analysis and estimate 1-year effects of WGTA on patient management, patient survival, and health system costs reported in 2015 Canadian dollars. WGTA costs are imputed and forecast using system of equations modeling. We use Kaplan-Meier survival analysis to explore survival differences and inverse probability of censoring weighted linear regression to estimate mean 1-year survival times and costs. Non-parametric bootstrapping simulates sampling distributions and enables scenario analysis, revealing drivers of incremental costs, survival, and net monetary benefit for assumed willingness to pay thresholds. We identified 230 POG patients and 230 matched controls for cohort inclusion. The mean period cost of research-funded WGTA was $26,211 (SD: $14,191). Sequencing costs declined rapidly, with WGTA forecasts hitting $13,741 in 2021. The incremental healthcare system effect (non-research expenditures) was $5203 (95% CI: 75, 10,424) compared to usual care. No overall survival differences were observed, but outcome heterogeneity was present. POG patients receiving WGTA-informed treatment experienced incremental survival gains of 2.49 months (95% CI: 1.32, 3.64). Future cost consequences became favorable as WGTA cost drivers declined and WGTA-informed treatment rates improved to 60%.
Our study demonstrates the ability of real-world data to support evaluations of only-in-research health technologies. We identify situations where precision oncology research initiatives may produce survival benefit at a cost that is within healthcare systems’ willingness to pay. This economic evidence informs the early-stage healthcare impacts of precision oncology research.

An infant with congenital respiratory insufficiency and diaphragmatic paralysis: A novel BICD2 phenotype?

American Journal of Medical Genetics Part A

Hui-Lin Chin, Stephanie Huynh, Jahanshah Ashkani, Michael Castaldo, Katherine Dixon, Kathryn Selby, Yaoqing Shen, Marie Wright, Cornelius F Boerkoel, Glenda Hendson, Steven JM Jones

Monoallelic pathogenic variants in BICD2 are associated with autosomal dominant Spinal Muscular Atrophy Lower Extremity Predominant 2A and 2B (SMALED2A, SMALED2B). As part of the cellular vesicular transport complex, BICD2 facilitates the flow of constitutive secretory cargoes from the trans-Golgi network, and its dysfunction results in motor neuron loss. The reported phenotypes among patients with SMALED2A and SMALED2B range from a congenital onset disorder of respiratory insufficiency, arthrogryposis, and proximal or distal limb weakness to an adult-onset disorder of limb weakness and contractures. We report an infant with congenital respiratory insufficiency requiring mechanical ventilation, congenital diaphragmatic paralysis, decreased lung volume, and single finger camptodactyly. The infant displayed appropriate antigravity limb movements but had radiological, electrophysiological, and histopathological evidence of myopathy. Exome sequencing and long-read whole-genome sequencing detected a novel de novo BICD2 variant (NM_001003800.1:c.[1543G>A];[=]). This is predicted to encode p.(Glu515Lys); p.Glu515 is located in the coiled-coil 2 mutation hotspot. We hypothesize that this novel phenotype of diaphragmatic paralysis without clear appendicular muscle weakness and contractures of large joints is a presentation of BICD2-related disease.

GA4GH: International policies and standards for data sharing across genomic research and healthcare

Cell Genomics

Heidi L Rehm, et al. (including Steven JM Jones).

The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits.

Draft genome sequence of the lichenized fungus Bacidia gigantensis

Microbiology Resource Announcements

Jessica L Allen, Steven JM Jones, R Troy McMullin

The draft genome sequence of Bacidia gigantensis, a lichenized fungus in the order Lecanorales, was sequenced directly from a herbarium specimen collected from the type locality at Sleeping Giant Provincial Park in Ontario, Canada. Using long-read sequencing on the Oxford Nanopore PromethION platform, we assembled a nearly complete genome sequence.

Optimization of magnetic bead-based nucleic acid extraction for SARS-CoV-2 testing using readily available reagents

Journal of Virological Methods

Simon Haile, Aidan M Nikiforuk, Pawan K Pandoh, David D W Twa, Duane E Smailus, Jason Nguyen, Stephen Pleasance, Angus Wong, Yongjun Zhao, Diane Eisler, Michelle Moksa, Qi Cao, Marcus Wong, Edmund Su, Martin Krzywinski, Jessica Nelson, Andrew J Mungall, Frankie Tsang, Leah M Prentice, Agatha Jassem, Amee R Manges, Steven J M Jones, Robin J Coope, Natalie Prystajecky, Marco A Marra, Mel Krajden, Martin Hirst

The COVID-19 pandemic has highlighted the need for generic reagents and flexible systems in diagnostic testing. Magnetic bead-based nucleic acid extraction protocols using 96-well plates on open liquid handlers are readily amenable to meet this need. Here, one such approach is rigorously optimized to minimize cross-well contamination while maintaining sensitivity.

Contribution of Multiple Inherited Variants to Autism Spectrum Disorder (ASD) in a Family with 3 Affected Siblings

Genes

Jasleen Dhaliwal, Ying Qiao, Kristina Calli, Sally Martell, Simone Race, Chieko Chijiwa, Armansa Glodjo, Steven Jones, Evica Rajcan-Separovic, Stephen W Scherer, Suzanne Lewis

Autism Spectrum Disorder (ASD) is the most common neurodevelopmental disorder in children and shows high heritability. However, how inherited variants contribute to ASD in multiplex families remains unclear. Using whole-genome sequencing (WGS) in a family with three affected children, we identified multiple inherited DNA variants in ASD-associated genes and pathways (RELN, SHANK2, DLG1, SCN10A, KMT2C and ASH1L). All are shared among the three children, except ASH1L, which is only present in the most severely affected child. The compound heterozygous variants in RELN, and the maternally inherited variant in SHANK2, are considered to be major risk factors for ASD in this family. Both genes are involved in neuron activities, including synaptic functions and the GABAergic neurotransmission system, which are highly associated with ASD pathogenesis. DLG1 is also involved in synapse functions, and KMT2C and ASH1L are involved in chromatin organization. Our data suggest that multiple inherited rare variants, each with a subthreshold and/or variable effect, may converge to certain pathways and contribute quantitatively and additively, or alternatively act via a 2nd-hit or multiple-hits to render pathogenicity of ASD in this family. Additionally, this multiple-hits model further supports the quantitative trait hypothesis of a complex genetic, multifactorial etiology for the development of ASDs.

An approach to rapid characterization of DMD copy number variants for prenatal risk assessment

American Journal of Medical Genetics, 2021

Hui-Lin Chin, Kieran O'Neill, Kristal Louie, Lindsay Brown, Kamilla Schlade-Bartusiak, Patrice Eydoux, Rosemarie Rupps, Ali Farahani, Cornelius F Boerkoel, Steven J M Jones

Prenatal detection of structural variants of uncertain significance, including copy number variants (CNV), challenges genetic counseling, and creates ambiguity for expectant parents. In Duchenne muscular dystrophy, variant classification and phenotypic severity of CNVs are currently assessed by familial segregation, prediction of the effect on the reading frame, and precedent data. Delineation of pathogenicity by familial segregation is limited by time and suitable family members, whereas analytical tools can rapidly delineate potential consequences of variants. We identified a duplication of uncertain significance encompassing a portion of the dystrophin gene (DMD) in an unaffected mother and her male fetus. Using long-read whole genome sequencing and alignment of short reads, we rapidly defined the precise breakpoints of this variant in DMD and could provide timely counseling. The benign nature of the variant was substantiated, more slowly, by familial segregation to a healthy maternal uncle. We find long-read whole genome sequencing of clinical utility in a prenatal setting for accurate and rapid characterization of structural variants, specifically a duplication involving DMD.

Rare loss-of-function variants in type I IFN immunity genes are not associated with severe COVID-19

The Journal of Clinical Investigation

Gundula Povysil, Guillaume Butler-Laporte, Ning Shang, Chen Wang, Atlas Khan, Manal Alaamery, Tomoko Nakanishi, Sirui Zhou, Vincenzo Forgetta, Robert JM Eveleigh, Mathieu Bourgey, Naveed Aziz, Steven JM Jones, Bartha Knoppers, Stephen W Scherer, Lisa J Strug, Pierre Lepage, Jiannis Ragoussis, Guillaume Bourque, Jahad Alghamdi, Nora Aljawini, Nour Albes, Hani M Al-Afghani, Bader Alghamdi, Mansour S Almutairi, Ebrahim Sabri Mahmoud, Leen Abu-Safieh, Hadeel El Bardisy, Fawz S Al Harthi, Abdulraheem Alshareef, Bandar Ali Suliman, Saleh A Alqahtani, Abdulaziz Almalik, May M Alrashed, Salam Massadeh, Vincent Mooser, Mark Lathrop, Mohamed Fawzy, Yaseen M Arabi, Hamdi Mbarek, Chadi Saad, Wadha Al-Muftah, Junghyun Jung, Serghei Mangul, Radja Badji, Asma Al Thani, Said I Ismail, Ali G Gharavi, Malak S Abedalthagafi, J Brent Richards, David B Goldstein, Krzysztof Kiryluk

A recent report found that rare predicted loss-of-function (pLOF) variants across 13 candidate genes in TLR3- and IRF7-dependent type I IFN pathways explain up to 3.5% of severe COVID-19 cases. We performed whole-exome or whole-genome sequencing of 1,864 COVID-19 cases (713 with severe and 1,151 with mild disease) and 15,033 ancestry-matched population controls across 4 independent COVID-19 biobanks. We tested whether rare pLOF variants in these 13 genes were associated with severe COVID-19. We identified only 1 rare pLOF mutation across these genes among 713 cases with severe COVID-19 and observed no enrichment of pLOFs in severe cases compared to population controls or mild COVID-19 cases. We found no evidence of association of rare LOF variants in the 13 candidate genes with severe COVID-19 outcomes.

In vitro modeling of glioblastoma initiation using PDGF-AA and p53-null neural progenitors

Neuro-Oncology

Alexandra K Bohm, Jessica DePetro, Carmen E Binding, Amanda Gerber, Nicholas Chahley, N Dan Berger, Mathaeus Ware, Kaitlin Thomas, U Senapathi, Shazreh Bukhari, Cindy Chen, Erin Chahley, Cameron Grisdale, Sam Lawn, Yaping Yu, Raymond Wong, Yaoqing Shen, Hiba Omairi, Reza Mirzaei, Nourah Alshatti, Haley Pedersen, Wee Yong, Samuel Weiss, Jennifer Chan, P J Cimino, John Kelly, Steve Jones, Eric Holland, Michael Blough, Gregory Cairncross

Background

Imagining ways to prevent or treat glioblastoma (GBM) has been hindered by a lack of understanding of its pathogenesis. Although overexpression of platelet derived growth factor with two A-chains (PDGF-AA) may be an early event, critical details of the core biology of GBM are lacking. For example, existing PDGF-driven models replicate its microscopic appearance, but not its genomic architecture. Here we report a model that overcomes this barrier to authenticity.

Methods

Using a method developed to establish neural stem cell cultures, we investigated the effects of PDGF-AA on subventricular zone (SVZ) cells, one of the putative cells of origin of GBM. We microdissected SVZ tissue from p53-null and wild-type adult mice, cultured cells in media supplemented with PDGF-AA, and assessed cell viability, proliferation, genome stability, and tumorigenicity.

Results

Counterintuitive to its canonical role as a growth factor, we observed abrupt and massive cell death in PDGF-AA: wild-type cells did not survive, whereas a small fraction of null cells evaded apoptosis. Surviving null cells displayed attenuated proliferation accompanied by whole chromosome gains and losses. After approximately 100 days in PDGF-AA, cells suddenly proliferated rapidly, acquired growth factor independence, and became tumorigenic in immune-competent mice. Transformed cells had an oligodendrocyte precursor-like lineage marker profile, were resistant to platelet derived growth factor receptor alpha inhibition, and harbored highly abnormal karyotypes similar to human GBM.

Conclusion

This model associates genome instability in neural progenitor cells with chronic exposure to PDGF-AA and is the first to approximate the genomic landscape of human GBM and the first in which the earliest phases of the disease can be studied directly.

Comprehensive genomic profiling of glioblastoma tumors, BTICs, and xenografts reveals stability and adaptation to growth environments.

Proceedings of the National Academy of Sciences of the United States of America, 2019

Shen, Yaoqing, Grisdale, Cameron J, Islam, Sumaiya A, Bose, Pinaki, Lever, Jake, Zhao, Eric Y, Grinshtein, Natalie, Ma, Yussanne, Mungall, Andrew J, Moore, Richard A, Lun, Xueqing, Senger, Donna L, Robbins, Stephen M, Wang, Alice Yijun, MacIsaac, Julia L, Kobor, Michael S, Luchman, H Artee, Weiss, Samuel, Chan, Jennifer A, Blough, Michael D, Kaplan, David R, Cairncross, J Gregory, Marra, Marco A, Jones, Steven J M

Glioblastoma multiforme (GBM) is the most deadly brain tumor, and currently lacks effective treatment options. Brain tumor-initiating cells (BTICs) and orthotopic xenografts are widely used in investigating GBM biology and new therapies for this aggressive disease. However, the genomic characteristics and molecular resemblance of these models to GBM tumors remain undetermined. We used massively parallel sequencing technology to decode the genomes and transcriptomes of BTICs and xenografts and their matched tumors in order to delineate the potential impacts of the distinct growth environments. Using data generated from whole-genome sequencing of 201 samples and RNA sequencing of 118 samples, we show that BTICs and xenografts resemble their parental tumor at the genomic level but differ at the mRNA expression and epigenomic levels, likely due to the different growth environment for each sample type. These findings suggest that a comprehensive genomic understanding of in vitro and in vivo GBM model systems is crucial for interpreting data from drug screens, and can help control for biases introduced by cell-culture conditions and the microenvironment in mouse models. We also found that lack of expression in pretreated GBM is linked to hypermutation, which in turn contributes to increased genomic heterogeneity and requires new strategies for GBM treatment.

Gene Fusions Are Recurrent, Clinically Actionable Gene Rearrangements in Wild-Type Pancreatic Ductal Adenocarcinoma.

Clinical cancer research : an official journal of the American Association for Cancer Research, 2019

Jones, Martin R, Williamson, Laura M, Topham, James T, Lee, Michael K C, Goytain, Angela, Ho, Julie, Denroche, Robert E, Jang, GunHo, Pleasance, Erin, Shen, Yaoqing, Karasinska, Joanna M, McGhie, John P, Gill, Sharlene, Lim, Howard J, Moore, Malcolm J, Wong, Hui-Li, Ng, Tony, Yip, Stephen, Zhang, Wei, Sadeghi, Sara, Reisle, Carolyn, Mungall, Andrew J, Mungall, Karen L, Moore, Richard A, Ma, Yussanne, Knox, Jennifer J, Gallinger, Steven, Laskin, Janessa, Marra, Marco A, Schaeffer, David F, Jones, Steven J M, Renouf, Daniel J

Gene fusions involving neuregulin 1 (NRG1) have been noted in multiple cancer types and have potential therapeutic implications. Although varying results have been reported in other cancer types, the efficacy of the HER-family kinase inhibitor afatinib in the treatment of NRG1 fusion-positive pancreatic ductal adenocarcinoma is not fully understood.

Genomic characterization of a well-differentiated grade 3 pancreatic neuroendocrine tumor.

Cold Spring Harbor molecular case studies, 2019

Williamson, Laura M, Steel, Michael, Grewal, Jasleen K, Thibodeau, My Lihn, Zhao, Eric Y, Loree, Jonathan M, Yang, Kevin C, Gorski, Sharon M, Mungall, Andrew J, Mungall, Karen L, Moore, Richard A, Marra, Marco A, Laskin, Janessa, Renouf, Daniel J, Schaeffer, David F, Jones, Steven J M

Pancreatic neuroendocrine neoplasms (PanNENs) represent a minority of pancreatic neoplasms that exhibit variability in prognosis. Ongoing mutational analyses of PanNENs have found recurrent abnormalities in chromatin remodeling genes (e.g., and ), and mTOR pathway genes (e.g., , , and ), some of which have relevance to patients with related familial syndromes. Most recently, grade 3 PanNENs have been divided into two groups based on differentiation, creating a new group of well-differentiated grade 3 neuroendocrine tumors (PanNETs) that have had a limited whole-genome level characterization to date. In a patient with a metastatic well-differentiated grade 3 PanNET, our study utilized whole-genome sequencing of liver metastases for the comparative analysis and detection of single-nucleotide variants, insertions and deletions, structural variants, and copy-number variants, with their biologic relevance confirmed by RNA sequencing. We found that this tumor most notably exhibited a -disrupting fusion, showed a novel fusion, and lacked any somatic variants in , , and .

Application of a Neural Network Whole Transcriptome-Based Pan-Cancer Method for Diagnosis of Primary and Metastatic Cancers.

JAMA network open, 2019

Grewal, Jasleen K, Tessier-Cloutier, Basile, Jones, Martin, Gakkhar, Sitanshu, Ma, Yussanne, Moore, Richard, Mungall, Andrew J, Zhao, Yongjun, Taylor, Michael D, Gelmon, Karen, Lim, Howard, Renouf, Daniel, Laskin, Janessa, Marra, Marco, Yip, Stephen, Jones, Steven J M

A molecular diagnostic method that incorporates information about the transcriptional status of all genes across multiple tissue types can strengthen confidence in cancer diagnosis.

Sources of erroneous sequences and artifact chimeric reads in next generation sequencing of genomic DNA from formalin-fixed paraffin-embedded samples.

Nucleic acids research, 2019

Haile, Simon, Corbett, Richard D, Bilobram, Steve, Bye, Morgan H, Kirk, Heather, Pandoh, Pawan, Trinh, Eva, MacLeod, Tina, McDonald, Helen, Bala, Miruna, Miller, Diane, Novik, Karen, Coope, Robin J, Moore, Richard A, Zhao, Yongjun, Mungall, Andrew J, Ma, Yussanne, Holt, Rob A, Jones, Steven J, Marra, Marco A

Tissues used in pathology laboratories are typically stored in the form of formalin-fixed, paraffin-embedded (FFPE) samples. One important consideration in repurposing FFPE material for next generation sequencing (NGS) analysis is the sequencing artifacts that can arise from the significant damage to nucleic acids due to treatment with formalin, storage at room temperature and extraction. One such class of artifacts consists of chimeric reads that appear to be derived from non-contiguous portions of the genome. Here, we show that a major proportion of such chimeric reads align to both the 'Watson' and 'Crick' strands of the reference genome. We refer to these as strand-split artifact reads (SSARs). This study provides a conceptual framework for the mechanistic basis of the genesis of SSARs and other chimeric artifacts along with supporting experimental evidence, which have led to approaches to reduce the levels of such artifacts. We demonstrate that one of these approaches, involving S1 nuclease-mediated removal of single-stranded fragments and overhangs, also reduces sequence bias, base error rates, and false positive detection of copy number and single nucleotide variants. Finally, we describe an analytical approach for quantifying SSARs from NGS data.

The Genome of the Beluga Whale (Delphinapterus leucas).

Genes, 2017

Jones, Steven J M, Taylor, Gregory A, Chan, Simon, Warren, René L, Hammond, S Austin, Bilobram, Steven, Mordecai, Gideon, Suttle, Curtis A, Miller, Kristina M, Schulze, Angela, Chan, Amy M, Jones, Samantha J, Tse, Kane, Li, Irene, Cheung, Dorothy, Mungall, Karen L, Choo, Caleb, Ally, Adrian, Dhalla, Noreen, Tam, Angela K Y, Troussard, Armelle, Kirk, Heather, Pandoh, Pawan, Paulino, Daniel, Coope, Robin J N, Mungall, Andrew J, Moore, Richard, Zhao, Yongjun, Birol, Inanc, Ma, Yussanne, Marra, Marco, Haulena, Martin

The beluga whale is a cetacean that inhabits arctic and subarctic regions, and is the only living member of the genus Delphinapterus. The genome of the beluga whale was determined using DNA sequencing approaches that employed both microfluidic partitioning library and non-partitioned library construction. The former allowed for the construction of a highly contiguous assembly with a scaffold N50 length of over 19 Mbp and total reconstruction of 2.32 Gbp. To aid our understanding of the functional elements, transcriptome data were also derived from brain, duodenum, heart, lung, spleen, and liver tissue. Assembled sequence and all of the underlying sequence data are available at the National Center for Biotechnology Information (NCBI) under the Bioproject accession number PRJNA360851A.

Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data.

Bioinformatics (Oxford, England), 2013

Birol, Inanc, Raymond, Anthony, Jackman, Shaun D, Pleasance, Stephen, Coope, Robin, Taylor, Greg A, Yuen, Macaire Man Saint, Keeling, Christopher I, Brand, Dana, Vandervalk, Benjamin P, Kirk, Heather, Pandoh, Pawan, Moore, Richard A, Zhao, Yongjun, Mungall, Andrew J, Jaquish, Barry, Yanchuk, Alvin, Ritland, Carol, Boyle, Brian, Bousquet, Jean, Ritland, Kermit, Mackay, John, Bohlmann, Jörg, Jones, Steven J M

White spruce (Picea glauca) is a dominant conifer of the boreal forests of North America, and providing genomics resources for this commercially valuable tree will help improve forest management and conservation efforts. However, sequencing and assembling the large and highly repetitive spruce genome pushes the boundaries of current technology. Here, we describe a whole-genome shotgun sequencing strategy using two Illumina sequencing platforms and an assembly approach using the ABySS software. We report a 20.8 giga base pair draft genome in 4.9 million scaffolds, with a scaffold N50 of 20,356 bp. We demonstrate how recent improvements in sequencing technology, especially increasing read lengths and paired-end reads from longer fragments, have a major impact on assembly contiguity. We also note that scalable bioinformatics tools are instrumental in providing rapid draft assemblies.
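
For readers unfamiliar with the scaffold N50 statistic quoted in the assembly abstracts above, the following is a minimal sketch of the usual definition: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy scaffold lengths are illustrative assumptions and are not taken from either publication or from the ABySS software.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)   # longest scaffolds first
    half_total = sum(ordered) / 2             # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not real data: total length is 300 bp.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

A larger N50 for a similar total length generally indicates a more contiguous assembly, which is why the abstracts report it alongside the total reconstructed size.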

Evolution of an adenocarcinoma in response to selection by targeted kinase inhibitors.

Genome biology, 2010

Jones, Steven Jm, Laskin, Janessa, Li, Yvonne Y, Griffith, Obi L, An, Jianghong, Bilenky, Mikhail, Butterfield, Yaron S, Cezard, Timothee, Chuah, Eric, Corbett, Richard, Fejes, Anthony P, Griffith, Malachi, Yee, John, Martin, Montgomery, Mayo, Michael, Melnyk, Nataliya, Morin, Ryan D, Pugh, Trevor J, Severson, Tesa, Shah, Sohrab P, Sutcliffe, Margaret, Tam, Angela, Terry, Jefferson, Thiessen, Nina, Thomson, Thomas, Varhol, Richard, Zeng, Thomas, Zhao, Yongjun, Moore, Richard A, Huntsman, David G, Birol, Inanc, Hirst, Martin, Holt, Robert A, Marra, Marco A

Adenocarcinomas of the tongue are rare and represent the minority (20 to 25%) of salivary gland tumors affecting the tongue. We investigated the utility of massively parallel sequencing to characterize an adenocarcinoma of the tongue, before and after treatment.

Circos: an information aesthetic for comparative genomics.

Genome research, 2009

Krzywinski, Martin, Schein, Jacqueline, Birol, Inanç, Connors, Joseph, Gascoyne, Randy, Horsman, Doug, Jones, Steven J, Marra, Marco A

We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.

Staff

Dr. Jianhong An

Staff Scientist

Dr. Mikhail (Misha) Bilenky

Staff Scientist

Dr. Rohan Abraham

Research Associate

Dr. Sreeja Leelakumari

Research Associate

Dr. Kieran O'Neill

Process Development Coordinator

Sharon Ruschkowski

Bioinformatics Coordinator

Samantha Jones

Research Associate

Postdoctoral Fellows

Dr. Katherine Dixon

Postdoctoral Fellow

Dr. Jasleen Grewal

Postdoctoral Fellow

Vahid Akbari

Postdoctoral Fellow

Trainees

Sarah Dada

Graduate Student

Luka Culibrk

Graduate Student

Faeze Keshavarz

Graduate Student

Caralyn Reisle

Graduate Student

Jeremy Fan

Graduate Student

Glenn Chang

Graduate Student