With more than 35 peer-reviewed scientific publications, findings from the POG program are influencing precision oncology approaches around the world.
Advanced and metastatic tumors with complex treatment histories drive cancer mortality. Here we describe the POG570 cohort, a comprehensive whole-genome, transcriptome and clinical dataset, amenable to exploration of the impacts of therapies on genomic landscapes. Previous exposure to DNA-damaging chemotherapies and mutations affecting DNA repair genes, including POLQ and genes encoding Polζ, were associated with genome-wide, therapy-induced mutagenesis. Exposure to platinum therapies coincided with signatures SBS31 and DBS5 and, when combined with DNA synthesis inhibitors, signature SBS17b. Alterations in ESR1, EGFR, CTNNB1, FGFR1, VEGFA and DPYD were consistent with drug resistance and sensitivity. Recurrent noncoding events were found in regulatory region hotspots of genes including TERT, PLEKHS1, AP2A1 and ADGRG6. Mutation burden and immune signatures corresponded with overall survival and response to immunotherapy. Our data offer a rich resource for investigation of advanced cancers and interpretation of whole-genome and transcriptome sequencing in the context of a cancer clinic.
Introduction: Given the high level of uncertainty surrounding the outcomes of early phase clinical trials, whole genome and transcriptome analysis (WGTA) can be used to optimize patient selection and study assignment. In this retrospective analysis, we reviewed the impact of this approach on one such program.
Methods: Patients with advanced malignancies underwent fresh tumor biopsies as part of our personalized medicine program (NCT02155621). Tumor molecular data were reviewed for potentially clinically actionable findings and patients were referred to the developmental therapeutics program. Outcomes were reviewed in all patients, including those where trial selection was driven by molecular data (matched) and those where there was no clear molecular rationale (unmatched).
Results: From January 2014 to January 2018, 28 patients underwent WGTA and enrolled in clinical trials, including 2 patients enrolled in two trials. Fifteen patients were matched to a treatment based on a molecular target. Five patients were matched to a trial based upon single-gene DNA changes, all supported by RNA data. Ten cases were matched on the basis of genome-wide data (n = 4) or RNA gene expression only (n = 6). With a median follow-up of 6.7 months, the median time on treatment was 8.2 weeks.
Discussion: When compared to single-gene DNA-based data alone, WGTA led to a 3-fold increase in treatment matching. In a setting where there is a high level of uncertainty around both the investigational agents and the biomarkers, more data are needed to fully evaluate the impact of routine use of WGTA.
Effective management of brain and spine tumors relies on a multidisciplinary approach encompassing surgery, radiation, and systemic therapy. In the era of personalized oncology, the latter is complemented by various molecularly targeting agents. Precise identification of cellular targets for these drugs requires comprehensive profiling of the cancer genome coupled with an efficient analytic pipeline, leading to an informed decision on drug selection, prognosis, and confirmation of the original pathological diagnosis. Acquisition of optimal tumor tissue for such analysis is paramount and often presents logistical challenges in neurosurgery. Here, we describe the experience and results of the Personalized OncoGenomics (POG) program with a focus on tumors of the central nervous system (CNS). Patients with recurrent CNS tumors were consented and enrolled into the POG program prior to accrual of tumor and matched blood followed by whole-genome and transcriptome sequencing and processing through the POG bioinformatic pipeline. Sixteen patients were enrolled into POG. In each case, POG analyses identified genomic drivers including novel oncogenic fusions, aberrant pathways, and putative therapeutic targets. POG has highlighted that personalized oncology is truly a multidisciplinary field, one in which neurosurgeons must play a vital role if these programs are to succeed and benefit our patients.
Pancreatic neuroendocrine neoplasms (PanNENs) represent a minority of pancreatic neoplasms that exhibit variability in prognosis. Ongoing mutational analyses of PanNENs have found recurrent abnormalities in chromatin remodeling genes (e.g., DAXX and ATRX), and mTOR pathway genes (e.g., TSC2, PTEN, PIK3CA, and MEN1), some of which have relevance to patients with related familial syndromes. Most recently, grade 3 PanNENs have been divided into two groups based on differentiation, creating a new group of well-differentiated grade 3 neuroendocrine tumors (PanNETs) that have had a limited whole-genome level characterization to date. In a patient with a metastatic well-differentiated grade 3 PanNET, our study utilized whole-genome sequencing of liver metastases for the comparative analysis and detection of single-nucleotide variants, insertions and deletions, structural variants, and copy-number variants, with their biologic relevance confirmed by RNA sequencing. We found that this tumor most notably exhibited a TSC1-disrupting fusion, showed a novel CHD7-BEND2 fusion, and lacked any somatic variants in ATRX, DAXX, and MEN1.
We report a case of early-onset pancreatic ductal adenocarcinoma in a patient harboring biallelic MUTYH germline mutations, whose tumor featured somatic mutational signatures consistent with defective MUTYH-mediated base excision repair and the associated driver KRAS transversion mutation p.Gly12Cys. Analysis of an additional 730 advanced cancer cases (N = 731) was undertaken to determine whether the mutational signatures were also present in tumors from germline MUTYH heterozygote carriers or if instead the signatures were only seen in those with biallelic loss of function. We identified two patients with breast cancer each carrying a pathogenic germline MUTYH variant with a somatic MUTYH copy loss leading to the germline variant being homozygous in the tumor and demonstrating the same somatic signatures. Our results suggest that monoallelic inactivation of MUTYH is not sufficient for C:G>A:T transversion signatures previously linked to MUTYH deficiency to arise (N = 9), but that biallelic complete loss of MUTYH function can cause such signatures to arise even in tumors not classically seen in MUTYH-associated polyposis (N = 3). Although defective MUTYH is not the only determinant of these signatures, MUTYH germline variants may be present in a subset of patients with tumors demonstrating elevated somatic signatures possibly suggestive of MUTYH deficiency (e.g., COSMIC Signature 18, SigProfiler SBS18/SBS36, SignatureAnalyzer SBS18/SBS36).
Head and neck squamous cell carcinoma (HNSCC) is one of the most common cancers worldwide and represents a heterogeneous group of tumors, the majority of which are treated with a combination of surgery, radiation, and chemotherapy. The fluoropyrimidine 5-fluorouracil (5-FU) and its oral prodrug, capecitabine, are commonly prescribed treatments for several solid tumor types including HNSCC. 5-FU-associated toxicity is observed in ∼30% of treated patients and is largely caused by germline polymorphisms in DPYD, which encodes dihydropyrimidine dehydrogenase, a key enzyme of 5-FU catabolism and deactivation. Although the association of germline DPYD alterations with toxicity is well-described, the potential contribution of somatic DPYD alterations to 5-FU sensitivity has not been explored. In a patient with metastatic HNSCC, in-depth genomic and transcriptomic integrative analysis on a biopsy from a metastatic neck lesion revealed alterations in genes that are associated with 5-FU uptake and metabolism. These included a novel somatic structural variant resulting in a partial deletion affecting DPYD, a variant of unknown significance affecting SLC29A1, and homozygous deletion of MTAP. There was no evidence of deleterious germline polymorphisms that have been associated with 5-FU toxicity, indicating a potential vulnerability of the tumor to 5-FU therapy. The discovery of the novel DPYD variant led to the initiation of 5-FU treatment that resulted in a rapid response lasting 17 wk, with subsequent relapse due to unknown resistance mechanisms. This suggests that somatic alterations present in this tumor may serve as markers for tumor sensitivity to 5-FU, aiding in the selection of personalized treatment strategies.
Purpose: Identification of clinically actionable molecular subtypes of pancreatic ductal adenocarcinoma (PDAC) is key to improving patient outcome. Intertumoral metabolic heterogeneity contributes to cancer survival and the balance between distinct metabolic pathways may influence PDAC outcome. We hypothesized that PDAC can be stratified into prognostic metabolic subgroups based on alterations in the expression of genes involved in glycolysis and cholesterol synthesis.
Experimental design: We performed bioinformatics analysis of genomic, transcriptomic, and clinical data in an integrated cohort of 325 resectable and nonresectable PDAC. The resectable datasets included retrospective The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) cohorts. The nonresectable PDAC cohort studies included prospective COMPASS, PanGen, and BC Cancer Personalized OncoGenomics program (POG).
Results: On the basis of the median normalized expression of glycolytic and cholesterogenic genes, four subgroups were identified: quiescent, glycolytic, cholesterogenic, and mixed. Glycolytic tumors were associated with the shortest median survival in resectable (log-rank test P = 0.018) and metastatic settings (log-rank test P = 0.027). Patients with cholesterogenic tumors had the longest median survival. KRAS and MYC-amplified tumors had higher expression of glycolytic genes than tumors with normal or lost copies of the oncogenes (Wilcoxon rank sum test P = 0.015). Glycolytic tumors had the lowest expression of mitochondrial pyruvate carriers MPC1 and MPC2. Glycolytic and cholesterogenic gene expression correlated with the expression of prognostic PDAC subtype classifier genes.
Conclusions: Metabolic classification specific to glycolytic and cholesterogenic pathways provides novel biological insight into previously established PDAC subtypes and may help develop personalized therapies targeting unique tumor metabolic profiles.
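As a rough illustration of the stratification described in this abstract, the sketch below assigns a tumor to one of the four subgroups from its glycolytic and cholesterogenic gene expression. It assumes, hypothetically, that each pathway is summarized by the median z-score of its genes and that zero separates high from low; the abstract does not give the gene lists or cutoffs, so the function name, the cutoff, and the example values are illustrative only.

```python
from statistics import median

def metabolic_subgroup(glycolytic_z, cholesterogenic_z, cutoff=0.0):
    """Assign a subgroup from per-gene z-scores of the two pathways.

    Assumption (not from the source): a pathway score is the median
    z-score of its genes, and `cutoff` separates high from low.
    """
    glyc = median(glycolytic_z)
    chol = median(cholesterogenic_z)
    if glyc > cutoff and chol > cutoff:
        return "mixed"            # both pathways high
    if glyc > cutoff:
        return "glycolytic"       # glycolysis high only
    if chol > cutoff:
        return "cholesterogenic"  # cholesterol synthesis high only
    return "quiescent"            # neither pathway high

# Hypothetical z-scores for one tumor (not data from the study).
print(metabolic_subgroup([1.2, 0.8, 0.3], [-0.5, -1.1, -0.2]))
```

With these made-up values the glycolytic median is above the cutoff and the cholesterogenic median is below it, so the tumor falls in the glycolytic subgroup.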
This study investigated therapeutic potential of integrated genome and transcriptome profiling of metastatic sarcoma, a rare but extremely heterogeneous group of aggressive mesenchymal malignancies with few systemic therapeutic options.
Forty-three adult patients with advanced or metastatic non-GI stromal tumor sarcomas of various histology subtypes who were enrolled in the Personalized OncoGenomics program at BC Cancer were included in this study. Fresh tumor tissues along with blood samples underwent whole-genome and transcriptome sequencing.
The most frequent genomic alterations in this cohort are large-scale structural variation and somatic copy number variation. Outlier RNA expression as well as somatic copy number variations, structural variations, and small mutations together suggest the presence of one or more potential therapeutic targets in the majority of patients in our cohort. Point mutations or deletions in known targetable cancer genes are rare; for example, tuberous sclerosis complex 2 provides a rationale for targeting the mammalian target of rapamycin pathway, resulting in a few patients with exceptional clinical benefit from everolimus. In addition, we observed recurrent 17p11-12 amplifications, which seem to be a sarcoma-specific event. This may suggest that this region harbors an oncogene(s) that is significant for sarcoma tumorigenesis. Furthermore, some sarcoma tumors carrying a distinct mutational signature suggestive of homologous recombination deficiency seem to demonstrate sensitivity to double-strand DNA–damaging agents.
Integrated large-scale genomic analysis may provide insights into potential therapeutic targets as well as novel biologic features of metastatic sarcomas that could fuel future experimental and clinical research and help design biomarker-driven basket clinical trials for novel therapeutic strategies.
Purpose: Gene fusions involving neuregulin 1 (NRG1) have been noted in multiple cancer types and have potential therapeutic implications. Although varying results have been reported in other cancer types, the efficacy of the HER-family kinase inhibitor afatinib in the treatment of NRG1 fusion-positive pancreatic ductal adenocarcinoma is not fully understood.
Experimental design: Forty-seven patients with pancreatic ductal adenocarcinoma received comprehensive whole-genome and transcriptome sequencing and analysis. Two patients with gene fusions involving NRG1 received afatinib treatment, with response measured by pretreatment and posttreatment PET/CT imaging.
Results: Three of 47 (6%) patients with advanced pancreatic ductal adenocarcinoma were identified as KRAS wild type by whole-genome sequencing. All KRAS wild-type tumors were positive for gene fusions involving the ERBB3 ligand NRG1. Two of 3 patients with NRG1 fusion-positive tumors were treated with afatinib and demonstrated a significant and rapid response while on therapy.
Conclusions: This work adds to a growing body of evidence that NRG1 gene fusions are recurrent, therapeutically actionable genomic events in pancreatic cancers. Based on the clinical outcomes described here, patients with KRAS wild-type tumors harboring NRG1 gene fusions may benefit from treatment with afatinib.
Importance: A molecular diagnostic method that incorporates information about the transcriptional status of all genes across multiple tissue types can strengthen confidence in cancer diagnosis.
Objective: To determine the practical use of a whole transcriptome-based pan-cancer method in diagnosing primary and metastatic cancers and resolving complex diagnoses.
Design, setting, and participants: This cross-sectional diagnostic study assessed Supervised Cancer Origin Prediction Using Expression (SCOPE), a machine learning method using whole-transcriptome RNA sequencing data. Training was performed on publicly available primary cancer data sets, including The Cancer Genome Atlas. Testing was performed retrospectively on untreated primary cancers and treated metastases from volunteer adult patients at BC Cancer in Vancouver, British Columbia, from January 1, 2013, to March 31, 2016, and testing spanned 10 822 samples and 66 output classes representing untreated primary cancers (n = 40) and adjacent normal tissues (n = 26). SCOPE's performance was demonstrated on 211 untreated primary mesothelioma cancers and 201 treatment-resistant metastatic cancers. Finally, SCOPE was used to identify the putative site of origin in 15 cases with initial presentation as cancers with unknown primary of origin.
Results: A total of 10 688 adult patient samples representing 40 untreated primary tumor types and 26 adjacent-normal tissues were used for training. Demographic data were not available for all data sets. Among the training data set, 5157 of 10 244 (50.3%) were male and the mean (SD) age was 58.9 (14.5) years. Testing was performed on 211 patients with untreated primary mesothelioma (173 [82.0%] male; mean [SD] age, 64.5 [11.3] years); 201 patients with treatment-resistant cancers (141 [70.1%] female; mean [SD] age, 55.6 [12.9] years); and 15 patients with cancers of unknown primary of origin; among the treatment-resistant cancers, 168 were metastatic, and 33 were the primary presentation. An accuracy rate of 99% was obtained for primary epithelioid mesotheliomas tested (125 of 126). The remaining 85 mesotheliomas had a mixed etiology (sarcomatoid mesotheliomas) and were correctly identified as a mixture of their primary components, with potential implications in resolving subtypes and incidences of mixed histology. SCOPE achieved an overall mean (SD) accuracy rate of 86% (11%) and F1 score of 0.79 (0.12) on the 201 treatment-resistant cancers and matched 12 of 15 of the putative diagnoses for cancers with indeterminate diagnosis from conventional pathology.
Conclusions and relevance: These results suggest that machine learning approaches incorporating multiple tumor profiles can more accurately identify the cancerous state and discriminate it from normal cells. SCOPE uses the whole transcriptomes from normal and tumor tissues, and results of this study suggest that it performs well for rare cancer types, primary cancers, treatment-resistant metastatic cancers, and cancers of unknown primary of origin. Genes most relevant in SCOPE's decision making were examined, and several are known biological markers of respective cancers. SCOPE may be applied as an orthogonal diagnostic method in cases where the site of origin of a cancer is unknown, or when standard pathology assessment is inconclusive.
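For readers less familiar with the metrics quoted in the Results, the following minimal sketch shows how accuracy and a per-class F1 score can be computed from true and predicted class labels. It is not SCOPE's evaluation code; the helper names and the toy labels are hypothetical, and how the study averaged its scores across classes is not stated in the abstract.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_f1(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for every class seen in the labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    classes = set(y_true) | set(y_pred)
    return {c: 2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in classes}

# Toy labels (hypothetical, not study data).
y_true = ["lung", "lung", "breast", "colon"]
y_pred = ["lung", "breast", "breast", "colon"]
print(accuracy(y_true, y_pred))
print(per_class_f1(y_true, y_pred))

# The epithelioid mesothelioma figure from the abstract: 125 of 126.
print(round(125 / 126, 3))
```

The last line reproduces the reported 99% rate directly from the counts given in the Results.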
Pancreatic adenocarcinoma presents as a spectrum of a highly aggressive disease in patients. The basis of this disease heterogeneity has proved difficult to resolve due to poor tumor cellularity and extensive genomic instability. To address this, a dataset of whole genomes and transcriptomes was generated from purified epithelium of primary and metastatic tumors. Transcriptome analysis demonstrated that molecular subtypes are a product of a gene expression continuum driven by a mixture of intratumoral subpopulations, which was confirmed by single-cell analysis. Integrated whole-genome analysis uncovered that molecular subtypes are linked to specific copy number aberrations in genes such as mutant KRAS and GATA6. By mapping tumor genetic histories, tetraploidization emerged as a key mutational process behind these events. Taken together, these data support the premise that the constellation of genomic aberrations in the tumor gives rise to the molecular subtype, and that disease heterogeneity is due to ongoing genomic instability during progression.
Importance: Pediatric cancers are epigenetic diseases; therefore, considering tumor gene expression information is necessary for a complete understanding of the tumorigenic processes.
Objective: To evaluate the feasibility and utility of incorporating comparative gene expression information into the precision medicine framework for difficult-to-treat pediatric and young adult patients with cancer.
Design, setting, and participants: This cohort study was conducted as a consortium between the University of California, Santa Cruz (UCSC) Treehouse Childhood Cancer Initiative and clinical genomic trials. RNA sequencing (RNA-Seq) data were obtained from the following 4 clinical sites and analyzed at UCSC: British Columbia Children's Hospital (n = 31), Lucile Packard Children's Hospital at Stanford University (n = 80), CHOC Children's Hospital and Hyundai Cancer Institute (n = 46), and the Pacific Pediatric Neuro-Oncology Consortium (n = 24). The study dates were January 1, 2016, to March 22, 2017.
Exposures: Participants underwent tumor RNA-Seq profiling as part of 4 separate clinical trials at partner hospitals. The UCSC either downloaded RNA-Seq data from a partner institution for analysis in the cloud or provided a Docker pipeline that performed the same analysis at a partner institution. The UCSC then compared each participant's tumor RNA-Seq profile with more than 11 000 uniformly analyzed tumor profiles from pediatric and young adult patients with cancer, downloaded from public data repositories. These comparisons were used to identify genes and pathways that are significantly overexpressed in each patient's tumor. Results of the UCSC analysis were presented to clinical partners.
Main outcomes and measures: Feasibility of a third-party institution (UCSC Treehouse Childhood Cancer Initiative) to obtain tumor RNA-Seq data from patients, conduct comparative analysis, and present analysis results to clinicians; and proportion of patients for whom comparative tumor gene expression analysis provided useful clinical and biological information.
Results: Among 144 samples from children and young adults (median age at diagnosis, 9 years; range, 0-26 years; 72 of 118 [61.0%] male [26 patients sex unknown]) with a relapsed, refractory, or rare cancer treated on precision medicine protocols, RNA-Seq-derived gene expression was potentially useful for 99 of 144 samples (68.8%) compared with DNA mutation information that was potentially useful for only 34 of 74 samples (45.9%).
Conclusions and relevance: This study's findings suggest that tumor RNA-Seq comparisons may be feasible and highlight the potential clinical utility of incorporating such comparisons into the clinical genomic interpretation framework for difficult-to-treat pediatric and young adult patients with cancer. The study also highlights for the first time to date the potential clinical utility of harmonized publicly available genomic data sets.
The analysis of cell-free circulating tumor DNA (ctDNA) is potentially a less invasive, more dynamic assessment of cancer progression and treatment response than characterizing solid tumor biopsies. Standard isolation methods require separation of plasma by centrifugation, a time-consuming step that complicates automation. To address these limitations, we present an automatable magnetic bead-based ctDNA isolation method that eliminates centrifugation to purify ctDNA directly from peripheral blood (PB). To develop and test our method, ctDNA from cancer patients was purified from PB and plasma. We found that allelic fractions of somatic single-nucleotide variants from target gene capture libraries were comparable, indicating that the PB ctDNA purification method may be a suitable replacement for the plasma-based protocols currently in use.
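The comparison of allelic fractions mentioned above can be pictured with a small sketch. It computes the variant allelic fraction (variant-supporting reads divided by all reads at the site) for each variant in paired peripheral blood and plasma preparations, then the mean absolute difference between the two. The abstract does not say how comparability was assessed, so this summary statistic, the function names, and the read counts are assumptions chosen for illustration.

```python
def allelic_fraction(alt_reads, total_reads):
    """Variant allelic fraction: variant-supporting reads / all reads at the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

def mean_abs_difference(pb_counts, plasma_counts):
    """Mean absolute allelic-fraction difference over variants found in both inputs.

    Each argument maps a variant label to (alt_reads, total_reads).
    """
    shared = sorted(set(pb_counts) & set(plasma_counts))
    if not shared:
        raise ValueError("no variants shared between the two preparations")
    diffs = [
        abs(allelic_fraction(*pb_counts[v]) - allelic_fraction(*plasma_counts[v]))
        for v in shared
    ]
    return sum(diffs) / len(diffs)

# Hypothetical read counts (not data from the study).
pb = {"variant_A": (30, 200), "variant_B": (12, 100)}
plasma = {"variant_A": (32, 200), "variant_B": (10, 100)}
print(mean_abs_difference(pb, plasma))
```

A small mean difference across many variants would be one simple indication that the two preparations give similar allelic fractions; a real comparison would also account for sequencing depth and sampling noise.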
Next generation RNA-sequencing (RNA-seq) is a flexible approach that can be applied to a range of applications including global quantification of transcript expression, the characterization of RNA structure such as splicing patterns and profiling of expressed mutations. Many RNA-seq protocols require up to microgram levels of total RNA input amounts to generate high quality data, and thus remain impractical for the limited starting material amounts typically obtained from rare cell populations, such as those from early developmental stages or from laser micro-dissected clinical samples. Here, we present an assessment of the contemporary ribosomal RNA depletion-based protocols, and identify those that are suitable for inputs as low as 1-10 ng of intact total RNA and 100-500 ng of partially degraded RNA from formalin-fixed paraffin-embedded tissues.
Tissues used in pathology laboratories are typically stored in the form of formalin-fixed, paraffin-embedded (FFPE) samples. One important consideration in repurposing FFPE material for next generation sequencing (NGS) analysis is the sequencing artifacts that can arise from the significant damage to nucleic acids due to treatment with formalin, storage at room temperature and extraction. One such class of artifacts consists of chimeric reads that appear to be derived from non-contiguous portions of the genome. Here, we show that a major proportion of such chimeric reads align to both the 'Watson' and 'Crick' strands of the reference genome. We refer to these as strand-split artifact reads (SSARs). This study provides a conceptual framework for the mechanistic basis of the genesis of SSARs and other chimeric artifacts along with supporting experimental evidence, which have led to approaches to reduce the levels of such artifacts. We demonstrate that one of these approaches, involving S1 nuclease-mediated removal of single-stranded fragments and overhangs, also reduces sequence bias, base error rates, and false positive detection of copy number and single nucleotide variants. Finally, we describe an analytical approach for quantifying SSARs from NGS data.
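To make the idea of quantifying SSARs concrete, here is a deliberately simplified sketch. It treats each read as a list of aligned segments and counts a read as strand-split when its segments map to both the forward and reverse strands. This is an assumption-laden stand-in rather than the authors' analytical approach, which the abstract does not detail; the data layout, function names, and example reads are hypothetical.

```python
def is_strand_split(segments):
    """True if a read has more than one aligned segment and they cover both strands.

    `segments` is a list of (chrom, position, strand) tuples, strand "+" or "-".
    """
    strands = {strand for _chrom, _pos, strand in segments}
    return len(segments) > 1 and strands == {"+", "-"}

def ssar_fraction(reads):
    """Fraction of reads flagged as strand-split.

    `reads` maps a read name to its list of aligned segments.
    """
    if not reads:
        return 0.0
    flagged = sum(is_strand_split(segs) for segs in reads.values())
    return flagged / len(reads)

# Hypothetical alignments (not data from the study).
reads = {
    "read1": [("chr1", 100, "+")],                       # ordinary read
    "read2": [("chr1", 500, "+"), ("chr1", 620, "-")],   # segments on both strands
    "read3": [("chr2", 10, "+"), ("chr5", 900, "+")],    # chimeric, same strand
    "read4": [("chr3", 40, "-")],                        # ordinary read
}
print(ssar_fraction(reads))
```

In practice the segments would come from an aligner's split or supplementary alignments, and a stricter definition might also require the two segments to lie close together on the same chromosome.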
Curation and storage of formalin-fixed, paraffin-embedded (FFPE) samples are standard procedures in hospital pathology laboratories around the world. Many thousands of such samples exist and could be used for next generation sequencing analysis. Retrospective analyses of such samples are important for identifying molecular correlates of carcinogenesis, treatment history and disease outcomes. Two major hurdles in using FFPE material for sequencing are the damaged nature of the nucleic acids and the labor-intensive nature of nucleic acid purification. These limitations and a number of other issues that span multiple steps from nucleic acid purification to library construction are addressed here. We optimized and automated a 96-well magnetic bead-based extraction protocol that can be scaled to large cohorts. Using sets of 32 and 91 individual FFPE samples respectively, we generated libraries from 100 ng of total RNA and DNA starting amounts with 95-100% success rate. The use of the resulting RNA in micro-RNA sequencing was also demonstrated. In addition to offering the potential of scalability and rapid throughput, the yield obtained with lower input requirements makes these methods applicable to clinical samples where tissue abundance is limiting.