Short Description

This experiment grew out of our investigation of the role of NFKBIZ in DLBCL. We are using the DLC DLBCL cohort (n = 348) to look for mutation patterns associated with NFKBIZ. One critical feature we are considering is the DLBCL cell-of-origin (COO). We have a COO status for every DLC sample from the Lymph2Cx NanoString assay, which measures RNA levels of 20 genes in FFPE tissue and classifies each sample as ABC, GCB, unclassified (UNC) or undefined (NA).

However, Ryan has expressed concern that this assay is undercalling ABC DLBCLs. Fortunately, we have RNA-seq data for 322 of the 348 DLC samples. We also have mutation data for the entire cohort from a targeted sequencing experiment (hybrid capture) covering 83 known or candidate lymphoma genes. We therefore have sufficient data to attempt to refine the COO assignments of the DLC samples.

Shared Variables

The variables below are shared across the entire analysis.

# File paths
expr_path <- file.path(PROJHOME, "data", "expr.tsv")
coo_path <- file.path(PROJHOME, "data", "coo.tsv")
nc_path <- file.path(PROJHOME, "data", "normal_content.tsv")
muts_path <- file.path(PROJHOME, "data", "mutations.tsv")
snvs_and_indels_path <- file.path(PROJHOME, "data", "snvs_and_indels.maf")
cnvs_and_svs_path <- file.path(PROJHOME, "data", "cnvs_and_svs.tsv")
wright_gene_ids_path <- file.path(PROJHOME, "reference", "wright_genes.txt")
lymph2cx_gene_ids_path <- file.path(PROJHOME, "reference", "lymph2cx_genes.txt")
nfkb_gene_ids_path <- file.path(PROJHOME, "reference", "nfkb_genes.txt")

# Random number generated by runif(1, 0, 10^8)
global_seed <- 87510475
set.seed(global_seed)

# Number of threads used when possible
num_cores <- parallel::detectCores()
doMC::registerDoMC(cores = num_cores)

# Number of discrete levels when binning a numeric vector
n_breaks <- 5

# Minimum fraction of affected samples for a codon to be called a hotspot
min_recur <- 0.05

# Number of most variably expressed genes to display in heatmaps
ntop <- 500

# Method for multiple test correction
p_adjust_method <- "BH"

# Q-value cutoff
qval_cutoff <- 0.1
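
To make the role of a few of these parameters concrete, here is a minimal sketch of how `ntop`, `n_breaks`, `p_adjust_method` and `qval_cutoff` might be applied later in the analysis. Everything prefixed with `toy_` is hypothetical data made up for illustration, and equal-width binning with `cut()` is an assumption, not necessarily the method used in this analysis. The parameter values are repeated so the sketch runs on its own.

```r
# Shared parameters, repeated here so the sketch is self-contained
n_breaks <- 5
ntop <- 500
p_adjust_method <- "BH"
qval_cutoff <- 0.1

set.seed(1)  # arbitrary seed for the toy data only

# Hypothetical expression matrix: 1000 genes (rows) x 20 samples (columns)
toy_expr <- matrix(
  rnorm(1000 * 20),
  nrow = 1000,
  dimnames = list(paste0("gene", 1:1000), paste0("sample", 1:20))
)

# Keep the ntop genes with the highest variance across samples
gene_vars <- apply(toy_expr, 1, var)
n_keep <- min(ntop, length(gene_vars))
top_genes <- names(sort(gene_vars, decreasing = TRUE))[seq_len(n_keep)]
top_expr <- toy_expr[top_genes, , drop = FALSE]

# Bin one gene's expression into n_breaks equal-width levels
# (assumption: equal-width bins via cut())
toy_levels <- cut(toy_expr["gene1", ], breaks = n_breaks)

# Correct hypothetical p-values for multiple testing and apply the cutoff
toy_pvals <- c(0.001, 0.01, 0.02, 0.04, 0.2, 0.5, 0.9)
toy_qvals <- p.adjust(toy_pvals, method = p_adjust_method)
significant <- toy_qvals < qval_cutoff
```

With these toy p-values, the four smallest remain below the Q-value cutoff after Benjamini-Hochberg correction.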