Statistics and figures for sequencing library: HS1048 (A431_IRTR)
Summary of read assignments by assignment class
Read Class | Read Count | Percent of Total |
Total read count | 11,246,214 | 100% |
Read1-Read2 identical | 16,022 | 0.14% |
Low complexity | 76,193 | 0.68% |
Low quality (>1 N) | 411,841 | 3.66% |
Ensembl transcript | 6,230,633 | 55.40% |
Ensembl transcript (ambiguous) | 506,581 | 4.50% |
Novel exon junction | 745 | 0.01% |
Novel exon junction (ambiguous) | 30 | 0.00% |
Novel exon boundary extension | 22,168 | 0.20% |
Novel exon boundary extension (ambiguous) | 1,339 | 0.01% |
Intron | 802,612 | 7.14% |
Intron (ambiguous) | 16,374 | 0.15% |
Intergenic | 374,087 | 3.33% |
Intergenic (ambiguous) | 8,574 | 0.08% |
Repeat element | 259,109 | 2.30% |
Repeat element (ambiguous) | 92,942 | 0.83% |
Unassigned | 2,426,962 | 21.58% |
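The Percent of Total column can be reproduced from the read counts. The following is a minimal Python sketch using three rows copied from the table above; the name `percent_of_total` is illustrative and is not part of the pipeline that produced this report.

```python
# Total read count, copied from the "Total read count" row.
TOTAL_READS = 11_246_214

# A few read classes copied from the table (not the full list).
read_counts = {
    "Ensembl transcript": 6_230_633,
    "Intron": 802_612,
    "Unassigned": 2_426_962,
}


def percent_of_total(count, total=TOTAL_READS):
    """Return a read count as a percentage of the total, to 2 decimals."""
    return round(100.0 * count / total, 2)


percentages = {name: percent_of_total(c) for name, c in read_counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")
```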
Summary of average coverage values by feature type
Feature Type | Average Coverage | Total Base Count | Cumulative Coverage |
Gene | 4.42108 | 70,657,218 | 312,381,234 |
Transcript | 3.65280 | 46,317,996 | 169,190,462 |
ExonRegion | 4.42108 | 70,657,218 | 312,381,234 |
Junction | 0.32408 | 137,098,120 | 44,430,962 |
KnownJunction | 3.25238 | 13,544,706 | 44,052,547 |
NovelJunction | 0.00306 | 123,553,414 | 378,415 |
Boundary | 0.31488 | 26,598,682 | 8,375,402 |
KnownBoundary | 3.18211 | 1,328,722 | 4,228,143 |
NovelBoundary | 0.16412 | 25,269,960 | 4,147,259 |
Intron | 0.03631 | 1,089,861,286 | 39,571,470 |
ActiveIntronRegion | 0.16764 | 79,490,299 | 13,326,005 |
SilentIntronRegion | 0.02597 | 1,009,728,466 | 26,226,477 |
Intergenic | 0.00993 | 1,852,676,967 | 18,404,073 |
ActiveIntergenicRegion | 0.24745 | 43,133,494 | 10,673,555 |
SilentIntergenicRegion | 0.00427 | 1,809,415,481 | 7,721,539 |
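In the rows above, Average Coverage matches Cumulative Coverage divided by Total Base Count. The sketch below checks that relationship for two rows; it is inferred from the table values rather than from the pipeline's code, and the helper name `average_coverage` is hypothetical.

```python
def average_coverage(cumulative_coverage, total_base_count):
    """Average per-base coverage = cumulative coverage / number of bases."""
    return cumulative_coverage / total_base_count


# Values copied from the Gene and Junction rows of the table.
gene_avg = average_coverage(312_381_234, 70_657_218)
junction_avg = average_coverage(44_430_962, 137_098_120)

print(f"Gene: {gene_avg:.5f}")
print(f"Junction: {junction_avg:.5f}")
```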
Summary of expressed events by feature type
Feature Type | Feature Count | Expressed (%) | Not Expressed (%) |
Gene | 36,954 | 9,831 (26.60%) | 27,123 (73.40%) |
Transcript | 62,372 | 6,950 (11.14%) | 55,422 (88.86%) |
ExonRegion | 277,805 | 111,077 (39.98%) | 166,728 (60.02%) |
Junction | 2,211,261 | 85,400 (3.86%) | 2,125,861 (96.14%) |
KnownJunction | 218,464 | 84,459 (38.66%) | 134,005 (61.34%) |
NovelJunction | 1,992,798 | 941 (0.05%) | 1,991,857 (99.95%) |
Boundary | 429,012 | 14,894 (3.47%) | 414,118 (96.53%) |
KnownBoundary | 21,432 | 5,868 (27.38%) | 15,564 (72.62%) |
NovelBoundary | 407,581 | 9,026 (2.21%) | 398,555 (97.79%) |
Intron | 204,178 | 577 (0.28%) | 203,601 (99.72%) |
ActiveIntronRegion | 250,770 | 3,180 (1.27%) | 247,590 (98.73%) |
SilentIntronRegion | 312,398 | 1,024 (0.33%) | 311,374 (99.67%) |
Intergenic | 28,382 | 144 (0.51%) | 28,238 (99.49%) |
ActiveIntergenic | 123,007 | 2,773 (2.25%) | 120,234 (97.75%) |
SilentIntergenic | 110,401 | 820 (0.74%) | 109,581 (99.26%) |
GRAND TOTAL (non-redundant) | 4,046,540 | 236,670 (5.85%) | 3,809,870 (94.15%) |
Estimates of intronic and intergenic noise levels (95th percentiles of silent intron and intergenic regions)
95th percentile of silent intron regions for library (HS1048) is: 10.56 (log2 = 3.4)
95th percentile of silent intergenic regions for library (HS1048) is: 12.47 (log2 = 3.64)
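As an illustration of how such noise estimates could be obtained, the sketch below computes a 95th percentile and its log2 with the Python standard library. The helper `percentile_95` and the toy input are assumptions for demonstration; the report does not state which percentile interpolation method the pipeline uses, so results on real data may differ slightly.

```python
import math
import statistics


def percentile_95(values):
    """95th percentile: the last of the 19 cut points that split the data into 20 groups."""
    return statistics.quantiles(values, n=20)[-1]


# Toy data only (NOT values from this library).
toy_expression = list(range(1, 101))
toy_p95 = percentile_95(toy_expression)

# log2 of the two percentiles reported above.
log2_intron = math.log2(10.56)
log2_intergenic = math.log2(12.47)

print(f"toy 95th percentile: {toy_p95}")
print(f"log2(10.56) = {log2_intron:.2f}")
print(f"log2(12.47) = {log2_intergenic:.2f}")
```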
Estimates of signal-to-noise ratio
(Average coverage of exon regions / Average coverage of silent intron regions) = 170.24
(Average coverage of exon regions / Average coverage of silent intergenic regions) = 1035.38
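Both ratios can be approximately reproduced from the rounded averages in the coverage table (ExonRegion 4.42108, SilentIntronRegion 0.02597, SilentIntergenicRegion 0.00427). A small Python sketch follows; because the inputs are rounded to five decimals, agreement is only expected to about two decimal places.

```python
# Rounded average coverage values copied from the coverage table.
EXON_AVG = 4.42108
SILENT_INTRON_AVG = 0.02597
SILENT_INTERGENIC_AVG = 0.00427

# Signal-to-noise: exon coverage relative to each background class.
snr_intron = EXON_AVG / SILENT_INTRON_AVG
snr_intergenic = EXON_AVG / SILENT_INTERGENIC_AVG

print(f"exon / silent intron: {snr_intron:.2f}")
print(f"exon / silent intergenic: {snr_intergenic:.2f}")
```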
Distribution of log2 raw expression values for each feature type
Box-and-whisker plots for log2 expression values for each feature type.
Distribution of log2 normalized expression values for each feature type
Box-and-whisker plots for normalized log2 expression values for each feature type.
Density scatter plot of exon region versus gene expression values
Density scatter plot of log2 expression values for exon regions versus corresponding gene expression values.
Density scatter plot of silent intron region versus gene expression values
Density scatter plot of log2 expression values for silent intron regions versus corresponding gene expression values.
Distribution of gene-by-gene expression cutoff values
Histogram of the expression cutoff values used for each gene. To be considered 'expressed above background', every feature (gene, transcript, exon, junction, etc.) must be expressed above the level of INTERGENIC noise. This INTERGENIC cutoff is the 95th percentile of expression values for all Silent Intergenic Regions in this library and is depicted below as a dotted red line. The number of genes for which only INTERGENIC noise is considered is provided in the legend. For features within the boundaries of highly expressed genes, additional noise is expected because of contamination by unprocessed RNA. For this reason, a higher INTRAGENIC cutoff is determined for such genes. It is calculated by fitting a linear model to the 95th percentile of expression values for silent intronic regions; the INTRAGENIC cutoff for a gene is then obtained from the gene's expression level and the coefficients of the model fit. The distribution of the resulting gene-by-gene cutoffs is depicted below as a histogram. The number of genes that required an INTRAGENIC cutoff is also indicated in the legend.
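The gene-by-gene cutoff described above could be sketched roughly as follows. This is not the pipeline's implementation. It assumes that the model is a simple least-squares line relating gene expression (log2) to the 95th percentile of silent intron expression (log2), and that a gene's final cutoff is the larger of the INTERGENIC and INTRAGENIC values. The function names and the training points are hypothetical; only the INTERGENIC cutoff (log2 = 3.64) comes from this report.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def gene_cutoff(gene_expr, intercept, slope, intergenic_cutoff):
    """Assumed rule: use the INTRAGENIC cutoff only when it exceeds the INTERGENIC one."""
    intragenic_cutoff = intercept + slope * gene_expr
    return max(intergenic_cutoff, intragenic_cutoff)


# Hypothetical training points (NOT from this library):
# gene expression level vs. 95th percentile of its silent intron regions, both log2.
gene_levels = [2.0, 4.0, 6.0, 8.0]
intron_p95 = [3.0, 4.0, 5.0, 6.0]
intercept, slope = fit_line(gene_levels, intron_p95)

# log2 of the 95th percentile of silent intergenic regions reported for HS1048.
INTERGENIC_CUTOFF = 3.64

low_gene_cutoff = gene_cutoff(1.0, intercept, slope, INTERGENIC_CUTOFF)
high_gene_cutoff = gene_cutoff(10.0, intercept, slope, INTERGENIC_CUTOFF)

print(f"intercept={intercept}, slope={slope}")
print(f"cutoff for low-expression gene: {low_gene_cutoff}")
print(f"cutoff for high-expression gene: {high_gene_cutoff}")
```

With these toy inputs, the weakly expressed gene falls back to the INTERGENIC cutoff and the highly expressed gene receives a higher INTRAGENIC cutoff, matching the two groups of genes counted in the figure legend.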
Histograms of expression values for each feature type
Histograms depicting the distribution of log2 expression values for individual feature types. The 95th percentile of expression values for silent intergenic regions is depicted as a dotted line on all plots.
Percentiles plot for expression of exon regions, silent intron regions and silent intergenic regions
Percentiles plot for exon region, silent intron region, and silent intergenic region expression values. The 95th percentiles of the intronic and intergenic distributions are depicted as colored, dotted lines.