Posters: Genome Sequencing & Biology at CSHL 2002
Posters presented at the Cold Spring Harbor conference are available for download below (PDF, JPG).
Number 41: Fingerprinted BAC Clone Physical Maps
I. Bosdet, S. Barber, S. Chan, S. Chand, R. Chiu, A. Cloutier, C. Fjell, S. Flibotte, D. Fuhrmann, M. Krzywinski, D. Lee, C. Mathewson, T. Olson, K. Osoegawa, A. Prabhu, P. Saeedi, H. Shin, M. Tsai, N. Wye, P.J. de Jong, J. Schein, S. Jones and M. Marra
We are constructing high-resolution BAC-based physical maps employing an agarose gel-based fingerprinting methodology (Marra et al., 1997). This technology has been used to produce BAC-based maps of several genomes, including human, mouse and Cryptococcus neoformans. We are in the process of constructing a number of physical maps, including those of the rat and bovine genomes, and are currently fingerprinting 15,000 BAC clones per week.
The mouse physical map contains a total of 305,768 whole-clone HindIII fingerprints from two large-insert BAC libraries. Incorporated into the map are data for 16,997 markers provided by other researchers. Manual editing of the resulting contigs, performed at the Washington University Genome Sequencing Center, has reduced the number of contigs to 325, spanning an estimated 99% of the mouse genome. This map provides a resource around which sequencing of the mouse genome is being organized.
Physical maps for the fungal pathogen Cryptococcus neoformans serotype A (H99) and serotype D (JEC21) were constructed in a similar manner. Markers and BAC-end sequence data have also been added to the two maps. These resources will be used for future comparative genomics studies of the genetics and virulence of this organism, and are assisting the sequencing of JEC21 at The Institute for Genomic Research (TIGR).
The rat physical map, which currently contains 181,888 clones, is being used to support sequencing of the rat genome. We are selecting, from the assembled contigs, a set of minimally overlapping BAC clones to be sequenced at the Baylor College of Medicine Human Genome Sequencing Centre. To date, 11,431 fingerprinted clones have been selected from the FPC database for sequencing, representing almost 65% of the entire rat genome.
For the bovine genome, our goal is to generate a physical map containing 280,000 whole-clone HindIII fingerprints, of which 128,408 are complete. Mapped BAC clones will also be end-sequenced at TIGR. This map will be an important resource for future positional cloning and comparative genomics experiments, as well as assisting in future sequencing efforts.
All FPC fingerprint databases and associated data are updated on a weekly basis and are publicly available for download. FPC databases may also be viewed with our new Java-based program, Internet Contig Explorer (iCE).
Number 144: Identification of Genes Expressed in Early-Stage Lung Cancers
S. Jones, P. Ruzanov, J. Asano, Y. Butterfield, E. Garland, N. Girn, R. Guin, L. Hsiao, M. Krzywinski, W. Lam, S. Lam, S. Lee, K. Lonergan, C. MacAulay, T. Olson, M. Oveisi, P. Pandoh, P. Saeedi, U. Skalska, L. Spence, D. Smailus, J. Stott, K. Teague, R. Varhol, G. Yang, S. Zuyderduyn, J. Schein, M. Marra
Number 160: A Set of Rearrayed BAC Clones Spanning the Human Genome
M. Krzywinski, J. Schein, I. Bosdet, D. Smailus, C. MacAulay, W. Lam, S. Jones, M. Marra
We are constructing a rearrayed set of BAC clones from the RPCI-11 (Osoegawa et al. (2001) Genome Res 11(3):483-496) and CalTechD (Knight and Lese et al. (2000) Am J Hum Genet, 67(2):320-332) libraries. These have been chosen from the human BAC physical map constructed at Washington University Genome Sequencing Centre (www.genome.wustl.edu). Our aim is to cover the entire genome, as represented in the physical map, with these BACs while keeping redundancy to a minimum. We anticipate this resource will have several uses, including provision of a genome-ordered set of probes for fluorescent in-situ hybridization and provision of probes for microarray-based BAC-Comparative Genomic Hybridization (BAC-CGH) experiments.
The human BAC physical map is a curated and sequence-validated resource, well suited for selection of a minimal set of clones spanning the genome. Our approach involves walking along each contig and selecting clones in the following fashion. All "non-buried" clones in a contig are ordered and enumerated from the left end of the contig. Any "virtual" clones, or clones unavailable to us, are replaced with their buried counterparts, if the buried clone shares at least 98% of its bands and if the two clones have no fewer than 2 unshared bands. Band similarity is calculated using a tolerance of 7 standard mobility units. After this replacement, the left-most candidate clone is used as a starting point for a walk, which proceeds as far right as possible while maintaining 3 conserved bands between adjacent selected clones. The walk stops when a clone is encountered below this cutoff, or when the end of the contig is reached. All the clones along the walk are evaluated against a size limit (100-200 kbp) and a band-count limit (20-50 bands). Based on these criteria, the right-most eligible RPCI-11/CalTechD clone is picked. This clone becomes the start of the next walk and the algorithm continues until the end of the contig is reached.
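The walk described above can be illustrated with a minimal Python sketch. It is not the code used to build the rearray; it only restates the steps under stated assumptions. The names Clone, shared_bands, eligible and select_walk are hypothetical, bands are assumed to be sorted mobility values, the library check (RPCI-11/CalTechD) and the replacement of virtual clones are assumed to have been done beforehand, and the handling of a walk with no eligible clone (restart at the next clone to the right) is an assumption, since the abstract does not describe that case.

```python
from dataclasses import dataclass

TOLERANCE = 7            # band-matching tolerance, standard mobility units
MIN_SHARED = 3           # conserved bands required to stay on a walk
SIZE_KBP = (100, 200)    # clone size limit
BAND_COUNT = (20, 50)    # number-of-bands limit


@dataclass(frozen=True)
class Clone:
    name: str
    bands: tuple         # sorted band mobilities (assumed representation)
    size_kbp: float


def shared_bands(a, b, tol=TOLERANCE):
    """Count bands of two sorted fingerprints that match within tol."""
    i = j = count = 0
    while i < len(a) and j < len(b):
        diff = a[i] - b[j]
        if abs(diff) <= tol:
            count += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1
        else:
            j += 1
    return count


def eligible(clone):
    """Apply the size and band-count limits to one clone."""
    return (SIZE_KBP[0] <= clone.size_kbp <= SIZE_KBP[1]
            and BAND_COUNT[0] <= len(clone.bands) <= BAND_COUNT[1])


def select_walk(clones):
    """Pick clones from a left-to-right ordered list of candidate clones."""
    if not clones:
        return []
    picks = [clones[0]]  # assumption: the left-most candidate is kept
    start = 0
    while start < len(clones) - 1:
        # Extend the walk while clones still share enough bands with the start.
        end = start
        while (end + 1 < len(clones) and
               shared_bands(clones[start].bands,
                            clones[end + 1].bands) >= MIN_SHARED):
            end += 1
        # Right-most eligible clone on the walk becomes the next pick.
        nxt = next((k for k in range(end, start, -1) if eligible(clones[k])),
                   None)
        if nxt is None:
            # Assumption: nothing eligible, so restart just past the walk.
            nxt = end + 1
            if nxt >= len(clones):
                break
        picks.append(clones[nxt])
        start = nxt
    return picks


# Synthetic contig: ten clones of 25 bands each, shifted by 5 bands per clone.
positions = [100 + 20 * k for k in range(70)]
demo = [Clone(f"c{i}", tuple(positions[5 * i:5 * i + 25]), 150.0)
        for i in range(10)]
print([c.name for c in select_walk(demo)])
```

In this synthetic contig, neighbouring clones lose five shared bands per step, so each walk extends four clones to the right before falling below the 3-band cutoff.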
The rearray is evaluated for genome coverage by calculating overlap between clones picked from the map and clones in the rearray. Preliminary results using a minimum of 3 conserved bands between walk picks yielded a rearray set of 24,607 clones totaling 3.5 Gbp (1.2x genome coverage). 97% of clones designated FULL_X in the human physical map were matched by a rearray clone from the same contig at a cutoff of 10^-10. 96% of 40,000 "non-buried" clones chosen at random from the human physical map matched a same-contig rearray clone at a cutoff of 10^-7.
The evaluation of the rearray demonstrates extensive coverage of the human genome. We are currently evaluating the rearray in more detail and plan to use the human sequence data to increase the precision of the coverage calculations.
Number 261: Transposon-Mediated cDNA Sequencing
D. Smailus, J. Asano, Y. Butterfield, N. Girn, R. Guin, M. Krzywinski, S. Lee, K. MacDonald, T. Olson, P. Pandoh, P. Saeedi, U. Skalska, L. Spence, J. Stott, S. Taylor, K. Teague, G. Yang, J. Schein, S. Jones and M. Marra
We have developed an efficient, high-throughput method for accurate DNA sequencing of entire cDNA clones through our participation in the NCI-sponsored Mammalian Gene Collection. Sequencing is accomplished through the insertion of a Mu transposon into cDNAs, followed by sequencing reactions primed with Mu-specific sequencing primers. Transposon insertion reactions are not performed with individual cDNAs but rather on pools of up to 96 clones. Accurate clone insert size and DNA quantitation data are used to ensure proportional representation of each cDNA clone in the pool. This pooling strategy reduces the number of transposon insertion sequencing libraries that would otherwise be required, reducing the costs and enhancing the efficiency of the transposon library construction procedure. Sequences are assembled using Phred, Phrap, and Consed to yield the full-length cDNA sequence, with sequence editing and other sequence finishing activities performed as required to resolve sequence ambiguities. We are currently in our second year of the MGC project and have used the method to generate more than 7.5 Mb of finished sequence from 3,956 candidate full-length cDNAs. Analysis of 22,785 sequenced Mu transposon insertion events revealed a weak sequence preference for Mu insertion. However, the insertion pattern deviates only slightly from random and does not adversely affect the efficacy of our method. A detailed description of our transposon-mediated sequencing methodology and analysis of Mu transposon insertion events will be presented.