Multiple mouse reference genomes define subspecies-specific haplotypes and novel coding sequences

Anthony Doran (1), Thomas Keane (1,2), and The Mouse Genomes Project consortium
(1) Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
(2) EMBL-EBI, Wellcome Genome Campus, Hinxton, UK

The Mouse Genomes Project has completed the first draft assembled genome sequences and strain-specific gene annotation for twelve classical laboratory and four wild-derived inbred mouse strains (WSB/EiJ, CAST/EiJ, PWK/PhJ, and SPRET/EiJ). These strains include all of the founders of the Collaborative Cross and Diversity Outbred Cross. We used a hybrid approach for genome annotation, combining evidence from the mouse reference GENCODE annotation with strain-specific RNA-seq and PacBio cDNA, to identify novel strain-specific gene structures and alleles. Approximately 20,000 protein-coding genes and 45,000 transcripts are annotated per strain. Because these strains are fully inbred, few true heterozygous sites are expected, so we used heterozygous SNP density as a marker for highly polymorphic loci and identified 2,907 candidate regions containing 1,839 unique protein-coding genes. Defence and immunity was the largest represented protein class. Interestingly, these regions are significantly enriched for repeat elements (LTRs and LINEs), which are known to facilitate accelerated recombination and sequence diversity, key to population fitness and pathogen resistance. The assemblies have also been used to improve annotation of the reference genome at many loci, including 62 novel and 272 updated biotype annotations. In addition, we have manually curated the wild-derived CAST/EiJ olfactory receptor repertoire on Chr11, identifying novel receptors not present in the C57BL/6J reference. Of particular note was the discovery of a novel, previously unannotated, rodent-specific 138-exon gene on Chr11. Manual annotation extended this novel gene and showed that it corresponds to a combination of the human genes EFCAB3 and EFCAB13 on human Chr17. Comparative analysis suggests that these two human genes arose in the ancestor of all primates from a chromosomal rearrangement that split this gene in two. Homozygous null mice, generated using CRISPR, show a male-specific increase in lean mass. The genome sequences and annotation can be viewed in the UCSC and Ensembl genome browsers.
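
As an illustration of using heterozygous SNP density as a marker, the following minimal Python sketch counts heterozygous SNP calls in fixed windows along one chromosome and merges adjacent dense windows into candidate regions. It is not the project's pipeline: the function names, the 10 kb window size, and the threshold of 20 SNPs per window are all hypothetical choices made for this example, since the abstract does not state the parameters used.

```python
from collections import Counter


def dense_het_windows(het_snp_positions, window_size=10_000, min_snps=20):
    """Return (start, end, count) for windows with many heterozygous SNP calls.

    window_size and min_snps are illustrative assumptions, not values
    taken from the talk.
    """
    # Count heterozygous SNP calls per non-overlapping window (0-based positions).
    counts = Counter(pos // window_size for pos in het_snp_positions)
    flagged = []
    for index in sorted(counts):
        if counts[index] >= min_snps:
            start = index * window_size
            flagged.append((start, start + window_size, counts[index]))
    return flagged


def merge_adjacent(windows):
    """Merge flagged windows that share a boundary into candidate regions."""
    regions = []
    for start, end, count in windows:
        if regions and regions[-1][1] == start:
            prev_start, _, prev_count = regions[-1]
            regions[-1] = (prev_start, end, prev_count + count)
        else:
            regions.append((start, end, count))
    return regions
```

Given a sorted or unsorted list of heterozygous SNP positions for a single chromosome, `merge_adjacent(dense_het_windows(positions))` returns half-open intervals that could then be intersected with gene annotation. A real analysis would also need to handle multiple chromosomes and filter low-quality calls.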

This talk is part of the Computational and Systems Biology series.

© 2006-2017 Talks.cam, University of Cambridge.