BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Multiple mouse reference genomes define subspecies-specific haplo
 types and novel coding sequences - Dr Thomas Keane\, Sanger Institute
DTSTART:20180124T140000Z
DTEND:20180124T150000Z
UID:TALK93418@talks.cam.ac.uk
CONTACT:27743
DESCRIPTION:Anthony Doran (1)\, Thomas Keane (1\,2)\, and The Mouse Genome
 s Project consortium\n(1) Wellcome Trust Sanger Institute\, Wellcome Genom
 e Campus\, Hinxton\, UK\n(2) EMBL-EBI\, Wellcome Genome Campus\, Hinxt
 on\, UK\n\nThe M
 ouse Genomes Project has completed the first draft assembled genome sequen
 ces and strain-specific gene annotation for twelve classical laboratory an
 d four wild-derived inbred mouse strains (WSB/EiJ\, CAST/EiJ\, PWK/PhJ\, a
 nd SPRET/EiJ). These strains include all of the founders of the Collaborat
 ive Cross and Diversity Outbred Cross. We used a hybrid approach for genom
 e annotation\, combining evidence from the mouse reference Gencode annotat
 ion and strain-specific RNA-seq and PacBio cDNA\, to identify novel strain
 -specific gene structures and alleles. Approx. 20\,000 protein-coding gene
 s and 45\,000 transcripts are annotated per strain. As these strains are f
 ully inbred\, we used heterozygous SNP density as a marker for highly poly
 morphic loci\, and identified 2\,907 candidate regions containing 1\,839 u
 nique protein-coding genes. Defence and immunity was the largest represent
 ed protein class. Interestingly\, these regions are significantly enriched
  for repeat elements (LTRs and LINEs) which are known to facilitate accele
 rated recombination and sequence diversity\, key to population fitness and
  pathogen resistance. The assemblies have also been used to improve annota
 tion of the reference genome at many loci\, including 62 novel and 272 upd
 ated biotype annotations. In addition\, we have manually curated the wild-
 derived CAST/EiJ olfactory receptor repertoire on Chr11\, identifying nove
 l receptors not present in the C57BL/6J reference. Of particular note was 
 the discovery of a novel unannotated rodent-specific 138-exon gene on Chr1
 1. Manual annotation extended this novel gene as a combination of the huma
 n genes EFCAB3 and EFCAB13 on human Chr17. Comparative analysis suggests t
 hese genes arose in the ancestor of all primates from a chromosomal rearra
 ngement which split this novel gene in two. Homozygous null mice\, generat
 ed using CRISPR\, show a male-specific increase in lean mass. The genome s
 equences and annotation can be viewed in the UCSC and Ensembl genome brows
 ers.\n
LOCATION:MR4\, Centre for Mathematical Sciences\, Wilberforce Road\, Cambr
 idge
END:VEVENT
END:VCALENDAR
