BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:The push to pool: Testing the effects of matched and mismatched re
 ference populations in forensic voice comparison - Dr Dominic Watt (The Un
 iversity of York)
DTSTART:20181108T170000Z
DTEND:20181108T183000Z
UID:TALK113995@talks.cam.ac.uk
CONTACT:Yixin Zhang
DESCRIPTION:The use of Automatic Speaker Recognition (ASR) software system
 s in forensic speaker comparison casework is expanding internationally (Hu
 ghes et al. 2018). Its uptake is dependent upon the availability of approp
 riate reference databases\, since making valid assessments of the similari
 ty of the speech of known and unknown talkers hangs upon how typical the s
 peech samples are with respect to the relevant population. Ideally\, we wo
 uld have databases at our disposal which are closely matched to the accent
 (s) to be heard in the samples\, and also comparable in terms of factors s
 uch as recording channel and speaking style. However\, in many cases the a
 vailable reference databases are small\, dated\, fragmentary\, or composed
  of inappropriate material\, thereby compromising the quality of our ASR-b
 ased comparisons.\n\nThe extent to which the reliability and accuracy of c
 omparisons are affected by the characteristics of matched and mismatch
 ed re
 ference databases is currently the focus of investigation by a number of g
 roups (e.g. Enzinger & Morrison 2017\; van der Vloed et al. 2017\; Hughes 
 et al. 2018). It is clear that the results reported by ASR systems are sen
 sitive to the nature of the reference data used\, in terms of parameters s
 uch as speaker accent\, sample duration\, database size\, and channel char
 acteristics. From the point of view of obtaining greater statistical power
  and correspondingly higher levels of confidence in the results of ASR com
 parisons\, it is reasonable to suppose that bigger reference databases are
  superior to small ones. In pursuit of this goal\, we might wish for pract
 ical reasons to pool two or more pre-existing corpora\, rather than devoti
 ng resources to collecting new material.\n\nBut how valid is it to follow 
 this strategy? How much difference to the output of our ASR system does it
  make if the pooled corpora in question are mismatched for speaker accent?
  If the difference turns out to be negligible\, we might decide to combine
  accent-mismatched corpora as a matter of routine. Alternatively\, if exce
 ssive heterogeneity in the reference corpus degrades system performance\,
  we
  might advocate collecting bespoke corpora\, perhaps even at the level of 
 individual cases\, as has been argued for by Morrison (2018). Obtaining ca
 se-specific corpora has major time and cost implications that may render t
 he latter approach unfeasible\, however.\n\nThus far\, the consequences of
  combining apparently incompatible databases so as to maximise the size of
  the reference population have not been fully explored. In this paper I re
 port on a study assessing the extent to which the performance of a leading
  ASR software package was affected by rolling the ‘Dynamic Variability i
 n Speech’ (DyViS) database (Nolan et al. 2009) of recordings of 100 youn
 g (18-25 year old) male speakers of Standard Southern British English toge
 ther with a newly-collected corpus of recordings of speakers from three ur
 ban communities in North-East England (Newcastle\, Sunderland\, Middlesbro
 ugh) gathered for the ongoing ‘The Use and Utility of Localised Speech F
 orms in Determining Identity: Forensic and Sociophonetic Perspectives’ (
 TUULS) project. The results are encouraging in the sense that even using a
  mixed-accent reference population yields good system performance\, though
  it is acknowledged that using more forensically-realistic samples might l
 ead us to draw less optimistic conclusions.
LOCATION:GR06/07\, Faculty of English\, 9 West Rd (Sidgwick Site)
END:VEVENT
END:VCALENDAR
