BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Extracting and Querying Probabilistic Information in BayesStore - 
 Daisy Zhe Wang\, University of California\, Berkeley
DTSTART:20110412T094000Z
DTEND:20110412T104000Z
UID:TALK30665@talks.cam.ac.uk
CONTACT:Microsoft Research Cambridge Talks Admins
DESCRIPTION:In the past few years\, the number of applications that need t
 o process large-scale data has grown remarkably. The data driving these ap
 plications is often uncertain\, as is the analysis\, which often involves 
 probabilistic modeling and inference. Examples include sensor-based monito
 ring\, information extraction and online advertising. Prior to our work\, 
 probabilistic database research advocated an approach in which uncertainty
  is modeled by attaching probabilities to data items. However\, such syste
 ms do not and cannot take advantage of the wealth of Statistical Machine L
 earning (SML) research\, because they are unable to represent and reason a
 bout the pervasive probabilistic correlations in the data. \n\nIn my thesi
 s\, I proposed\, built\, and evaluated BayesStore\, a probabilistic databa
 se system that natively supports SML models and various inference algorith
 ms to perform advanced data analysis. This marriage of database and SML te
 chnologies creates a declarative and efficient probabilistic processing fr
 amework for applications dealing with large-scale uncertain data. I have e
 xplored a variety of research challenges\, including extending the databas
 e data model with probabilistic data and statistical models\, defining rel
 ational operators (e.g.\, select\, project\, join) over probabilistic data
  and models\, developing joint optimization of inference operators and the
  relational algebra\, and devising novel query execution plans. I used inf
 ormation extraction over text as the driving application. My research show
 s that using in-database SML methods to extract and query probabilistic in
 formation can significantly improve answer quality. Moreover\, it shows th
 at optimizations for query-driven SML inference lead to orders-of-magnitud
 e speed-up on large corpora.
LOCATION:Small lecture theatre\, Microsoft Research Ltd\, 7 J J Thomson Av
 enue (Off Madingley Road)\, Cambridge
END:VEVENT
END:VCALENDAR
