Non-parametric Bayesian Method and Maximum-A-Posteriori Inference in Statistical Machine Translation
- Speaker: Tsuyoshi Okita (Dublin City University)
- Date & Time: Wednesday 02 May 2012, 11:15 - 12:15
- Venue: Engineering Department, CBL Room BE-438
Abstract
Because recent, sophisticated Machine Learning algorithms handle many details implicitly, practitioners often do not need to worry much about how to deploy them in a particular situation. However, when it comes to real-life data such as that found in Statistical Machine Translation, several things are worth considering: 1) the underlying distribution may be better assumed to be a power-law distribution rather than its i.i.d. counterpart, 2) the noise may not be captured well by a simple Gaussian model (hence such a noise assumption is not often embedded in the ML algorithm), 3) the available prior knowledge may not be sufficiently used, and so forth. Which kinds of non-Gaussian noise to focus on, and which kinds of prior knowledge to target, were not evident from the beginning. (These issues would be quite difficult even with the help of domain experts, since they require knowledge of both the underlying ML algorithm and the application domain.) We discuss two algorithms in the application area of Statistical Machine Translation: a non-parametric Bayesian method (the hierarchical Pitman-Yor process and related topics) and Maximum-A-Posteriori inference. The first algorithm is related to language model smoothing, where 1) is the concern, while the second is related to word alignment, where 2) and 3) are the concerns.
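As a rough illustration of how a Pitman-Yor process relates to language model smoothing, the sketch below evaluates the standard predictive rule from the Chinese restaurant representation of a single Pitman-Yor restaurant with a uniform base distribution. It is not taken from the talk: the function name `pyp_predictive`, the toy counts, and the parameter values are all hypothetical, and a hierarchical model would replace the uniform base with the predictive probability of a shorter context.

```python
from typing import Dict


def pyp_predictive(
    word: str,
    counts: Dict[str, int],   # customers (tokens) per word; assumed toy data
    tables: Dict[str, int],   # tables per word, with 1 <= tables[w] <= counts[w]
    discount: float,          # d, with 0 <= d < 1
    strength: float,          # theta, with theta > -d
    base_prob: float,         # base-distribution probability of `word`
) -> float:
    """Predictive probability of `word` under one Pitman-Yor restaurant.

    P(w) = (c_w - d * t_w) / (theta + c)
         + (theta + d * t) / (theta + c) * G0(w)
    """
    total_c = sum(counts.values())
    total_t = sum(tables.values())
    c_w = counts.get(word, 0)
    t_w = tables.get(word, 0)
    # Discounted relative frequency of the observed word.
    seen = (c_w - discount * t_w) / (strength + total_c)
    # Mass reserved for the base distribution; it grows with the number of tables.
    backoff = (strength + discount * total_t) / (strength + total_c)
    return seen + backoff * base_prob


# Hypothetical toy data: "mat" is in the vocabulary but was never observed.
vocab = ["the", "cat", "sat", "mat"]
counts = {"the": 5, "cat": 2, "sat": 1}
tables = {"the": 2, "cat": 1, "sat": 1}
probs = {
    w: pyp_predictive(w, counts, tables, discount=0.5, strength=1.0,
                      base_prob=1.0 / len(vocab))
    for w in vocab
}
```

The discount moves probability mass from observed words to the base distribution, so the unseen word "mat" still receives non-zero probability while the values in `probs` sum to one.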
Series
This talk is part of the Machine Learning @ CUED series.