Detecting Semantic Change Using LDA in Historical Texts: a Case Study on Dutch
- 👤 Speaker: Simon Hengchen (Université libre de Bruxelles)
- 📅 Date & Time: Tuesday 10 October 2017, 13:00 - 14:00
- 📍 Venue: SR-24, English Faculty Building, 9 West Road (Sidgwick Site)
Abstract
Semantic change detection is relevant to many, including historians who want to better understand their sources, or lexicographers who wish to compile dictionaries. While the traditional way of detecting semantic change is to “read a lot” (Cavallin 2012), the availability of large diachronic corpora in digital form and computing power allow for a more automatic and efficient way to tackle this task. This talk is in two parts: first, an LDA-based method to detect semantic change in historical, dirty text will be presented, and then a case study will illustrate the approach. In our case study, we demonstrate a language-agnostic method on a corpus of badly-OCRed Belgian socialist newspapers in Dutch from the 19th and 20th centuries. This case study thus hints at the reproducibility of the method on other, less-resourced languages.
Cavallin, K. (2012). Automatic extraction of potential examples of semantic change using lexical sets. In KONVENS, pages 370–377.
Series
This talk is part of the Language Technology Lab Seminars series.