NLIP Reading Group: Unsupervised Decomposition of a Document into Authorial Components
- 👤 Speaker: Thomas Lippincott (University of Cambridge)
- 📅 Date & Time: Thursday 13 October 2011, 12:00 - 13:00
- 📍 Venue: GS15, Computer Laboratory
Abstract
Tom will be presenting the following paper:
@inproceedings{koppel2011unsupervised,
  author       = {Koppel, M. and Akiva, N. and Dershowitz, I. and Dershowitz, N.},
  title        = {Unsupervised decomposition of a document into authorial components},
  booktitle    = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1},
  year         = {2011},
  pages        = {1356--1364},
  organization = {Association for Computational Linguistics}
}
Abstract: We propose a novel unsupervised method for separating out distinct authorial components of a document. In particular, we show that, given a book artificially “munged” from two thematically similar biblical books, we can separate out the two constituent books almost perfectly. This allows us to automatically recapitulate many conclusions reached by Bible scholars over centuries of research. One of the key elements of our method is exploitation of differences in synonym choice by different authors.
Series: This talk is part of the Natural Language Processing Reading Group series.