Improving Multiclass Text Classification with Error-Correcting Output Coding and Sub-class Partitions
- Speaker: Stuart Moore
- Date & Time: Monday 17 January 2011, 12:30 - 13:30
- Venue: GS15, Computer Laboratory
Abstract
Stuart will present the following paper:
Improving Multiclass Text Classification with Error-Correcting Output Coding and Sub-class Partitions. 2010. Baoli Li and Carl Vogel.
http://www.springerlink.com/content/d546042162276641/
Error-Correcting Output Coding (ECOC) is a general framework for multiclass text classification with a set of binary classifiers. It can not only help a binary classifier solve multi-class classification problems, but also boost the performance of a multi-class classifier. When building each individual binary classifier in ECOC, multiple classes are randomly grouped into two disjoint groups: positive and negative. However, when training such a binary classifier, sub-class distribution within positive and negative classes is neglected. Utilizing this information is expected to improve a binary classifier. We thus design a simple binary classification strategy via multi-class categorization (2vM) to make use of sub-class partition information, which can lead to better performance over the traditional binary classification. The proposed binary classification strategy is then applied to enhance ECOC. Experiments on document categorization and question classification show its effectiveness.
Anyone interested in more background material might want to look at http://arxiv.org/abs/cs/9501101, the original paper introducing ECOC.
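For readers new to ECOC, the following is a minimal sketch of the coding and decoding steps the abstract describes, not the paper's implementation. Each class gets a binary codeword, each bit position corresponds to one binary classifier that splits the classes into a positive and a negative group, and a test item is assigned to the class whose codeword is closest in Hamming distance to the predicted bits. All names here (`random_code_matrix`, `decode`, `bit_via_multiclass`, `EXAMPLE_CODE`, the class labels) are illustrative assumptions. The classifiers themselves are omitted, and `bit_via_multiclass` is only one reading of the 2vM idea based on the abstract.

```python
import random


def random_code_matrix(classes, n_bits, seed=0):
    """Give each class an n_bits codeword.

    Each bit position (column) randomly splits the classes into a
    positive (1) and a negative (0) group, as in the abstract.
    Note: random columns do not guarantee distinct codewords.
    """
    rng = random.Random(seed)
    columns = []
    while len(columns) < n_bits:
        col = [rng.randint(0, 1) for _ in classes]
        # Reject degenerate splits where one group would be empty.
        if 0 < sum(col) < len(classes):
            columns.append(col)
    return {c: [col[i] for col in columns] for i, c in enumerate(classes)}


def hamming(a, b):
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bits, code):
    """Return the class whose codeword is nearest to the predicted bits."""
    return min(code, key=lambda c: hamming(bits, code[c]))


def bit_via_multiclass(predicted_class, code, j):
    """Assumed reading of 2vM: predict the original sub-class with a
    multi-class classifier, then report which group of binary problem j
    that sub-class belongs to."""
    return code[predicted_class][j]


# Hand-written 7-bit code for four hypothetical classes.
# Every pair of codewords differs in 4 positions.
EXAMPLE_CODE = {
    "sport":    [1, 1, 1, 1, 1, 1, 1],
    "politics": [0, 0, 0, 0, 1, 1, 1],
    "science":  [0, 0, 1, 1, 0, 0, 1],
    "arts":     [0, 1, 0, 1, 0, 1, 0],
}

if __name__ == "__main__":
    # The seven binary classifiers output the "science" codeword,
    # except that the last classifier makes a mistake.
    noisy = [0, 0, 1, 1, 0, 0, 0]
    print(decode(noisy, EXAMPLE_CODE))
```

Because any two codewords in `EXAMPLE_CODE` differ in four positions, a single wrong binary decision still decodes to the intended class. That tolerance is what the "error-correcting" in the name refers to.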
Series
This talk is part of the Natural Language Processing Reading Group series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- GS15, Computer Laboratory
- Guy Emerson's list
- Natural Language Processing Reading Group
Note: Ex-directory lists are not shown.