Using Ontology to Classify Members of a Protein Family

If you have a question about this talk, please contact jbom1.

In this talk, I will describe work on using ontologies to help classify the protein phosphatases in a genome. Classifying the proteins expressed by an organism is an important step in understanding that organism's molecular biology. Traditionally, this classification has been done by human experts, and expert curation is regarded as the gold standard. Human experts can recognise the properties that are sufficient to place an individual gene product into a particular protein family group. Automated approaches usually fail to meet this gold standard because this recognition stage is difficult. The need to automate classification by making human knowledge accessible in computational form is motivated by the growing number of genomes, the rapid changes in knowledge, and the central role of classification in the annotation process. We capture, as an ontology, human understanding of how to recognise members of the protein phosphatase family by their domain architecture. By describing protein instances in terms of the domains they contain, it is possible to use description logic reasoners and our ontology to assign those proteins to a protein family class.

We have tested our system by classifying the protein phosphatases of the human and Aspergillus fumigatus genomes. Our knowledge-based, automatic classification matches that of the human curators, and for these two species we have also found putative new phosphatase proteins. We have extended this method to survey three parasite genomes. We have made the classification process fast and reproducible and, where appropriate knowledge is available, the method can potentially be generalised for use with any protein family.
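
The idea of assigning a protein to a family class from the domains it contains can be illustrated with a small sketch. This is not the ontology or the reasoner used in the work described: it is plain Python set matching under a closed-world assumption, whereas description logic reasoners work under an open-world assumption and need explicit closure axioms. All class names, domain names and rules below are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FamilyClass:
    """A hypothetical family class defined by its domain architecture."""
    name: str
    required: dict                                 # domain name -> minimum count
    forbidden: set = field(default_factory=set)    # domains that must be absent

    def matches(self, domains: Counter) -> bool:
        # Counter returns 0 for domains the protein does not contain.
        has_required = all(domains[d] >= n for d, n in self.required.items())
        has_forbidden = any(domains[d] > 0 for d in self.forbidden)
        return has_required and not has_forbidden


# Illustrative rules only; these are assumptions, not the talk's ontology.
CLASSES = [
    FamilyClass("Phosphatase", {"phosphatase_catalytic": 1}),
    FamilyClass("ReceptorPhosphatase",
                {"phosphatase_catalytic": 1, "transmembrane": 1}),
    FamilyClass("TandemReceptorPhosphatase",
                {"phosphatase_catalytic": 2, "transmembrane": 1}),
    FamilyClass("CytosolicPhosphatase",
                {"phosphatase_catalytic": 1}, forbidden={"transmembrane"}),
]


def classify(domain_list, classes=CLASSES):
    """Return the names of every class whose conditions the protein meets."""
    counts = Counter(domain_list)
    return [c.name for c in classes if c.matches(counts)]


if __name__ == "__main__":
    protein = ["phosphatase_catalytic", "transmembrane", "phosphatase_catalytic"]
    print(classify(protein))
```

As with a reasoner inferring every class that subsumes an instance, the function returns all matching classes, from the most general to the most specific, rather than a single label.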

This talk is part of the Centre for Molecular Science Informatics series.
