Mining scientific diagrams for semantic information
- Speaker: Dr Peter Murray-Rust
- Date & Time: Wednesday 27 January 2016, 14:00 - 15:00
- Venue: MR4, Centre for Mathematical Sciences, Wilberforce Road, Cambridge
Abstract
Scientific data is often reported only as diagrams in publications, where it is effectively destroyed and lost. This data is critically valuable for other scientists and data abstracting services, and frequently has to be recreated manually from the diagram at great expense, with waste and error. Examples include plots, charts, and more complex objects such as chemical structure diagrams and phylogenetic (evolutionary) trees.
I shall show how, in favourable circumstances, it is possible to recreate semantic information from diagrams using well-established Computer Vision techniques. These include thresholding, binarization, dilation and thinning, OCR and a variety of domain-specific heuristics. Our Open Source library is based on BoofCV, an Open Java image-processing library, and is enhanced with tools useful for scientific documents. Some PDF documents contain vector images and are particularly tractable, while others contain only pixel images and suffer from overlap, problems of scale and loss of detail.
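To make the first steps of such a pipeline concrete, here is a minimal sketch of thresholding, binarization and dilation on a tiny greyscale raster. It is an illustration only, written in plain Python with no dependencies. It is not the speaker's library and does not use the BoofCV API. The function names (`otsu_threshold`, `binarize`, `dilate8`), the choice of Otsu's method for the threshold, and the sample image are all assumptions made for this example. Thinning, OCR and the domain-specific heuristics are omitted.

```python
def otsu_threshold(pixels):
    """Pick a grey level t by Otsu's method (maximise between-class variance).

    Pixels with value <= t are treated as the dark class.
    `pixels` is a list of rows of integers in 0..255.
    """
    hist = [0] * 256
    for row in pixels:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of grey levels in the dark class so far
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(pixels, t):
    """Map dark ink (value <= t) to 1 and light background to 0."""
    return [[1 if v <= t else 0 for v in row] for row in pixels]


def dilate8(binary):
    """Dilate by one pixel using the 8-connected neighbourhood."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x]:
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            out[ny][nx] = 1
    return out


# Hypothetical 5x5 scan: a short dark stroke on a light background.
image = [
    [250, 250, 250, 250, 250],
    [250, 250, 250, 250, 250],
    [250,  20,  30, 250, 250],
    [250, 250, 250, 250, 250],
    [250, 250, 250, 250, 250],
]

t = otsu_threshold(image)
binary = binarize(image, t)
dilated = dilate8(binary)
```

Dilation of this kind is commonly used to close small gaps in strokes before thinning reduces them to one-pixel-wide lines. On real pixel images, the overlap and loss of detail mentioned above would show up as strokes that merge or break at this stage.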
I shall show the application to chemistry and phylogenetics and show where errors and loss occur.
http://www.slideshare.net/petermurrayrust/mining-scientific-images
Series
This talk is part of the Computational and Systems Biology series.
Included in Lists
- All CMS Events
- All Talks (aka the CURE list)
- Biology
- CamBridgeSens
- Cambridge talks
- Computational and Systems Biology
- custom
- Graduate-Seminars
- Life Science Interface Seminars
- Life Sciences
- ME Seminar
- MR4, Centre for Mathematical Sciences, Wilberforce Road, Cambridge
- my_list
- other talks
- PMRFPS's
- School of Physical Sciences
- se393's list
- Trust & Technology Initiative - interesting events
- yk449