Domain Informational Vocabulary Extraction (DIVE) aims to extract entities and key informational words from domain-specific document collections to form a curated knowledge base, cross-linked with other well-known ontologies, that is useful to domain researchers and curators at large. The system implements multiple strategies for biological entity detection, including regular expression rules, ontologies, and keyword dictionaries. It also provides authorized users with a web interface where authors can make additional annotations and corrections to the extracted results. These manual updates are then used to improve entity detection in subsequently processed documents.
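
As a rough illustration of how regular expression rules, a keyword dictionary, and curator corrections might be combined, here is a minimal Python sketch. It is not DIVE's actual implementation: the `Mention` class, the `detect` function, the locus-style pattern, the example terms, and the `EX:` identifiers are all hypothetical names and placeholder data chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    text: str    # matched surface form
    start: int   # character offset where the match begins
    end: int     # character offset just past the match
    source: str  # strategy that produced the mention


# Hypothetical rule: identifiers shaped like "AT1G01010".
LOCUS_RULE = re.compile(r"\bAT[1-5CM]G\d{5}\b")

# Hypothetical keyword dictionary; the EX: identifiers are placeholders,
# not real ontology accessions.
KEYWORDS = {"leaf": "EX:0001", "root": "EX:0002"}


def detect(text, keywords=KEYWORDS, added=None, rejected=None):
    """Return mentions found by the regex rule and the keyword dictionary.

    `added` and `rejected` stand in for manual curator corrections:
    extra terms to look for, and surface forms to suppress.
    """
    vocab = {**keywords, **(added or {})}
    blocked = {term.lower() for term in (rejected or ())}

    mentions = [
        Mention(m.group(), m.start(), m.end(), "regex")
        for m in LOCUS_RULE.finditer(text)
    ]
    for term in vocab:
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        mentions.extend(
            Mention(m.group(), m.start(), m.end(), "dictionary")
            for m in pattern.finditer(text)
        )

    kept = [m for m in mentions if m.text.lower() not in blocked]
    return sorted(kept, key=lambda m: m.start)


if __name__ == "__main__":
    for mention in detect("AT1G01010 is expressed in the leaf and root."):
        print(mention)
```

In this sketch, corrections are passed in as plain arguments; a real system would presumably persist them from the web interface and apply them when later documents are processed.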

Publications and Presentations


Enhancing Information Accessibility of Publications with Text Mining and Ontology

Weijia Xu, Amit Gupta, Pankaj Jaiswal, Crispin Taylor, Patti Lockhart


International Conference on Biomedical Ontology and BioCreative (ICBO BioCreative 2016)



A Web Application for Extracting Key Domain Information for Scientific Publications using Ontology

Weijia Xu, Amit Gupta, Pankaj Jaiswal, Crispin Taylor, Patti Lockhart


International Conference on Biomedical Ontology and BioCreative (ICBO BioCreative 2016)