Domain Informational Vocabulary Extraction (DIVE) aims to extract entities and key informational words from domain-specific document collections to form a
curated knowledge base, cross-linked with other well-known ontologies, that is useful to domain researchers and curators at large. The system implements multiple
strategies for biological entity detection, including regular expression rules, ontologies, and keyword dictionaries. It also provides a web interface where
authorized users, such as document authors, can make additional annotations and corrections to the extracted results. These manual updates are then used to improve entity
detection in subsequently processed documents.
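
To make the combination of detection strategies concrete, the following is a minimal sketch in Python, not the DIVE implementation. It assumes a single illustrative regular expression rule (a gene locus identifier pattern), a toy ontology whose identifiers are placeholders, and a small keyword set. All names here (Entity, GENE_ID_PATTERN, ONTOLOGY_TERMS, detect_entities, apply_correction) are hypothetical and chosen only for illustration. The apply_correction helper shows one simple way a manual annotation could feed back into later detection, under the assumption that corrections are stored as additional dictionary keywords.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str    # matched surface form
    start: int   # character offset where the match begins
    end: int     # character offset just past the match
    source: str  # which strategy produced the match


# Assumption: one illustrative rule for locus-style gene identifiers.
GENE_ID_PATTERN = re.compile(r"\bAT[1-5CM]G\d{5}\b")

# Assumption: toy ontology; the identifiers are placeholders, not real terms.
ONTOLOGY_TERMS = {"root hair": "EX:0001", "leaf": "EX:0002"}


def find_terms(text, terms, source):
    """Find whole-word, case-insensitive occurrences of each term."""
    hits = []
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            hits.append(Entity(m.group(0), m.start(), m.end(), source))
    return hits


def detect_entities(text, keywords):
    """Run the three strategies and merge their hits in document order."""
    hits = [
        Entity(m.group(0), m.start(), m.end(), "regex")
        for m in GENE_ID_PATTERN.finditer(text)
    ]
    hits += find_terms(text, ONTOLOGY_TERMS, "ontology")
    hits += find_terms(text, keywords, "keyword")
    return sorted(set(hits), key=lambda e: (e.start, e.end))


def apply_correction(keywords, term):
    """Return a new keyword set that includes a manually annotated term."""
    return keywords | {term.lower()}


if __name__ == "__main__":
    keywords = {"drought"}
    doc = "AT1G01010 is expressed in root hair under drought stress."
    for entity in detect_entities(doc, keywords):
        print(entity)

    # A curator marks "salt stress" as an entity; later documents benefit.
    keywords = apply_correction(keywords, "salt stress")
    print(detect_entities("Salt stress reduced growth.", keywords))
```

In this sketch each strategy is independent, so a rule, ontology term, or keyword can be added without touching the others, and the source field records which strategy produced each match so that a curator can judge it.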