Natural Language Processing Type and Annotation Browser

NLP-TAB is a web-based system that lets researchers and developers of Natural Language Processing (NLP) systems compare the output of several disparate NLP systems to each other or to a manually created reference standard. The comparison is performed by running the NLP systems on a single corpus of text and then statistically analyzing the co-occurrences between the annotations the systems generate. Analysis results are stored and indexed using Elasticsearch and displayed to end users through a custom web-based interface.

System Description

Documents

The Documents section lets you explore the documents that were run through each of the analyzed systems. You can filter to find specific text in documents. Below are a few examples on our demo server:

Type System Analysis

Type system analysis compares the annotation types generated by the different NLP systems. It first counts how often pairs of annotations from different NLP systems cover approximately the same text and how often they cover completely different text. This co-occurrence information is used to generate 2x2 tables for all pairs of annotation types, from which the degree of dependence between annotation types is calculated using common metrics, which at present include the F-score and the Jaccard and Matthews coefficients. Pairs of annotation types with higher scores are more likely to be functionally equivalent.
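As a rough illustration of the metrics named above, the following Python sketch computes the F-score, Jaccard coefficient, and Matthews coefficient from a single 2x2 table. The function name, the cell names (both, only_a, only_b, neither), and the example counts are assumptions made for illustration; they are not taken from the NLP-TAB source code, and the plugin's actual implementation may differ.

```python
import math


def association_scores(both, only_a, only_b, neither):
    """Score one pair of annotation types from a 2x2 co-occurrence table.

    Hypothetical cell names (not from NLP-TAB itself):
    both    -- text spans covered by type A and by type B
    only_a  -- text spans covered by type A with no matching type B
    only_b  -- text spans covered by type B with no matching type A
    neither -- text spans covered by neither type
    """
    # F-score (F1): harmonic mean of precision and recall.
    # It is symmetric in only_a and only_b.
    f_denom = 2 * both + only_a + only_b
    f1 = 2 * both / f_denom if f_denom else 0.0

    # Jaccard coefficient: intersection over union of the two types.
    j_denom = both + only_a + only_b
    jaccard = both / j_denom if j_denom else 0.0

    # Matthews correlation coefficient: uses all four cells of the table.
    m_denom = math.sqrt(
        (both + only_a) * (both + only_b) * (neither + only_a) * (neither + only_b)
    )
    matthews = (both * neither - only_a * only_b) / m_denom if m_denom else 0.0

    return {"f1": f1, "jaccard": jaccard, "matthews": matthews}


# Made-up counts for one pair of annotation types.
scores = association_scores(both=90, only_a=10, only_b=5, neither=895)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this sketch a degenerate table (for example, one with no co-occurrences at all) scores 0.0 instead of raising a division error, which is one possible convention among several.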

Type Systems

The Type Systems screen lets users explore the type systems that have been uploaded to NLP-TAB and browse the individual types in each system. Information included on the type systems page:

Elasticsearch backend

NLP-TAB uses an Elasticsearch backend to store the Common Annotation Structure (CAS) information produced by each NLP system being compared, for each document in the collection. A read-only API to the backend is accessible at athena.ahc.umn.edu/elasticsearch. For more information on Elasticsearch, visit its website at elasticsearch.org.
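A query against the read-only API might be assembled along the following lines. This is only a sketch: it uses Elasticsearch's standard URI search (the _search endpoint with a q parameter), but the http scheme, the index name "example-index", and the field name "text" are placeholders assumed for illustration, since the actual index and field names used by NLP-TAB are not documented here. The code builds the URL without sending it.

```python
from urllib.parse import urlencode

# Host and path come from the text above; the "http" scheme is an assumption.
BASE_URL = "http://athena.ahc.umn.edu/elasticsearch"


def build_search_url(index, field, text, size=5):
    """Build an Elasticsearch URI-search URL (GET <index>/_search?q=...).

    index and field are hypothetical placeholders; substitute the names
    actually present on the server being queried.
    """
    params = urlencode({"q": f"{field}:{text}", "size": size})
    return f"{BASE_URL}/{index}/_search?{params}"


url = build_search_url("example-index", "text", "aspirin")
print(url)
```

The resulting URL can be opened with any HTTP client, for example urllib.request.urlopen(url), and the response parsed as JSON.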

Prerequisites

  1. An Elasticsearch server running version 2.1.0
  2. JDK 1.8
  3. Maven

Building

To build the NLP-TAB Elasticsearch plugin, run the following command in the NLP-TAB project directory:

mvn clean package

This builds an Elasticsearch plugin zip file, nlptab-{version}.zip, in target/releases. To install the plugin into your Elasticsearch server, run the following from the Elasticsearch installation directory:

bin/plugin install file:/path-to/target/releases/nlptab-{version}.zip

About Us

NLP-TAB is developed by the University of Minnesota Institute for Health Informatics NLP/IE Group and the Open Health NLP Consortium.

Other Resources

BioMedICUS

NLP-TAB

NLP/IE Group Resources

Acknowledgements

Funding for this work was provided by: