An SBIR Phase II contract was awarded to Language Computer Corporation in August 2018 for $1,499,903 USD from the U.S. Department of Defense and DARPA.
A significant challenge in extracting event attributes from unstructured text is that events possess a wide array of attributes that affect decision making. In a single document, an author may describe events that have happened (realis), events that did not happen or could have happened (irrealis), and events that may happen in the future. An author may describe events that reflect a generic class of occurrences, or a specific event at a known place and time. An author's use of event attributes can also signal group dynamics, motivations, and relationships. In this DARPA Phase II SBIR effort, we seek to develop a novel system for understanding, extracting, and conveying events and their relevant attribute information to analysts and software tools, to facilitate their complete understanding of the events within documents of interest. We propose to build on the prototype we developed previously under this SBIR (for AFRL), with the goal of extending the state of the art in both the quality of the event attributes extracted and the types of attributes that can be extracted. Our goal is not just to extract these attributes, but to understand how the attributes interact and how this information can best be searched and conveyed to analysts. The attributes extracted include the genericity (specific/generic), the realis/irrealis status, and the factuality of the events. We will utilize a variety of models, including Long Short-Term Memory (LSTM) networks, self-attention networks, and probabilistic inference. In the proposed option, we will utilize linguistic signals to estimate when future events might occur, and to analyze group dynamics based on author use of event attributes.
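
As an illustration only, the three attributes named above (genericity, realis/irrealis, and factuality) could be carried on a single event record that analysts or software tools can search. The sketch below is a minimal assumption-laden example, not the system described in this abstract: the names `EventRecord`, `Genericity`, `Realis`, and `filter_events`, the numeric factuality score in [0, 1], and the sample events are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Optional


class Genericity(Enum):
    SPECIFIC = "specific"  # a particular event at a known place and time
    GENERIC = "generic"    # a generic class of occurrences


class Realis(Enum):
    REALIS = "realis"      # the event happened
    IRREALIS = "irrealis"  # the event did not happen or could have happened


@dataclass(frozen=True)
class EventRecord:
    """Hypothetical container for one extracted event and its attributes."""
    trigger: str           # text span that evokes the event
    genericity: Genericity
    realis: Realis
    factuality: float      # assumed score in [0, 1]; the scale is illustrative


def filter_events(
    events: Iterable[EventRecord],
    genericity: Optional[Genericity] = None,
    realis: Optional[Realis] = None,
    min_factuality: float = 0.0,
) -> List[EventRecord]:
    """Return events matching every attribute constraint that was supplied."""
    return [
        e for e in events
        if (genericity is None or e.genericity is genericity)
        and (realis is None or e.realis is realis)
        and e.factuality >= min_factuality
    ]


# Invented sample data, for illustration only.
events = [
    EventRecord("attacked", Genericity.SPECIFIC, Realis.REALIS, 0.9),
    EventRecord("protests", Genericity.GENERIC, Realis.REALIS, 0.6),
    EventRecord("would have met", Genericity.SPECIFIC, Realis.IRREALIS, 0.1),
]

# Specific events that the author presents as having happened.
confirmed = filter_events(
    events,
    genericity=Genericity.SPECIFIC,
    realis=Realis.REALIS,
    min_factuality=0.5,
)
```

Keeping each attribute as a separate field lets a search combine them freely, which is one simple way to examine how the attributes interact.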