Patent 10572595 was granted by the United States Patent and Trademark Office in February 2020 and assigned to Baidu USA.
Presented herein are systems and methods for question answering (QA). In embodiments, extractive QA is cast as an iterative search problem through a document's structure: select the sentence containing the answer, then the answer's start word, then its end word. This representation reduces the search space at each step and allows computation to be conditionally allocated to promising search paths. In embodiments, globally normalizing the decision process and back-propagating through beam search make this representation viable and learning efficient. Various model embodiments, referred to as Globally Normalized Readers (GNR), achieve excellent performance. Also introduced are embodiments of data augmentation that produce semantically valid examples by aligning named entities to a knowledge base and swapping in new entities of the same type. This methodology also improved the performance of GNR models and is of independent interest for a variety of natural language processing (NLP) tasks.
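The three-step search described above (sentence, then start word, then end word) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `beam_search_answer` and the scoring callables `score_sentence`, `score_start`, and `score_end` are hypothetical stand-ins for learned models, and the partition function is approximated by summing only over the paths that survive on the beam. Scores are added along each path and normalized once at the end, rather than after every step.

```python
import math
from typing import Callable, List, Tuple

Document = List[List[str]]  # a document as sentences, each a list of words


def beam_search_answer(
    doc: Document,
    score_sentence: Callable[[Document, int], float],
    score_start: Callable[[Document, int, int], float],
    score_end: Callable[[Document, int, int, int], float],
    beam_size: int = 2,
) -> List[Tuple[Tuple[int, int, int], float]]:
    """Return (sentence, start, end) spans with beam-normalized probabilities."""
    # Step 1: score every sentence, keep the best `beam_size`.
    sent_beam = sorted(
        ((score_sentence(doc, i), i) for i in range(len(doc))),
        reverse=True,
    )[:beam_size]

    # Step 2: expand only surviving sentences with candidate start words.
    start_beam = sorted(
        (
            (s + score_start(doc, i, j), i, j)
            for s, i in sent_beam
            for j in range(len(doc[i]))
        ),
        reverse=True,
    )[:beam_size]

    # Step 3: expand surviving (sentence, start) pairs with end words >= start.
    end_beam = sorted(
        (
            (s + score_end(doc, i, j, k), i, j, k)
            for s, i, j in start_beam
            for k in range(j, len(doc[i]))
        ),
        reverse=True,
    )[:beam_size]

    # Global normalization (approximate): one softmax over the summed,
    # unnormalized path scores left on the beam, not a softmax per step.
    top = max(s for s, _, _, _ in end_beam)
    z = sum(math.exp(s - top) for s, _, _, _ in end_beam)
    return [((i, j, k), math.exp(s - top) / z) for s, i, j, k in end_beam]
```

Because each step expands only the candidates kept from the previous step, the number of scored items grows with the beam size rather than with the number of all possible spans in the document, which is the sense in which computation is allocated to promising paths.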