Patent 10482174 was granted by the United States Patent and Trademark Office and assigned to Capital One in November 2019.
The present disclosure relates to systems and methods for generating synthetic documents. In one implementation, a system for generating synthetic data from a plurality of documents may include at least one processor and at least one non-transitory memory storing instructions that, when executed by the at least one processor, cause the system to: receive a plurality of documents, individual documents of the plurality of documents having a same document type; generate a distribution of values for a corresponding pixel in the individual documents of the plurality of documents; determine, based on the distributions, one or more common features of the plurality of documents; determine, based on the comparison, one or more input fields; generate a template including the one or more common features and the one or more input fields; and input synthetic data into the one or more input fields of the template, thereby generating a plurality of synthetic documents.
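The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes documents are equally sized grids of grayscale pixel values, treats a pixel as a common feature when a single value dominates its distribution, and treats every other pixel as part of an input field. The function names, the `agreement` threshold, and the random pixel values standing in for synthetic data are all hypothetical.

```python
from collections import Counter
import random


def pixel_distributions(docs):
    """Return a grid of Counters: the distribution of values seen at each pixel."""
    rows, cols = len(docs[0]), len(docs[0][0])
    return [
        [Counter(doc[r][c] for doc in docs) for c in range(cols)]
        for r in range(rows)
    ]


def build_template(docs, agreement=0.9):
    """Keep the most common value where documents agree (common feature);
    mark every other pixel as None (assumed to belong to an input field)."""
    n = len(docs)
    template = []
    for row in pixel_distributions(docs):
        out = []
        for dist in row:
            value, count = dist.most_common(1)[0]
            out.append(value if count / n >= agreement else None)
        template.append(out)
    return template


def fill_template(template, rng):
    """Fill input-field pixels with placeholder synthetic values (black or white)."""
    return [
        [rng.choice((0, 255)) if px is None else px for px in row]
        for row in template
    ]


# Three tiny 2x3 "documents" of the same type; the last column varies.
docs = [
    [[0, 0, 255], [0, 255, 255]],
    [[0, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [0, 255, 0]],
]
template = build_template(docs)
synthetic = [fill_template(template, random.Random(seed)) for seed in range(3)]
```

Here the first two columns are identical across all documents and survive into the template, while the third column varies and is left open for synthetic values. A realistic pipeline would group adjacent input pixels into fields and render generated text into them rather than sampling individual pixels.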