Patent 10545978 was granted and assigned to Trifacta on January, 2020 by the United States Patent and Trademark Office.
A data preprocessing system builds transformation scripts for preprocessing datasets for processing by a data analysis system. The data preprocessing system presents various representations of data of a dataset including visual representations, textual representations, or structural representations. The data preprocessing system receives selections of attributes or values based on these representations. The data preprocessing system generates recommendations of transformations based on the attributes or values selected. The data preprocessing system builds a transformation script based on the recommendations of the transformations. The transformation script can be used for preprocessing the dataset for analysis by a data analysis system.