Patent 10938838 was granted and assigned to Sophos Group PLC on March, 2021 by the United States Patent and Trademark Office.
An automated system attempts to characterize code as safe or unsafe. For intermediate code samples not placed with sufficient confidence in either category, human-readable analysis is automatically generated to assist a human reviewer in reaching a final disposition. For example, a random forest over human-interpretable features may be created and used to identify suspicious features in a manner that is understandable to, and actionable by, a human reviewer. Similarly, a k-nearest neighbor algorithm may be used to identify similar samples of known safe and unsafe code based on a model for, e.g., a file path, a URL, an executable, and so forth. Similar code may then be displayed (with other information) to a user for evaluation in a user interface. This comparative information can improve the speed and accuracy of human interventions by providing richer context for human review of potential threats.