Patent attributes
A computer system generates a hierarchical evolutionary tree of digests of sample files. The digests are generated using a locality sensitive hashing function. The digests are grouped into several clusters, and the clusters are grouped into several nodes. The nodes are connected in hierarchical order to generate the hierarchical evolutionary tree. A digest of a file being evaluated for malware is generated using the locality sensitive hashing function. The digest is put in a cluster of the hierarchical evolutionary tree having digests that are most similar to the digest relative to digests of other clusters of the hierarchical evolutionary tree. The digest is identified to be of the same malware family as the digests of the cluster.