Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing automatic builds of snapshots using software packages with highly similar contents. One of the methods includes computing, by a source code analysis system, a respective similarity score between contents of a particular snapshot and contents of each software package of a plurality of software packages in one or more package repositories. A highest-scoring software package for the snapshot is determined using the computed similarity scores. An automatic build of the snapshot using the highest-scoring software package is performed, including identifying one or more dependencies and one or more build commands from the highest-scoring software package, installing the one or more dependencies in a build environment of the snapshot, and executing the one or more build commands in the build environment of the snapshot.
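The selection and build steps described above can be illustrated with a minimal sketch. It is not the claimed implementation. The abstract does not specify how the similarity score is computed, so the sketch assumes a Jaccard similarity over sets of file names. The `Package` record, the function names `similarity`, `highest_scoring`, and `auto_build`, and the `install` and `run` callbacks standing in for the build environment are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Hypothetical record for one software package in a package repository."""
    name: str
    files: frozenset            # contents, reduced here to a set of file names
    dependencies: tuple = ()    # dependencies the package declares
    build_commands: tuple = ()  # commands the package uses to build itself


def similarity(snapshot_files, package_files) -> float:
    """Jaccard similarity of two file-name sets (an assumed metric)."""
    a, b = set(snapshot_files), set(package_files)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def highest_scoring(snapshot_files, packages):
    """Score every package against the snapshot and return (package, score).

    Returns (None, 0.0) when there are no packages. Ties keep the first
    package encountered.
    """
    best, best_score = None, 0.0
    for pkg in packages:
        score = similarity(snapshot_files, pkg.files)
        if best is None or score > best_score:
            best, best_score = pkg, score
    return best, best_score


def auto_build(snapshot_files, packages, install, run):
    """Build the snapshot using the highest-scoring package.

    `install` and `run` are caller-supplied callbacks (assumptions of this
    sketch) that act on the snapshot's build environment.
    """
    best, _ = highest_scoring(snapshot_files, packages)
    if best is None:
        return None
    for dep in best.dependencies:      # install identified dependencies
        install(dep)
    for cmd in best.build_commands:    # then execute identified build commands
        run(cmd)
    return best


# Usage: record the actions instead of touching a real environment.
packages = [
    Package("libfoo", frozenset({"foo.c", "foo.h", "Makefile"}),
            ("gcc", "make"), ("make all",)),
    Package("libbar", frozenset({"bar.c", "bar.h"}),
            ("clang",), ("clang bar.c",)),
]
snapshot = {"foo.c", "foo.h", "Makefile", "README"}
log = []
chosen = auto_build(
    snapshot,
    packages,
    install=lambda dep: log.append(("install", dep)),
    run=lambda cmd: log.append(("run", cmd)),
)
```

Passing `install` and `run` as callbacks keeps the selection logic separate from the build environment. A real system might back them with a package manager and a process launcher, but those details are outside what the abstract states.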