Patent attributes
Methods, systems, and apparatus for identifying labels for image collections are presented. In one aspect, a method includes obtaining a collection of images; obtaining, for each image in the collection of images, image similarity data that indicates a measure of similarity of the image to other images in the collection of images; generating, based on the similarity data, two or more image clusters from the collection of images, each image cluster including one or more images from the collection of images; for each image cluster: obtaining, for each image in the image cluster, a set of image labels; generating, from each set of image labels obtained for each image in the image cluster, a set of cluster labels; selecting one or more cluster labels from the set of cluster labels; and identifying the selected cluster labels as a set of collection labels for the collection of images.
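The claimed steps can be illustrated with a minimal Python sketch. It is not the patented implementation, and the abstract does not specify how clusters are formed or how cluster labels are selected. The sketch assumes that clusters are the connected components of pairs whose similarity meets a threshold, and that the selected cluster labels are the labels shared by the most images in a cluster. All function names, the `threshold` and `per_cluster` parameters, and the sample data are hypothetical.

```python
from collections import Counter


def cluster_images(image_ids, similarity, threshold=0.5):
    """Group images into clusters.

    Assumed rule (not from the abstract): two images belong to the same
    cluster if they are linked by a chain of pairs with similarity >= threshold.
    """
    parent = {img: img for img in image_ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for img in image_ids:
        clusters.setdefault(find(img), []).append(img)
    return list(clusters.values())


def cluster_label_counts(cluster, image_labels):
    """Merge the per-image label sets of one cluster into label counts."""
    counts = Counter()
    for img in cluster:
        counts.update(set(image_labels.get(img, ())))
    return counts


def collection_labels(image_ids, similarity, image_labels,
                      threshold=0.5, per_cluster=1):
    """Select the top labels of each cluster as labels for the collection."""
    selected = []
    for cluster in cluster_images(image_ids, similarity, threshold):
        counts = cluster_label_counts(cluster, image_labels)
        # Assumed selection rule: most frequent first, ties broken alphabetically.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        for label, _ in ranked[:per_cluster]:
            if label not in selected:
                selected.append(label)
    return selected


# Hypothetical sample data: pairwise similarity scores and per-image labels.
images = ["a", "b", "c", "d"]
similarity = {("a", "b"): 0.9, ("c", "d"): 0.8, ("a", "c"): 0.1}
labels = {
    "a": ["beach", "sunset"],
    "b": ["beach", "sea"],
    "c": ["dog", "park"],
    "d": ["dog"],
}

print(collection_labels(images, similarity, labels))  # ['beach', 'dog']
```

With this sample data, images a and b form one cluster and images c and d form another, so the collection labels are the most common label of each cluster. A real system would likely derive similarity from visual features and might weight labels by confidence rather than by simple counts.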