Images of objects are retrieved from a large-scale database of object images based on a query image. The database may, for example, contain images of objects such as faces, vehicles, people and luggage. Semantic attributes, such as doors or windows in the case of vehicles, are used as high-level semantic cues to determine the identities of objects in the images. Salient visual characteristics of the images are labeled with attribute information, and a transformation is learned that maps the labeled visual characteristics into a discrimination vector that discriminates between the labels. A similarity metric is learned using the discrimination vectors, such that different images depicting the same object are determined to be close while images depicting different objects are determined to be far apart. Candidates are retrieved based on a query image, and a re-ranking step may be applied to improve the results. Validation experiments are described.
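
By way of illustration only, the stages summarized above (attribute-labeled features, a learned transformation into discrimination vectors, a learned similarity metric, and candidate retrieval) might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the transformation is assumed to be a ridge regression onto one-hot attribute labels, the metric is assumed to be a Mahalanobis form built from the within-identity covariance, and each image is assumed to contribute a single feature vector. The function names `learn_attribute_transform`, `learn_metric` and `retrieve`, and the parameters `ridge`, `eps` and `top_k`, are hypothetical. The optional re-ranking step is omitted because the text does not specify it.

```python
import numpy as np


def learn_attribute_transform(features, attr_labels, n_attrs, ridge=1e-3):
    """Assumed stand-in for the learned transformation: ridge regression
    from visual features (n, d) onto one-hot attribute labels (n, n_attrs)."""
    targets = np.eye(n_attrs)[attr_labels]
    d = features.shape[1]
    gram = features.T @ features + ridge * np.eye(d)
    # Returned matrix has shape (d, n_attrs); features @ W gives the
    # discrimination vectors.
    return np.linalg.solve(gram, features.T @ targets)


def learn_metric(vectors, identities, eps=1e-3):
    """Assumed stand-in for the learned similarity metric: the inverse of the
    regularized within-identity covariance, used as a Mahalanobis matrix so
    that images of the same object end up close together."""
    centered = np.empty_like(vectors)
    for ident in np.unique(identities):
        mask = identities == ident
        centered[mask] = vectors[mask] - vectors[mask].mean(axis=0)
    within = centered.T @ centered / len(vectors)
    return np.linalg.inv(within + eps * np.eye(vectors.shape[1]))


def retrieve(query_vec, db_vecs, metric, top_k=5):
    """Return indices and squared distances of the top_k closest database
    entries to the query under the learned metric."""
    diff = db_vecs - query_vec
    dists = np.einsum("ij,jk,ik->i", diff, metric, diff)
    order = np.argsort(dists)[:top_k]
    return order, dists[order]
```

In such a sketch, a query image would be passed through the same feature extraction and the same transformation as the database images before `retrieve` is called, so that the query and the candidates are compared in the same discrimination-vector space.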