Patent attributes
A distributed computing system has a plurality of computers, each having a respective image defined by a pairing of an operating system and a respective collection of associated software packages. Original data associated with the computers includes, for each computer, an operating system identification and an identification of a collection of software packages, the original data being stored as a plurality of records for each computer. An apparatus for analysis of the images and for image distribution planning includes image identification logic configured to compress the original data into a respective single record for each computer, providing for efficient and scalable processing. The image identification logic is further configured to identify the number of distinct images associated with the computers. The apparatus further includes image reducing logic configured to reduce the number of distinct images through manual and automatic retargeting and deprovisioning.
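
The compression and distinct-image counting described above can be illustrated with a minimal sketch. It is not the claimed implementation: the row layout (one row per computer and package), the function names `compress_records` and `distinct_images`, and the sample data are all assumptions made for illustration. An image is modeled here as an operating system paired with an unordered set of packages, so two computers with the same packages listed in a different order count as the same image.

```python
from collections import defaultdict


def compress_records(rows):
    """Collapse many (computer, os, package) rows into one record per computer.

    Hypothetical layout: each input row names one package on one computer.
    The result maps computer -> (os, frozenset of packages).
    """
    os_by_computer = {}
    packages_by_computer = defaultdict(set)
    for computer, os_name, package in rows:
        os_by_computer[computer] = os_name
        packages_by_computer[computer].add(package)
    return {
        computer: (os_name, frozenset(packages_by_computer[computer]))
        for computer, os_name in os_by_computer.items()
    }


def distinct_images(records):
    """Group computers by identical image (os plus package set)."""
    groups = defaultdict(list)
    for computer, image in records.items():
        groups[image].append(computer)
    return dict(groups)


# Illustrative data only: five raw rows describing three computers.
rows = [
    ("pc-01", "Linux", "editor"),
    ("pc-01", "Linux", "browser"),
    ("pc-02", "Linux", "browser"),
    ("pc-02", "Linux", "editor"),
    ("pc-03", "Windows", "editor"),
]

records = compress_records(rows)   # one record per computer
images = distinct_images(records)  # one entry per distinct image
print(len(records), len(images))
```

Because each compressed record is hashable, counting distinct images reduces to a single pass that groups computers by record. The resulting groups are also a natural starting point for retargeting decisions, for example spotting an image used by only one computer.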