Patent attributes
For optimizing a partition of a data block into matching and non-matching segments in data deduplication using a processor device in a computing environment, a sequence of matching segments is split into sub-parts for obtaining a globally optimal subset, and an optimal calculation is applied to each of the sub-parts. The solutions of the optimal calculations are combined over the entire range of the sequence, and the globally optimal subset is built by means of a first two-dimensional table represented by a matrix C[i,j]. A representation of the globally optimal subset is stored in a second two-dimensional table represented by a matrix PS[i,j] that holds, at entry [i,j], the globally optimal subset for a plurality of parameters in the form of a bit-string of length j−i+1, wherein i and j are indices of bit positions corresponding to segments.
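The two-table scheme can be illustrated with a minimal Python sketch. The abstract does not state the cost function, so the one used here is an assumption made purely for illustration: a matching segment kept as a reference costs a fixed `pointer_cost`, and each run of consecutive segments stored literally costs its total length plus a fixed `run_overhead`. The function name `optimal_partition` and all parameter names are hypothetical. Only the roles of C[i,j] (cost of the best solution for segments i..j) and PS[i,j] (that solution as a bit-string of length j−i+1) follow the text.

```python
def optimal_partition(lengths, is_match, pointer_cost, run_overhead):
    """Interval DP over segments 0..n-1 (assumed cost model, see lead-in).

    C[i][j]  : minimal cost of encoding segments i..j
    PS[i][j] : bit-string of length j-i+1 achieving C[i][j];
               '1' = segment kept as a reference, '0' = stored literally
    """
    n = len(lengths)
    # prefix[k] = total length of segments 0..k-1, for O(1) range sums
    prefix = [0]
    for x in lengths:
        prefix.append(prefix[-1] + x)

    C = [[0] * n for _ in range(n)]
    PS = [[""] * n for _ in range(n)]

    for span in range(1, n + 1):              # sub-parts of growing size
        for i in range(0, n - span + 1):
            j = i + span - 1
            # Option 1: store the whole sub-part as one literal run.
            best = run_overhead + prefix[j + 1] - prefix[i]
            bits = "0" * span
            # Option 2: a single matching segment kept as a reference.
            if span == 1 and is_match[i] and pointer_cost < best:
                best, bits = pointer_cost, "1"
            # Option 3: combine the solutions of two adjacent sub-parts.
            for k in range(i, j):
                cand = C[i][k] + C[k + 1][j]
                if cand < best:
                    best, bits = cand, PS[i][k] + PS[k + 1][j]
            C[i][j], PS[i][j] = best, bits
    return C, PS


# Example: a short match (length 6) sits between two short non-matching
# segments; dropping it lets the three merge into one literal run.
lengths = [50, 4, 6, 4, 50]
is_match = [True, False, True, False, True]
C, PS = optimal_partition(lengths, is_match, pointer_cost=8, run_overhead=4)
print(C[0][4], PS[0][4])  # prints: 34 10001
```

In this example, keeping the middle match as a reference would split the literal data into two runs and cost 40 in total, whereas treating it as non-matching yields a single run and a total of 34. Under a different cost model the recurrence would have to change accordingly; the table layout is the part that mirrors the abstract.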