Embodiments for identifying data convergence are presented. First and second sets of data, each comprising heterogeneous data, are processed in accordance with a data clustering algorithm to obtain a plurality of primary and secondary data clusters, respectively, where each data cluster comprises homogeneous data. The primary and secondary data clusters are analyzed with respect to time to identify convergence of the data of the first and second sets of data to first and second topics, respectively. The first and second topics are compared to determine a pattern of data convergence for the first and second data sets.
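
The flow summarized above (cluster each data set, analyze the clusters over time, compare the resulting topics) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the embodiments: items are modeled as (timestamp, label) pairs, grouping by label stands in for the data clustering algorithm, and "convergence" is taken to mean that one cluster's share of the most recent time window reaches a threshold and exceeds its share of the earliest window. The function names, the window size, and the threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def cluster_by_label(items):
    """Toy stand-in for a clustering algorithm (assumption): group the
    timestamps of items that share a label, so each cluster is homogeneous."""
    clusters = defaultdict(list)
    for timestamp, label in items:
        clusters[label].append(timestamp)
    return dict(clusters)


def converged_topic(clusters, window, threshold=0.6):
    """Return the label the data converges to, or None.

    Assumed criterion: the dominant label in the latest time window holds at
    least `threshold` of that window's items and a larger share than it held
    in the earliest window.
    """
    counts = defaultdict(Counter)  # window index -> label -> item count
    for label, timestamps in clusters.items():
        for t in timestamps:
            counts[t // window][label] += 1
    if not counts:
        return None
    first, last = min(counts), max(counts)
    label, count = counts[last].most_common(1)[0]
    last_share = count / sum(counts[last].values())
    first_share = counts[first][label] / sum(counts[first].values())
    if last_share >= threshold and last_share > first_share:
        return label
    return None


def compare_convergence(first_items, second_items, window=10):
    """Cluster both data sets, find each set's topic, and name the pattern."""
    topic_a = converged_topic(cluster_by_label(first_items), window)
    topic_b = converged_topic(cluster_by_label(second_items), window)
    if topic_a is None or topic_b is None:
        pattern = "no convergence"
    elif topic_a == topic_b:
        pattern = "common topic"
    else:
        pattern = "different topics"
    return topic_a, topic_b, pattern
```

In this sketch the pattern of data convergence is reduced to three coarse outcomes; a fuller treatment could compare topics by similarity rather than by exact equality.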