Aspects herein describe methods and systems for receiving, by one or more cameras, images that comprise facial images of individuals. Aspects of the disclosure describe extracting the facial images from the received images, sorting the extracted facial images into separate groups, wherein each group corresponds to the facial images of one individual, and selecting, for each individual, a preferred facial image from the corresponding group. The selected preferred facial images are transmitted to a client for display. Aspects of the disclosure also describe selecting either a facial recognition algorithm or an audio triangulation algorithm to determine which individual is speaking, wherein the selection is based on whether lip movement of one or more of the individuals is visible in the images received from the cameras.
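
The steps summarized above (grouping by individual, selecting a preferred image, and choosing a speaker-detection algorithm) can be illustrated with a minimal Python sketch. It is not the disclosed implementation. The `Face` record, its `person_id`, `quality`, and `lips_visible` fields, and the three function names are hypothetical and introduced only for illustration. The sketch further assumes that face extraction and identity labelling have already been done upstream, that "preferred" means the highest quality score, and that visible lip movement leads to the facial recognition algorithm while its absence leads to audio triangulation; the abstract states only that the selection depends on lip-movement visibility.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Face:
    """Hypothetical record for one extracted facial image."""
    person_id: str      # assumed identity label assigned by an upstream step
    quality: float      # assumed score; higher means more suitable for display
    lips_visible: bool  # assumed flag: lip movement is observable in the image


def group_by_individual(faces: Iterable[Face]) -> Dict[str, List[Face]]:
    """Sort extracted facial images into one group per individual."""
    groups: Dict[str, List[Face]] = defaultdict(list)
    for face in faces:
        groups[face.person_id].append(face)
    return dict(groups)


def select_preferred(groups: Dict[str, List[Face]]) -> Dict[str, Face]:
    """Pick one preferred image per individual (assumed: highest quality)."""
    return {
        person_id: max(members, key=lambda f: f.quality)
        for person_id, members in groups.items()
    }


def choose_speaker_algorithm(faces: Iterable[Face]) -> str:
    """Choose how to determine the speaker.

    Assumed mapping: if lip movement is visible for at least one
    individual, use facial recognition; otherwise use audio triangulation.
    """
    if any(face.lips_visible for face in faces):
        return "facial_recognition"
    return "audio_triangulation"
```

In this sketch, the values of the dictionary returned by `select_preferred` stand in for the preferred facial images that would be transmitted to the client for display.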