Disclosed is an augmented reality system to generate and cause display of an augmented reality interface at a client device. Various embodiments may detect speech, identify a source of the speech, transcribe the speech to a text string, generate a speech bubble based on properties of the speech and that includes a presentation of the text string, and cause display of the speech bubble at a location in the augmented reality interface based on the source of the speech.