Delivery of audio content is tailored to individual users. A viewing direction of a user relative to a display presenting a video stream showing a scene of an environment is determined. A physical location in the environment that the user is viewing in the scene is determined, and an audio stream that correlates to that physical location is identified from among several audio streams obtained from different physical locations in the environment. The identified audio stream is then provided to the user. Additional aspects include identifying potentially interesting areas from which audio streams are obtained, the audio streams being selectively triggered by users for provision to those users. Further aspects include an ability for a user to identify, for other users, a particular area that the user is viewing to obtain interesting audio, thereby informing the other users that the user is listening to interesting audio associated with that particular area.
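
The core selection step described above (viewing direction, then physical location, then correlated audio stream) can be illustrated with a minimal Python sketch. Everything named here is a hypothetical assumption made for illustration and does not come from the source: the `AudioStream` type, the `gaze_to_location` and `select_stream` functions, the representation of the viewing direction as a normalized gaze point on the display, the simplification that the scene is a full-frame top-down view of a rectangular environment, and the use of nearest-microphone distance as the correlation rule.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class AudioStream:
    """Hypothetical record: one audio stream and where it is captured."""
    stream_id: str
    x: float  # capture position in the environment, in metres (assumed)
    y: float


def gaze_to_location(gaze_u, gaze_v, env_width, env_height):
    """Map a normalized gaze point on the display (0..1 on each axis) to
    environment coordinates.

    Assumption: the video shows a full-frame, top-down view of a
    rectangular environment, so the mapping is a simple linear scale.
    A real system would need a camera-specific projection instead.
    """
    if not (0.0 <= gaze_u <= 1.0 and 0.0 <= gaze_v <= 1.0):
        raise ValueError("gaze point is off the display")
    return gaze_u * env_width, gaze_v * env_height


def select_stream(location, streams):
    """Pick the stream captured closest to the viewed location.

    Assumption: 'correlates to' is modelled as smallest Euclidean distance.
    """
    if not streams:
        raise ValueError("no audio streams available")
    lx, ly = location
    return min(streams, key=lambda s: hypot(s.x - lx, s.y - ly))


# Illustrative data only: three capture points in a 100 m x 60 m area.
streams = [
    AudioStream("bench", 5.0, 2.0),
    AudioStream("goal", 95.0, 30.0),
    AudioStream("midfield", 50.0, 30.0),
]

# The user looks at the right-hand side of the display, halfway down.
location = gaze_to_location(0.9, 0.5, env_width=100.0, env_height=60.0)
chosen = select_stream(location, streams)
print(chosen.stream_id)
```

With the illustrative data above, the viewed location falls nearest the "goal" capture point, so that stream would be the one provided to the user. Nearest-distance matching is only one plausible reading of the correlation step; a region-based lookup over predefined areas of interest would fit the description equally well.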