Patent attributes
Systems including physical devices, such as buttons and switches, that receive audio when a user performs a specific interaction are described. The audio may correspond to a particular spoken command to be executed by a system in a time-delayed fashion. At a later time, when another interaction is performed with the physical device, the device may send the stored audio to a server for processing to determine a command associated with the audio. A device may store multiple audio data segments corresponding to multiple different commands, and each piece of audio data corresponding to a command may be associated with a specific physical operation of the device. If audio data is determined as corresponding to a multiple-input command, additional information needed to perform the multiple-input command may be audibly gathered from a user.
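
The store-then-send flow described above may be illustrated with a minimal sketch. It is not the claimed implementation: the class `DelayedCommandDevice`, the operation names `single_press` and `double_press`, and the `fake_server` function are hypothetical names introduced here for illustration, and the server is simulated by treating the audio bytes as already-transcribed text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class DelayedCommandDevice:
    """Hypothetical model of a button-like device that stores audio per operation."""

    # Stand-in for the network call that uploads audio to a server (assumption).
    send_to_server: Callable[[bytes], str]
    # Maps a physical operation (e.g. "single_press") to its stored audio data.
    stored_audio: Dict[str, bytes] = field(default_factory=dict)

    def record(self, operation: str, audio: bytes) -> None:
        """Store audio captured during the recording interaction."""
        self.stored_audio[operation] = audio

    def trigger(self, operation: str) -> Optional[str]:
        """On a later interaction, send the stored audio for processing."""
        audio = self.stored_audio.get(operation)
        if audio is None:
            return None  # nothing was recorded for this operation
        return self.send_to_server(audio)


def fake_server(audio: bytes) -> str:
    """Toy server: pretends the audio bytes are already-transcribed text."""
    return "command: " + audio.decode("utf-8")


device = DelayedCommandDevice(send_to_server=fake_server)
device.record("single_press", b"turn on the lights")
device.record("double_press", b"play music")
result = device.trigger("double_press")
```

In this sketch each physical operation keys its own audio segment, so one device may hold several commands at once. Gathering additional input for a multiple-input command is omitted; under the same assumptions it could be modeled as the server returning a follow-up prompt instead of a final command.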