An apparatus includes a memory and a processor. The memory stores a set of object categories and a set of motion categories. The processor splits a video into an ordered series of frames. For each frame, the processor determines that the frame includes an image of an object of a given object category, assigns the given object category to the frame, and stores the assigned object category in an ordered series of object category assignments. The processor determines, based on a subset of the ordered series of object category assignments, that the video used to generate the ordered series depicts a motion of a given motion category. The processor assigns the given motion category to the video.
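
The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the category names, the `MOTION_PATTERNS` table, the `classify_frame` callback, and the rule that a motion is recognized when a contiguous run of distinct assignments matches a stored pattern are all assumptions made for illustration. Video decoding is omitted; the frames are assumed to be already split into an ordered list.

```python
from itertools import groupby
from typing import Callable, List, Optional, Sequence

# Hypothetical stored categories; the source does not name any.
OBJECT_CATEGORIES = {"hand_left", "hand_center", "hand_right"}
MOTION_PATTERNS = {
    "swipe_right": ("hand_left", "hand_center", "hand_right"),
    "swipe_left": ("hand_right", "hand_center", "hand_left"),
}


def assign_object_categories(
    frames: Sequence[object],
    classify_frame: Callable[[object], str],
) -> List[str]:
    """Assign one object category per frame, preserving frame order."""
    assignments = []
    for frame in frames:
        category = classify_frame(frame)  # assumed per-frame classifier
        if category not in OBJECT_CATEGORIES:
            raise ValueError(f"unknown object category: {category!r}")
        assignments.append(category)
    return assignments


def assign_motion_category(assignments: Sequence[str]) -> Optional[str]:
    """Return a motion category if a subset of the assignments matches one.

    Assumed rule: collapse consecutive duplicate assignments, then look
    for a contiguous window equal to a stored pattern.
    """
    collapsed = [key for key, _ in groupby(assignments)]
    for motion, pattern in MOTION_PATTERNS.items():
        width = len(pattern)
        for start in range(len(collapsed) - width + 1):
            if tuple(collapsed[start:start + width]) == pattern:
                return motion
    return None


if __name__ == "__main__":
    # Stand-in frames: each "frame" is already its own label, so the
    # classifier is the identity function.
    frames = ["hand_left", "hand_left", "hand_center", "hand_right"]
    assignments = assign_object_categories(frames, lambda frame: frame)
    print(assign_motion_category(assignments))
```

Collapsing repeated assignments makes the match independent of how many frames the object spends in each position, which is one plausible way to base the decision on a subset of the series rather than on every frame.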