Systems and approaches are provided for robustly determining the motion of a computing device. Multiple cameras on the device can each capture a sequence of images, and the images can be analyzed to determine motion of the device with respect to a user, an object, or scenery captured in the images. The image-based motion estimate may be combined with measurements from an inertial sensor, such as a gyroscope or an accelerometer, to provide a more accurate estimate of device motion than image data or inertial sensor data can provide alone. The computing device can then be configured to interpret device motion as user input, for example to navigate a user interface or to remotely control movement of another electronic device.
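
The abstract does not say how the image-based estimate and the inertial measurements are combined. As a minimal sketch only, one common option is a complementary filter on a single rotation axis: the gyroscope rate is integrated for a short-term prediction, and the camera-derived angle is blended in to limit drift. The function name `fuse_rotation`, the parameters `dt` and `alpha`, and the single-axis simplification are all assumptions made for illustration and are not taken from the source.

```python
def fuse_rotation(camera_angles, gyro_rates, dt, alpha=0.98):
    """Hypothetical single-axis complementary filter (illustrative only).

    camera_angles: angle estimates (radians) derived from image analysis
    gyro_rates:    angular rates (radians/second) from a gyroscope
    dt:            sampling interval in seconds (assumed constant)
    alpha:         weight given to the gyro-integrated prediction (0..1)
    """
    if len(camera_angles) != len(gyro_rates):
        raise ValueError("camera and gyro sequences must have equal length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not camera_angles:
        return []

    # Start from the first camera estimate, since no prior fused state exists.
    fused = [float(camera_angles[0])]
    for cam_angle, rate in zip(camera_angles[1:], gyro_rates[1:]):
        # Short-term prediction: integrate the gyro rate over one interval.
        predicted = fused[-1] + rate * dt
        # Blend the prediction with the camera estimate to limit gyro drift.
        fused.append(alpha * predicted + (1.0 - alpha) * cam_angle)
    return fused
```

With `alpha=1.0` the result is pure gyroscope integration, and with `alpha=0.0` it simply follows the camera estimates; values in between trade short-term smoothness against long-term drift correction.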