Patent attributes
Frame sequences from multiple image sensors may be combined to form, for example, an interleaved frame sequence. Individual frames of the combined sequence may be configured by a combination (e.g., concatenation) of frames from one or more source sequences. The interleaved/concatenated frame sequence may be encoded using a motion estimation encoder. Output of the video encoder may be processed (e.g., parsed) in order to extract motion information present in the encoded video. The motion information may be utilized by an adaptive controller to determine a depth of a visual scene, such as by using binocular disparity between two or more images, in order to detect one or more objects salient to a given task. In one variant, depth information is utilized during control and operation of mobile robotic devices.
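The pipeline summarized above may be illustrated with a minimal sketch. It is not the claimed implementation: a simple sum-of-absolute-differences block search stands in for the motion vectors that would be parsed from a real encoder's output, and the pinhole stereo relation Z = f * B / d is assumed for converting disparity to depth. All function names and the focal length and baseline values are hypothetical, chosen for illustration only.

```python
def interleave(left_frames, right_frames):
    """Combine two source sequences into one: L0, R0, L1, R1, ..."""
    combined = []
    for left, right in zip(left_frames, right_frames):
        combined.append(left)
        combined.append(right)
    return combined


def block_motion_x(ref, cur, row, col, block=2, search=3):
    """Horizontal displacement dx (pixels) such that the block of `cur` at
    (row, col) best matches `ref` at (row, col + dx), by minimum sum of
    absolute differences. Stand-in for an encoder's motion vector."""
    width = len(ref[0])
    best_dx, best_sad = 0, None
    for dx in range(-search, search + 1):
        if col + dx < 0 or col + dx + block > width:
            continue  # candidate block would fall outside the frame
        sad = 0
        for r in range(row, row + block):
            for c in range(col, col + block):
                sad += abs(cur[r][c] - ref[r][c + dx])
        if best_sad is None or sad < best_sad:
            best_dx, best_sad = dx, sad
    return best_dx


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Assumed pinhole stereo model: depth = focal * baseline / |disparity|."""
    if disparity_px == 0:
        return float("inf")  # no measurable disparity
    return focal_px * baseline_m / abs(disparity_px)


def make_frame(patch_col, rows=4, cols=8):
    """Synthetic frame: black background with a bright 2x2 patch."""
    frame = [[0] * cols for _ in range(rows)]
    for r in (1, 2):
        for c in (patch_col, patch_col + 1):
            frame[r][c] = 255
    return frame


# The same object appears at column 4 in the left view and column 2 in the right.
left, right = make_frame(4), make_frame(2)
sequence = interleave([left], [right])

# In the interleaved sequence the right frame follows the left frame, so motion
# estimated between the two consecutive frames corresponds to binocular disparity.
disparity = block_motion_x(ref=sequence[0], cur=sequence[1], row=1, col=2)
depth_m = depth_from_disparity(disparity, focal_px=500, baseline_m=0.1)
```

Because consecutive frames of the interleaved sequence come from different sensors, the horizontal motion found between them plays the role of disparity, and larger disparities map to smaller depths under the assumed model.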