Various embodiments provide systems, methods, devices, and instructions for performing simultaneous localization and mapping (SLAM) that involve initializing a SLAM process using images from as few as two different poses of a camera within a physical environment. Some embodiments may achieve this by disregarding errors in matching corresponding features depicted in image frames captured by an image sensor of a mobile computing device, and by updating the SLAM process in a way that causes its minimization process to converge to a global minimum rather than fall into a local minimum.