A method for guiding a robot equipped with a camera to facilitate three-dimensional (3D) reconstruction through sampling-based planning includes recognizing and localizing an object in a two-dimensional (2D) image. The method also includes computing 3D depth maps for the localized object. A 3D object map is constructed from the depth maps. A sampling-based structure is grown around the 3D object map, and a cost is assigned to each edge of the sampling-based structure. The sampling-based structure may be searched to determine a lowest-cost sequence of edges that may, in turn, be used to guide the robot.
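The final two steps (growing a sampling-based structure with a cost on each edge, then searching it for a lowest-cost sequence of edges) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the claimed method: the structure is a roadmap-style graph of camera positions sampled on a sphere around the object map, the edge cost is plain Euclidean length, the search is Dijkstra's algorithm, and the names `sample_shell`, `build_roadmap`, and `lowest_cost_path` are hypothetical.

```python
import heapq
import math
import random


def sample_shell(center, radius, n, seed=0):
    """Sample n candidate camera positions on a sphere around the object map."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # Normalised Gaussian vectors are uniformly distributed on the sphere.
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        points.append(tuple(c + radius * x / norm for c, x in zip(center, v)))
    return points


def build_roadmap(points, k=4):
    """Connect each node to its k nearest neighbours.

    Edge cost is Euclidean length (an assumed cost, chosen for illustration).
    """
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the roadmap undirected
    return graph


def lowest_cost_path(graph, start, goal):
    """Dijkstra search; returns (total_cost, node sequence) or (inf, [])."""
    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == goal:
            # Walk parent links back to the start to recover the sequence.
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return cost, path[::-1]
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for nxt, w in graph[node].items():
            new_cost = cost + w
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(frontier, (new_cost, nxt))
    return math.inf, []


if __name__ == "__main__":
    pts = sample_shell(center=(0.0, 0.0, 0.0), radius=2.0, n=30, seed=1)
    roadmap = build_roadmap(pts, k=4)
    cost, path = lowest_cost_path(roadmap, start=0, goal=len(pts) - 1)
    print(cost, path)
```

In practice the edge cost would likely reflect more than travel distance, for example the expected reconstruction gain along an edge; swapping in a different cost only changes the values stored in `graph`, not the search.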