Systems, methods, and computer program products for multi-modality vertebrae recognition are provided. Aspects of the present disclosure enable a multi-modality vertebrae recognition engine that recognizes vertebrae using a three-stage approach: landmark detection, global shape registration, and local pose adjustment. Together, these stages cover matching from local to global spine structures. The three stages are implemented by three modules of the hierarchical deformable model: the local appearance module, the global geometry module, and the local geometry module. According to one aspect of the present disclosure, the overall workflow can be understood as a three-stage top-down registration. The goals of the registration are: 1) to align the global shape of the spine model, and 2) to align the vertebrae poses with the local image structures around the identified landmarks.
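For illustration only, the three-stage workflow described above might be sketched as follows. This is a minimal 2D sketch under stated assumptions, not the disclosed implementation: the function names, the `image["landmarks"]` placeholder, the known one-to-one correspondence between model vertebrae and detected landmarks, the rigid least-squares alignment, and the clamped per-vertebra translation are all hypothetical simplifications chosen to keep the example self-contained.

```python
import math

def detect_landmarks(image):
    """Stage 1 (placeholder): return candidate vertebra centers as (x, y).

    Hypothetical stand-in; a real detector would score local appearance.
    """
    return image["landmarks"]

def register_global_shape(model_pts, detected_pts):
    """Stage 2: rigid 2D least-squares alignment of the model to the landmarks.

    Assumes model_pts[i] corresponds to detected_pts[i].
    """
    n = len(model_pts)
    mcx = sum(x for x, _ in model_pts) / n
    mcy = sum(y for _, y in model_pts) / n
    dcx = sum(x for x, _ in detected_pts) / n
    dcy = sum(y for _, y in detected_pts) / n
    s_dot = s_cross = 0.0
    for (mx, my), (dx, dy) in zip(model_pts, detected_pts):
        mx, my, dx, dy = mx - mcx, my - mcy, dx - dcx, dy - dcy
        s_dot += mx * dx + my * dy
        s_cross += mx * dy - my * dx
    # Rotation angle that minimizes the summed squared distances.
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    aligned = [
        (c * (x - mcx) - s * (y - mcy) + dcx,
         s * (x - mcx) + c * (y - mcy) + dcy)
        for x, y in model_pts
    ]
    return aligned, theta

def adjust_local_pose(aligned_pts, detected_pts, max_shift=2.0):
    """Stage 3 (simplified): move each vertebra toward its landmark.

    Each shift is clamped to max_shift so the global shape is not discarded.
    """
    refined = []
    for (ax, ay), (dx, dy) in zip(aligned_pts, detected_pts):
        ox, oy = dx - ax, dy - ay
        dist = math.hypot(ox, oy)
        scale = 1.0 if dist <= max_shift else max_shift / dist
        refined.append((ax + scale * ox, ay + scale * oy))
    return refined

def recognize_vertebrae(image, model_pts):
    """Run the three stages in order and return (refined positions, angle)."""
    landmarks = detect_landmarks(image)
    aligned, theta = register_global_shape(model_pts, landmarks)
    return adjust_local_pose(aligned, landmarks), theta
```

In this sketch the global stage fixes the overall placement of the spine model first, and the local stage then makes only bounded per-vertebra corrections, which mirrors the two registration goals stated above.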