A method includes extracting, using a backbone of a machine learning model, a plurality of features from an image of a foot; and predicting, using a first portion of the machine learning model and based on one or more features of the plurality of features, a first aspect of the foot. The method also includes predicting, using a second portion of the machine learning model and based on one or more features of the plurality of features, a second aspect of the foot different from the first aspect; generating, using at least the first aspect and the second aspect, a two-dimensional model of a shoe; and superimposing the two-dimensional model of the shoe onto the image of the foot.
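
By way of a non-limiting illustration, the data flow of the method may be sketched as follows: a shared backbone produces features, two separate portions each predict a different aspect from those features, and the predictions drive a two-dimensional shoe model that is blended onto the image. Everything concrete in the sketch is an assumption made for illustration and is not taken from the disclosure. The function names (`backbone`, `position_head`, `size_head`, `generate_shoe_model`, `superimpose`, `try_on`) are hypothetical, the two aspects are assumed to be foot position and foot size, and simple hand-written arithmetic stands in for the learned backbone and portions, which in practice would be trained neural network layers.

```python
from math import sqrt
from typing import List, Tuple

Image = List[List[float]]  # grayscale pixels in [0, 1]; nonzero pixels = foot


def backbone(image: Image) -> List[float]:
    """Toy stand-in for a learned backbone: returns one shared feature vector."""
    height, width = len(image), len(image[0])
    mass = sum(sum(row) for row in image)  # total foot intensity
    if mass == 0:
        return [width / 2, height / 2, 0.0]
    # Intensity-weighted centroid, used here as hypothetical features.
    cx = sum(x * v for row in image for x, v in enumerate(row)) / mass
    cy = sum(y * sum(row) for y, row in enumerate(image)) / mass
    return [cx, cy, mass]


def position_head(features: List[float]) -> Tuple[float, float]:
    """First portion: predicts a first aspect (assumed: foot centre)."""
    return features[0], features[1]


def size_head(features: List[float]) -> float:
    """Second portion: predicts a second aspect (assumed: foot half-width,
    taken as half the side of a square with the same area as the foot)."""
    return sqrt(features[2]) / 2


def generate_shoe_model(center: Tuple[float, float], half_size: float,
                        width: int, height: int) -> Image:
    """Builds a 2D shoe model as a square mask placed using both aspects."""
    cx, cy = center
    return [[1.0 if abs(x - cx) <= half_size and abs(y - cy) <= half_size
             else 0.0
             for x in range(width)]
            for y in range(height)]


def superimpose(image: Image, shoe: Image,
                shoe_value: float = 0.0, alpha: float = 0.5) -> Image:
    """Alpha-blends the shoe model onto the image where the mask is set."""
    return [[(1 - alpha * m) * p + alpha * m * shoe_value
             for p, m in zip(img_row, shoe_row)]
            for img_row, shoe_row in zip(image, shoe)]


def try_on(image: Image) -> Image:
    """Runs the whole flow: features -> two aspects -> shoe model -> overlay."""
    features = backbone(image)            # extracted once, shared by both heads
    center = position_head(features)      # first aspect
    half_size = size_head(features)       # second aspect
    shoe = generate_shoe_model(center, half_size, len(image[0]), len(image))
    return superimpose(image, shoe)
```

In this sketch the features are computed once and reused by both portions, which is the reason for sharing a backbone: the two predictions add little cost beyond the single feature extraction.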