A system for optimizing deployment of a machine learning workload is provided. A computer device receives information pertaining to a machine learning workload to be processed for a client device. The computer device determines a machine learning model for the workload and a processing location for the workload based, at least in part, on the information. The computer device generates a request to process the workload at the determined processing location utilizing the determined machine learning model.
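By way of illustration only, the flow summarized above (receive workload information, determine a model and a processing location, generate a request) can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the `WorkloadInfo` fields, the thresholds, the model naming scheme, and the `client`/`edge`/`cloud` locations are all hypothetical and do not appear in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadInfo:
    """Information received about a workload. All fields are hypothetical."""
    task: str                      # e.g. "image-classification"
    input_size_mb: float           # size of the data to be processed
    latency_budget_ms: int         # how quickly the client needs a result
    client_has_accelerator: bool   # whether the client device can run a model locally


def choose_deployment(info: WorkloadInfo) -> tuple[str, str]:
    """Return (model, location) based, at least in part, on the information.

    The rules and thresholds below are invented for illustration.
    """
    # Small, latency-sensitive workloads stay on a capable client device.
    if (info.client_has_accelerator
            and info.input_size_mb <= 10
            and info.latency_budget_ms < 100):
        return f"{info.task}-small", "client"
    # Moderately latency-sensitive workloads go to a nearby (assumed) edge node.
    if info.latency_budget_ms < 500:
        return f"{info.task}-medium", "edge"
    # Everything else is sent to the (assumed) cloud with the largest model.
    return f"{info.task}-large", "cloud"


def build_request(info: WorkloadInfo, client_id: str) -> dict:
    """Generate a request to process the workload at the chosen location."""
    model, location = choose_deployment(info)
    return {
        "client_id": client_id,
        "task": info.task,
        "model": model,
        "location": location,
    }


if __name__ == "__main__":
    info = WorkloadInfo("image-classification", 2.5, 50, True)
    print(build_request(info, "client-42"))
```

The sketch makes the model and the location a single joint decision, since the abstract describes both as determined from the same information. An actual system could equally select them in separate steps or with a learned policy rather than fixed rules.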