In some examples, a software agent executing on a server receives a communication comprising a first utterance from a customer and predicts, using an intent classifier, a first intent of the first utterance. Based on determining that the first intent is order-related, the software agent predicts, using a dish classifier, a cart delta vector based at least in part on the first utterance and modifies a cart associated with the customer based on the cart delta vector. The software agent predicts, using a dialog model, a first dialog response based at least in part on the first utterance and provides the first dialog response to the customer using a text-to-speech converter.
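The flow described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the intent classifier, dish classifier, and dialog model are replaced by keyword-based stand-ins, the text-to-speech converter is any callable passed in as `speak`, and every name (`MENU`, `classify_intent`, `predict_cart_delta`, `apply_cart_delta`, `generate_response`, `handle_utterance`) is hypothetical. The sketch also assumes the cart delta vector holds one signed quantity change per menu item, which the text does not specify.

```python
import re
from typing import Callable, Dict, List

# Hypothetical menu: entry i of a cart delta vector refers to MENU[i].
MENU: List[str] = ["burger", "fries", "soda"]


def tokenize(utterance: str) -> List[str]:
    """Lowercase the utterance and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", utterance.lower())


def classify_intent(utterance: str) -> str:
    """Stand-in for the intent classifier (keyword rule, not a trained model)."""
    order_words = {"add", "remove", "want", "order"}
    return "order" if order_words & set(tokenize(utterance)) else "other"


def predict_cart_delta(utterance: str) -> List[int]:
    """Stand-in for the dish classifier: one signed quantity change per menu item."""
    tokens = tokenize(utterance)
    sign = -1 if "remove" in tokens else 1
    return [sign * tokens.count(item) for item in MENU]


def apply_cart_delta(cart: Dict[str, int], delta: List[int]) -> None:
    """Modify the cart in place; quantities never drop below zero."""
    for item, change in zip(MENU, delta):
        quantity = max(0, cart.get(item, 0) + change)
        if quantity:
            cart[item] = quantity
        else:
            cart.pop(item, None)


def generate_response(intent: str, cart: Dict[str, int]) -> str:
    """Stand-in for the dialog model (fixed templates, not a trained model)."""
    if intent == "order":
        items = ", ".join(f"{qty} {name}" for name, qty in cart.items()) or "nothing"
        return f"Your cart now has: {items}."
    return "How can I help with your order?"


def handle_utterance(
    utterance: str, cart: Dict[str, int], speak: Callable[[str], None]
) -> str:
    """Run one turn: intent, optional cart update, dialog response, speech output."""
    intent = classify_intent(utterance)
    if intent == "order":
        apply_cart_delta(cart, predict_cart_delta(utterance))
    response = generate_response(intent, cart)
    speak(response)  # in a real system this would be a text-to-speech converter
    return response
```

For example, calling `handle_utterance("I want a burger and fries", cart, print)` with an empty `cart` dictionary adds one burger and one order of fries to the cart before the response is spoken, whereas an utterance that is not order-related leaves the cart unchanged and still produces a dialog response.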