Representative embodiments disclose mechanisms to map natural language input to an application programming interface (API) call. The natural language input is first mapped to an API frame, which is a representation of the API call without any API call formatting. The mapping from natural language input to API frame is performed using a trained sequence to sequence neural model. The sequence to sequence neural model is decomposed into small prediction units called modules. Each module is highly specialized at predicting a pre-defined kind of sequence output. The output of the modules can be displayed in an interactive user interface that allows the user to add, remove, and/or modify the output of the individual modules. The user input can be used as further training data. The API frame is mapped to an API call using a deterministic mapping.