A system and method enabling acoustic barge-in during a voice prompt in a communication system. An acoustic prompt model is trained to represent the system prompt using the specific speech signal of the prompt. The acoustic prompt model is utilized in a speech recognizer in parallel with the recognizer's active vocabulary words to suppress the echo of the prompt within the recognizer. The speech recognizer may also use a silence model and traditional garbage models such as noise models and out-of-vocabulary word models to reduce the likelihood that noises and out-of-vocabulary words in the user utterance will be mapped erroneously onto active vocabulary words.