A method includes obtaining, using at least one processor of an electronic device, a base model trained to perform natural language understanding. The method also includes generating, using the at least one processor, a first model expansion based on knowledge from the base model. The method further includes training, using the at least one processor, the first model expansion based on first utterances without modifying parameters of the base model. The method also includes receiving, using the at least one processor, an additional utterance from a user. In addition, the method includes determining, using the at least one processor, a meaning of the additional utterance using the base model and the first model expansion.
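The steps above can be illustrated with a minimal sketch. It is not the claimed implementation: the bag-of-words features, the perceptron-style update, and every name (`BaseModel`, `ModelExpansion`, `interpret`, the intent labels, the sample utterances) are assumptions chosen for illustration. In particular, "knowledge from the base model" is modeled here simply as reuse of the base model's vocabulary. The sketch shows an expansion trained on first utterances while the base parameters stay untouched, and an additional utterance interpreted by combining both sets of scores.

```python
import copy
from collections import Counter


def featurize(utterance):
    """Bag-of-words counts over lowercase whitespace tokens (illustrative only)."""
    return Counter(utterance.lower().split())


class BaseModel:
    """Hypothetical stand-in for a pretrained NLU model; its weights are never updated."""

    def __init__(self, weights):
        self.weights = weights  # {intent: {token: weight}}

    def scores(self, utterance):
        feats = featurize(utterance)
        return {intent: sum(w.get(tok, 0.0) * n for tok, n in feats.items())
                for intent, w in self.weights.items()}


class ModelExpansion:
    """Hypothetical trainable add-on covering intents the base model does not know."""

    def __init__(self, base, new_intents):
        self.base = base
        # Assumption: "knowledge from the base model" = its vocabulary, used to seed weights.
        vocab = {tok for w in base.weights.values() for tok in w}
        self.weights = {intent: {tok: 0.0 for tok in vocab} for intent in new_intents}

    def scores(self, utterance):
        feats = featurize(utterance)
        return {intent: sum(w.get(tok, 0.0) * n for tok, n in feats.items())
                for intent, w in self.weights.items()}

    def train(self, examples, epochs=5, lr=1.0):
        """Perceptron-style updates applied to expansion weights only."""
        for _ in range(epochs):
            for utterance, intent in examples:
                predicted = interpret(self.base, self, utterance)
                if predicted == intent:
                    continue
                for tok, n in featurize(utterance).items():
                    if intent in self.weights:  # promote the correct new intent
                        w = self.weights[intent]
                        w[tok] = w.get(tok, 0.0) + lr * n
                    if predicted in self.weights:  # demote a wrong new intent
                        w = self.weights[predicted]
                        w[tok] = w.get(tok, 0.0) - lr * n
                    # Base-model intents are deliberately left unmodified.


def interpret(base, expansion, utterance):
    """Determine a meaning (intent) from the combined base and expansion scores."""
    combined = {**base.scores(utterance), **expansion.scores(utterance)}
    return max(combined, key=combined.get)


# Obtain a (toy) base model and snapshot its parameters.
base = BaseModel({
    "play_music": {"play": 1.0, "song": 1.0, "music": 1.0},
    "set_alarm": {"set": 1.0, "alarm": 1.0, "wake": 1.0},
})
base_snapshot = copy.deepcopy(base.weights)

# Generate the first expansion and train it on first utterances.
expansion = ModelExpansion(base, ["order_food"])
expansion.train([("order a pizza", "order_food"), ("order some food", "order_food")])

# Receive an additional utterance and determine its meaning.
meaning = interpret(base, expansion, "order a pizza please")
```

Keeping the expansion's weights in a separate object is one way to make the "without modifying parameters of the base model" constraint easy to check: the base weights can be compared against a snapshot taken before training.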