Methods and systems for training an intent classifier. For example, a question-intent tuple dataset comprising data samples is received. Each data sample has a question, an intent, and a task. A pre-trained language model is also received and fine-tuned by adjusting values of learnable parameters. Parameter adjustment is performed by generating a plurality of neural network models. Each neural network model is trained to predict at least one intent of the respective question having a same task value of the tasks of the question-intent tuple dataset. Each task represents a source of the question and the respective intent. The fine-tuned language model generates embeddings for training input data, the training input data comprising a plurality of data samples having questions and intents. Further, feature vectors for the data samples of the training input data are generated and used to train an intent classification model for predicting intents.