An electronic device and method for content-aware image encoding using a machine learning (ML) model are provided. The electronic device detects at least one foreground region and at least one background region in a first image frame. The electronic device determines a set of first macroblocks associated with the detected at least one foreground region and a set of second macroblocks associated with the detected at least one background region. The electronic device further determines a bit allocation control parameter associated with the determined set of second macroblocks and updates the determined bit allocation control parameter based on an application of a first trained ML model. The electronic device encodes the first image frame based on the updated bit allocation control parameter to obtain a second image frame, such that a first image quality index associated with the first image frame matches a second image quality index associated with the second image frame within a threshold range.
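The macroblock partitioning and parameter update described above can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: the bit allocation control parameter is treated as a quantization parameter (QP) in the 0 to 51 range used by H.264-style codecs, macroblocks are 16x16 pixels, a macroblock counts as foreground if it contains any foreground pixel, and the trained ML model is represented by a hypothetical callable `predict_delta` that returns a QP adjustment. All function names are illustrative only, and the encoding step and the image quality comparison are omitted.

```python
from typing import Callable, List, Sequence

MB_SIZE = 16  # assumed macroblock size in pixels


def classify_macroblocks(fg_mask: Sequence[Sequence[int]],
                         mb: int = MB_SIZE) -> List[List[bool]]:
    """Return a grid with True for each macroblock that overlaps the foreground.

    fg_mask is a 2-D array of 0/1 values (1 = foreground pixel).
    """
    h = len(fg_mask)
    w = len(fg_mask[0]) if h else 0
    rows = (h + mb - 1) // mb  # ceiling division covers partial blocks
    cols = (w + mb - 1) // mb
    grid = [[False] * cols for _ in range(rows)]
    for y in range(h):
        row = fg_mask[y]
        for x in range(w):
            if row[x]:
                grid[y // mb][x // mb] = True
    return grid


def update_background_qp(bg_qp: int,
                         predict_delta: Callable[[object], float],
                         features: object,
                         qp_min: int = 0,
                         qp_max: int = 51) -> int:
    """Adjust the background QP by a model-predicted delta, clamped to range.

    predict_delta is a hypothetical stand-in for the trained ML model.
    """
    delta = predict_delta(features)
    return max(qp_min, min(qp_max, round(bg_qp + delta)))


def build_qp_map(fg_grid: List[List[bool]],
                 fg_qp: int, bg_qp: int) -> List[List[int]]:
    """Assign fg_qp to foreground macroblocks and bg_qp to background ones."""
    return [[fg_qp if is_fg else bg_qp for is_fg in row] for row in fg_grid]


# Usage: a 32x48 mask with a single foreground pixel at (row 5, col 20).
mask = [[0] * 48 for _ in range(32)]
mask[5][20] = 1
fg_grid = classify_macroblocks(mask)
new_bg_qp = update_background_qp(30, lambda feats: 4.0, features=None)
qp_map = build_qp_map(fg_grid, fg_qp=24, bg_qp=new_bg_qp)
```

In this sketch a higher QP means coarser quantization, so raising only the background QP shifts bits toward the foreground macroblocks. A real encoder would then consume the per-macroblock QP map and verify that the quality index of the encoded frame stays within the threshold range of the original.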