Patent 11803935, assigned to Intel, was granted by the United States Patent and Trademark Office in October 2023.
The patent describes techniques to improve the performance of matrix multiply operations: a compute kernel can specify one or more element-wise operations to perform on its output before that output is transferred to higher levels of the processor memory hierarchy.
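The general idea of applying element-wise operations to a matrix multiply result before it is written out can be illustrated with a minimal sketch. This is not the patented implementation; it is a plain-Python model under stated assumptions. The function name `matmul_with_epilogue`, the `epilogue` parameter, and the `tile` size are all hypothetical, and the local `acc` list merely stands in for registers or low-level on-chip storage.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]
ElementwiseOp = Callable[[float], float]


def matmul_with_epilogue(
    a: Matrix,
    b: Matrix,
    epilogue: Sequence[ElementwiseOp] = (),
    tile: int = 2,
) -> Matrix:
    """Tiled matrix multiply that applies element-wise ops per output tile.

    Hypothetical illustration: each output tile is accumulated in a local
    buffer, the element-wise ops are applied there, and only then is the
    tile stored into the output matrix (the "higher" memory level here).
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")

    out: Matrix = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            rows = range(i0, min(i0 + tile, m))
            cols = range(j0, min(j0 + tile, n))

            # Accumulate one output tile locally (stand-in for registers).
            acc = [
                [sum(a[i][p] * b[p][j] for p in range(k)) for j in cols]
                for i in rows
            ]

            # Apply the requested element-wise ops before the tile is stored.
            for op in epilogue:
                acc = [[op(v) for v in row] for row in acc]

            # Single write of the finished tile to the output matrix.
            for r, i in enumerate(rows):
                for c, j in enumerate(cols):
                    out[i][j] = acc[r][c]
    return out


def relu(v: float) -> float:
    """Example element-wise op: clamp negative values to zero."""
    return max(v, 0.0)


if __name__ == "__main__":
    a = [[1.0, -2.0], [3.0, 4.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # Add a bias of 1 to every element, then apply ReLU, in one pass.
    print(matmul_with_epilogue(a, identity, [lambda v: v + 1.0, relu]))
```

In this sketch the benefit is only structural: without the fused step, the bias addition and ReLU would each need a separate pass that reads the full result matrix back and writes it again.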