US Patent 7792895 Efficient matrix multiplication on a parallel processing device

Patent 7792895 was granted by the United States Patent and Trademark Office on September 7, 2010, and is assigned to NVIDIA.

Is a: Patent

Patent attributes

Current Assignee: NVIDIA
Patent Jurisdiction: United States Patent and Trademark Office
Patent Number: 7792895
Date of Patent: September 7, 2010
Patent Application Number: 11454411
Date Filed: June 16, 2006
Patent Citations Received

  • US Patent 12124847 Systems, methods, and apparatuses for tile transpose
  • US Patent 12106100 Systems, methods, and apparatuses for matrix operations
  • US Patent 12112167 Matrix data scatter and gather between rows and irregularly spaced memory locations
  • US Patent 11669326 Systems, methods, and apparatuses for dot product operations
  • US Patent 11675590 Systems and methods for performing instructions to transform matrices into row-interleaved format
  • US Patent 11698787 Interruptible and restartable matrix multiplication instructions, processors, methods, and systems
  • US Patent 11714642 Systems, methods, and apparatuses for tile store
  ...
Patent Primary Examiner: Chat C Do

Patent abstract

The present invention enables efficient matrix multiplication operations on parallel processing devices. One embodiment is a method for mapping CTAs to result matrix tiles for matrix multiplication operations. Another embodiment is a second method for mapping CTAs to result tiles. Yet other embodiments are methods for mapping the individual threads of a CTA to the elements of a tile for result tile computations, source tile copy operations, and source tile copy and transpose operations. The present invention advantageously enables result matrix elements to be computed on a tile-by-tile basis using multiple CTAs executing concurrently on different streaming multiprocessors, enables source tiles to be copied to local memory to reduce the number of accesses from the global memory when computing a result tile, and enables coalesced read operations from the global memory as well as write operations to the local memory without bank conflicts.
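
The tile-by-tile idea in the abstract can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not give the actual mapping, so the row-major CTA-to-tile mapping, the tile size, and the names `TILE`, `cta_to_tile`, `compute_result_tile`, and `tiled_matmul` are all assumptions made for illustration. Plain lists stand in for local memory, and a sequential loop stands in for CTAs that would run concurrently on a GPU.

```python
TILE = 2  # hypothetical tile edge; chosen small so the example is easy to follow


def cta_to_tile(cta_id, tiles_per_row):
    """Map a linear CTA index to the (row, col) of a result tile.

    Assumption: a simple row-major mapping, used here only for illustration.
    """
    return divmod(cta_id, tiles_per_row)


def compute_result_tile(a, b, c, tile_row, tile_col, tile=TILE):
    """Compute one tile of C = A x B, staging source tiles in local buffers."""
    n_inner = len(b)
    row0, col0 = tile_row * tile, tile_col * tile
    acc = [[0] * tile for _ in range(tile)]
    for k0 in range(0, n_inner, tile):
        # Copy one tile of A and one tile of B into "local memory" (plain lists).
        a_tile = [a[row0 + i][k0:k0 + tile] for i in range(tile)]
        b_tile = [b[k0 + k][col0:col0 + tile] for k in range(tile)]
        # Each (i, j) pair stands in for one thread of the CTA.
        for i in range(tile):
            for j in range(tile):
                acc[i][j] += sum(a_tile[i][k] * b_tile[k][j] for k in range(tile))
    # Write the finished tile back to the result matrix.
    for i in range(tile):
        for j in range(tile):
            c[row0 + i][col0 + j] = acc[i][j]


def tiled_matmul(a, b, tile=TILE):
    """Multiply matrices whose dimensions are all multiples of `tile`."""
    rows, cols = len(a), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    tiles_per_row = cols // tile
    num_ctas = (rows // tile) * tiles_per_row
    for cta_id in range(num_ctas):  # on a GPU these would execute concurrently
        tile_row, tile_col = cta_to_tile(cta_id, tiles_per_row)
        compute_result_tile(a, b, c, tile_row, tile_col, tile)
    return c
```

Because each source tile is copied once and then reused for every element of the result tile, the sketch reads each source element fewer times than an element-by-element product would. Coalescing and bank-conflict avoidance depend on hardware memory layout and are not modeled here.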
