
US Patent 11894014 Audio-visual speech separation

US Patent 11894014 was granted by the United States Patent and Trademark Office on February 6, 2024, and is assigned to Google.

Is a: Patent

Patent attributes

Patent Applicant: Google
Current Assignee: Google
Patent Jurisdiction: United States Patent and Trademark Office
Patent Number: 11894014
Patent Inventor Names: William Freeman, Kevin William Wilson, Inbar Mosseri, Avinatan Hassidim, Michael Rubinstein, Ariel Ephrat, Oran Lang, Tali Dekel
Date of Patent: February 6, 2024
Patent Application Number: 17951002
Date Filed: September 22, 2022
Patent Citations: US Patent 8543402 Speaker segmentation in noisy conversational speech
Patent Primary Examiner: Shaun Roberts
CPC Code: G10L 25/57, G10L 21/10, G10L 17/18
Patent abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: obtaining, for each frame in a stream of frames from a video in which faces of one or more speakers have been detected, a respective per-frame face embedding of the face of each speaker; processing, for each speaker, the per-frame face embeddings of the face of the speaker to generate visual features for the face of the speaker; obtaining a spectrogram of an audio soundtrack for the video; processing the spectrogram to generate an audio embedding for the audio soundtrack; combining the visual features for the one or more speakers and the audio embedding for the audio soundtrack to generate an audio-visual embedding for the video; determining a respective spectrogram mask for each of the one or more speakers; and determining a respective isolated speech spectrogram for each speaker.
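
The last three steps of the abstract (combining embeddings, determining per-speaker masks, and producing isolated spectrograms) can be illustrated with a minimal sketch. The abstract does not say how the embeddings are combined or how a mask is used, so the code below assumes per-frame concatenation for the fusion step and element-wise multiplication of the mask with the mixture spectrogram for the isolation step. The function names `fuse_embeddings` and `apply_mask` are hypothetical, the spectrogram is treated as a plain list of real-valued frames, and the learned networks that would produce the embeddings and masks are omitted.

```python
from typing import List

Frames = List[List[float]]  # outer list: time frames, inner list: features or frequency bins


def fuse_embeddings(visual_features: List[Frames], audio_embedding: Frames) -> Frames:
    """Build one audio-visual vector per frame.

    Assumption: fusion is plain concatenation, ordered as
    [speaker 1 visual | ... | speaker N visual | audio], and every input
    has the same number of time frames.
    """
    fused = []
    for t, audio_frame in enumerate(audio_embedding):
        frame = []
        for speaker_features in visual_features:
            frame.extend(speaker_features[t])
        frame.extend(audio_frame)
        fused.append(frame)
    return fused


def apply_mask(mixture: Frames, mask: Frames) -> Frames:
    """Isolate one speaker from the mixture spectrogram.

    Assumption: the mask has the same shape as the spectrogram and is
    applied by element-wise multiplication.
    """
    return [
        [m * x for m, x in zip(mask_row, mix_row)]
        for mask_row, mix_row in zip(mask, mixture)
    ]
```

In this sketch each speaker gets a separate mask, so calling `apply_mask` once per speaker on the same mixture yields one isolated spectrogram per speaker.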
