System and Method for Audio Processing using Time-Invariant Speaker Embeddings

United States of America Patent

APP PUB NO 20240304205A1
SERIAL NO 18224659

Abstract

A sound processing system and method for multi-talker conversation analysis are provided. The sound processing system includes a deep neural network trained to process audio segments of an audio mixture of the multi-talker conversation. The deep neural network includes a speaker-independent layer that produces a speaker-independent output, and a speaker-biased layer that is applied to each audio segment once per speaker, independently for each of the multiple speakers in the audio mixture. Each application of the speaker-biased layer is assigned to its corresponding speaker by inputting that speaker's time-invariant speaker embedding. From a combination of the speaker-biased outputs, the deep neural network produces data indicative of the time-frequency activity regions of each of the multiple speakers in the audio mixture.
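
The data flow described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the layer shapes, the random weights, the ReLU activation, and the per-bin softmax used to combine the speaker-biased outputs are all assumptions chosen for illustration, and every function name below is hypothetical. The sketch only shows the structure: one shared speaker-independent layer, one application of the speaker-biased layer per speaker, and a speaker embedding that is constant across all frames of the segment.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    """Random weights stand in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def speaker_independent_layer(segment, W):
    """Same weights for every frame; no speaker information is used."""
    return [[max(0.0, h) for h in matvec(W, frame)] for frame in segment]

def speaker_biased_layer(hidden, embedding, V, U):
    """One application for one speaker. The embedding is time-invariant,
    so its contribution is computed once and reused for every frame."""
    bias = matvec(U, embedding)
    return [[a + b for a, b in zip(matvec(V, h), bias)] for h in hidden]

def combine(per_speaker_logits):
    """Assumed combination: softmax across speakers at each T-F bin."""
    n_spk = len(per_speaker_logits)
    n_frames = len(per_speaker_logits[0])
    n_bins = len(per_speaker_logits[0][0])
    masks = [[[0.0] * n_bins for _ in range(n_frames)] for _ in range(n_spk)]
    for t in range(n_frames):
        for f in range(n_bins):
            logits = [per_speaker_logits[s][t][f] for s in range(n_spk)]
            top = max(logits)  # subtract the max for numerical stability
            exps = [math.exp(z - top) for z in logits]
            total = sum(exps)
            for s in range(n_spk):
                masks[s][t][f] = exps[s] / total
    return masks

def process_segment(segment, embeddings, W, V, U):
    """Shared layer once, then the speaker-biased layer once per speaker."""
    hidden = speaker_independent_layer(segment, W)
    logits = [speaker_biased_layer(hidden, e, V, U) for e in embeddings]
    return combine(logits)

# Toy dimensions (hypothetical): frames, frequency bins, hidden units, embedding size.
rng = random.Random(0)
T, F, H, E = 4, 3, 5, 2
segment = [[rng.random() for _ in range(F)] for _ in range(T)]
embeddings = [[1.0, 0.0], [0.0, 1.0]]  # one time-invariant embedding per speaker
W = rand_matrix(H, F, rng)
V = rand_matrix(F, H, rng)
U = rand_matrix(F, E, rng)
masks = process_segment(segment, embeddings, W, V, U)  # masks[speaker][frame][bin]
```

Because the speaker-biased layer shares its weights across applications and differs only in the embedding it receives, swapping the order of the embeddings swaps the order of the output masks. In this sketch, that is how each output is tied to a particular speaker.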

Patent Owner(s)

Patent Owner                                   Address
MITSUBISHI ELECTRIC RESEARCH LABORATORIES INC  CAMBRIDGE MA

Inventor(s)

Inventor Name                 Address        # of Filed Patents  Total Citations
Böddeker, Christoph           Paderborn, DK  1                   0
Le Roux, Jonathan             Arlington, US  56                  568
Subramanian, Aswin Shanmugam  Everett, US    4                   0
Wichern, Gordon               Boston, US     23                  70
