Injecting Text in Self-Supervised Speech Pre-training

United States of America

APP PUB NO 20250078807A1
SERIAL NO 18951572

Abstract

A method includes receiving training data that includes unspoken text utterances and un-transcribed non-synthetic speech utterances. Each unspoken text utterance is not paired with any corresponding spoken utterance of non-synthetic speech. Each un-transcribed non-synthetic speech utterance is not paired with a corresponding transcription. The method also includes generating a corresponding synthetic speech representation for each unspoken textual utterance of the received training data using a text-to-speech model. The method also includes pre-training an audio encoder on the synthetic speech representations generated for the unspoken textual utterances and the un-transcribed non-synthetic speech utterances to teach the audio encoder to jointly learn shared speech and text representations.
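
The data flow described in the abstract can be pictured with a rough sketch. This is not the patented implementation: `toy_tts`, `PretrainExample`, and `build_pretraining_pool` are hypothetical names introduced here for illustration only, the "text-to-speech model" is a toy stand-in that emits one fake feature frame per character, and the encoder pre-training step itself is omitted. The sketch shows only how unspoken text is turned into synthetic speech representations and pooled with un-transcribed speech, so that a single audio encoder could later be trained on both.

```python
from dataclasses import dataclass
from typing import Callable, List

# A speech representation: a list of frames, each frame a list of floats.
Features = List[List[float]]


@dataclass
class PretrainExample:
    features: Features
    source: str  # "synthetic" (from text via TTS) or "real" (un-transcribed speech)


def toy_tts(text: str, feature_dim: int = 4) -> Features:
    """Hypothetical stand-in for a text-to-speech model (assumption, not a real TTS).

    Emits one constant-valued frame per character of the input text.
    """
    return [[(ord(ch) % 32) / 32.0] * feature_dim for ch in text]


def build_pretraining_pool(
    unspoken_text: List[str],
    untranscribed_speech: List[Features],
    tts: Callable[[str], Features] = toy_tts,
) -> List[PretrainExample]:
    """Pool synthetic speech for text-only data with unlabeled real speech."""
    # Each unspoken text utterance gets a synthetic speech representation.
    synthetic = [PretrainExample(tts(t), "synthetic") for t in unspoken_text]
    # Un-transcribed speech is used as-is; it has no paired transcription.
    real = [PretrainExample(f, "real") for f in untranscribed_speech]
    return synthetic + real


pool = build_pretraining_pool(
    ["hello", "ok"],
    [[[0.0] * 4 for _ in range(3)]],  # one 3-frame un-transcribed utterance
)
print([(ex.source, len(ex.features)) for ex in pool])
# → [('synthetic', 5), ('synthetic', 2), ('real', 3)]
```

In a real system the pooled examples would feed a self-supervised objective on the audio encoder; the choice of objective is not stated in the abstract and is left out here.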

Patent Owner(s)

Patent Owner  Address
GOOGLE LLC    1600 AMPHITHEATRE PARKWAY, MOUNTAIN VIEW, CA 94043

Inventor(s)

Inventor Name             Address            # of filed Patents  Total Citations
Chen, Zhehuai             Jersey City, US    21                  26
Mengibar, Pedro J Moreno  Jersey City, US    65                  1165
Ramabhadran, Bhuvana      Mt. Kisco, US      125                 2541
Rosenberg, Andrew M       Brooklyn, US       20                  52
Zhang, Yu                 Mountain View, US  1971                9206
