POSITION-BASED TEXT-TO-SPEECH MODEL

United States of America

APP PUB NO 20250095631A1
SERIAL NO 18528116

Abstract

Position-based text-to-speech model and training techniques are described. A digital document, for instance, is received by an audio synthesis service. A text-to-speech model is utilized by the audio synthesis service to generate digital audio from text included in the digital document. The text-to-speech model, for instance, is configured to generate a text encoding and a document positional encoding from an initial text sequence of the digital document. The document positional encoding is based on a location of the text encoding within the digital document. Digital audio is then generated by the text-to-speech model that includes a spectrogram having a reordered text sequence, which is different from the initial text sequence, by decoding the text encoding and the document positional encoding.
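
The two encodings named in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the method claimed in the application: the function names, the use of sinusoidal features over a hypothetical (page, x, y) location, the toy character-based text encoding, and especially the `reorder` step, which merely sorts by location as a stand-in for the decoder. The sketch stops at the reordered text sequence and does not produce a spectrogram or audio.

```python
import math

def sinusoidal(value, dim):
    """Sinusoidal encoding of one scalar into `dim` numbers (dim must be even)."""
    out = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        out.append(math.sin(value * freq))
        out.append(math.cos(value * freq))
    return out

def document_positional_encoding(page, x, y, dim=4):
    """Hypothetical: concatenate encodings of the page, y and x location."""
    return sinusoidal(page, dim) + sinusoidal(y, dim) + sinusoidal(x, dim)

def text_encoding(token, dim=4):
    """Toy stand-in for a learned text encoder, derived from character codes."""
    total = sum(ord(c) for c in token)
    return [((total * (i + 1)) % 97) / 97.0 for i in range(dim)]

def encode(blocks):
    """blocks: list of (text, page, x, y) tuples in the initial extraction order."""
    return [
        {
            "text": text,
            "text_enc": text_encoding(text),
            "pos_enc": document_positional_encoding(page, x, y),
            "loc": (page, y, x),
        }
        for text, page, x, y in blocks
    ]

def reorder(encoded):
    """Stand-in for the decoder: sort by (page, y, x) to obtain a reading order."""
    return [e["text"] for e in sorted(encoded, key=lambda e: e["loc"])]

# Initial text sequence as it might come out of a document parser (assumed data).
blocks = [
    ("Figure caption", 0, 50, 400),
    ("Title", 0, 50, 20),
    ("Body paragraph", 0, 50, 100),
]
encoded = encode(blocks)
print(reorder(encoded))  # ['Title', 'Body paragraph', 'Figure caption']
```

In this toy example the reordered sequence differs from the initial one only because the assumed locations place the title above the body text; a trained model would learn such ordering from the encodings rather than apply a fixed sort.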

Patent Owner(s)

Patent Owner    Address
ADOBE INC       345 PARK AVENUE SAN JOSE CA 95110-2704

Inventor(s)

Inventor Name            Address             # of Filed Patents    Total Citations
Dernoncourt, Franck      Spokane, US         89                    461
Gu, Jiuxiang             Baltimore, US       30                    57
Jain, Rajiv Bhawanji     Falls Church, US    9                     4
Manocha, Dinesh          Bethesda, US        19                    262
Mathur, Puneet           Sunnyvale, US       5                     2
Morariu, Vlad Ion        Potomac, US         21                    88
Nenkova, Ani             Philadelphia, US    14                    110
Tran, Quan Hung          San Jose, US        13                    87
