MULTI-SPEAKER DATA AUGMENTATION FOR IMPROVED END-TO-END AUTOMATIC SPEECH RECOGNITION

United States of America Patent

APP PUB NO 20240331684A1
SERIAL NO 18129328

Abstract

Features of two or more single speaker utterances are concatenated together and corresponding labels of the two or more single speaker utterances are concatenated together. Single speaker acoustic embeddings for each of the single speaker utterances of the concatenated single speaker utterances are generated using a single speaker teacher encoder network. An enhanced model is trained on the concatenated single speaker utterances using a classification loss L_CLASS and a representation similarity loss L_REP, the representation similarity loss L_REP defined to influence an embedding derived from the concatenated single speaker utterances, the influence being based on the single speaker acoustic embeddings derived from the single speaker teacher encoder network.
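
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not state the form of the representation similarity loss or how the two losses are combined, so the sketch assumes a mean squared error between embeddings and a weighted sum with a hypothetical weight `rep_weight`. It also assumes the encoders emit one embedding per input frame. All function names are illustrative.

```python
from typing import List, Sequence, Tuple

Frame = List[float]  # one feature (or embedding) vector per time step


def concat_utterances(
    features: Sequence[List[Frame]], labels: Sequence[List[str]]
) -> Tuple[List[Frame], List[str], List[Tuple[int, int]]]:
    """Concatenate single-speaker features and labels; record frame boundaries."""
    mixed_feats: List[Frame] = []
    mixed_labels: List[str] = []
    boundaries: List[Tuple[int, int]] = []
    start = 0
    for feats, labs in zip(features, labels):
        mixed_feats.extend(feats)
        mixed_labels.extend(labs)
        boundaries.append((start, start + len(feats)))
        start += len(feats)
    return mixed_feats, mixed_labels, boundaries


def representation_similarity_loss(
    student_emb: List[Frame],
    teacher_embs: Sequence[List[Frame]],
    boundaries: Sequence[Tuple[int, int]],
) -> float:
    """Assumed form of the similarity loss: mean squared error between the
    student embedding of the concatenated input and each teacher embedding
    of the corresponding single-speaker segment."""
    total, count = 0.0, 0
    for (start, end), teacher in zip(boundaries, teacher_embs):
        for s_frame, t_frame in zip(student_emb[start:end], teacher):
            for s, t in zip(s_frame, t_frame):
                total += (s - t) ** 2
                count += 1
    return total / count if count else 0.0


def total_loss(l_class: float, l_rep: float, rep_weight: float = 1.0) -> float:
    """Assumed combination: classification loss plus weighted similarity loss."""
    return l_class + rep_weight * l_rep


# Toy data: two utterances with 2 and 1 frames of 2-dimensional features.
utt_a = [[0.0, 1.0], [1.0, 1.0]]
utt_b = [[2.0, 0.0]]
feats, labels, bounds = concat_utterances([utt_a, utt_b], [["hello"], ["world"]])

# Stand-ins for encoder outputs: the teacher sees each utterance alone,
# the student sees the concatenation.
teacher_embs = [utt_a, utt_b]
student_emb = [[0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
l_rep = representation_similarity_loss(student_emb, teacher_embs, bounds)
```

In this sketch the recorded boundaries let each teacher embedding be compared only against the matching span of the student embedding, which is one plausible way for the single-speaker embeddings to influence the embedding of the concatenated input.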

Patent Owner(s)

Patent Owner: INTERNATIONAL BUSINESS MACHINES CORPORATION
Address: NEW ORCHARD ROAD, ARMONK, NY 10504

Inventor(s)

Inventor Name          Address               # of Filed Patents   Total Citations
Kingsbury, Brian E D   Cortlandt Manor, US   34                   572
Kuo, Hong-Kwang        Pleasantville, US     18                   409
Saon, George Andrei    Stamford, US          19                   112
Thomas, Samuel         White Plains, US      50                   241
