SYSTEM AND METHOD FOR EFFICIENT TEXT-GUIDED GENERATION OF HIGH-RESOLUTION VIDEOS

United States of America

APP PUB NO 20250111552A1
SERIAL NO 18819064

Abstract

Systems and methods are disclosed that train a content frame-motion latent diffusion model (CMD) and use the CMD to generate requested videos. The CMD may be a two-stage framework that first compresses videos into a succinct latent space and then learns the video distribution in that latent space. For instance, the CMD may include an autoencoder and two diffusion models. In the first stage, the autoencoder learns a low-dimensional latent decomposition of a video into a content frame and a latent motion representation. In the second stage, without adding any new parameters, the content frame distribution may be learned by fine-tuning a pretrained image diffusion model, which allows the CMD to leverage the rich visual knowledge in pretrained image diffusion models. In addition, a new lightweight diffusion model may be used to generate motion latent representations conditioned on the given content frame.
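The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation: every name here (`Latents`, `encode`, `decode`, `sample_content_frame`, `sample_motion`, `generate_video`) is hypothetical, the "autoencoder" is a fixed mean-and-offset decomposition rather than a learned one, and both diffusion models are replaced by random-number placeholders. The `prompt` argument is likewise an assumption based on the title's "text-guided" wording. Only the ordering of the steps follows the abstract: sample a content frame, sample a motion latent conditioned on that frame, then decode both into frames.

```python
import random
from dataclasses import dataclass
from typing import List

Frame = List[float]   # a flattened toy "image"
Video = List[Frame]   # a sequence of frames


@dataclass
class Latents:
    content_frame: Frame   # image-like summary of the whole video
    motion: List[float]    # low-dimensional motion representation


def encode(video: Video) -> Latents:
    """Stage 1 (toy): split a video into a content frame and a motion latent.

    Assumption: the content frame is the per-pixel mean over frames and the
    motion latent is one scalar offset per frame. The source only states that
    this decomposition is learned by an autoencoder.
    """
    n_frames, n_pixels = len(video), len(video[0])
    content = [sum(f[p] for f in video) / n_frames for p in range(n_pixels)]
    motion = [sum(f[p] - content[p] for p in range(n_pixels)) / n_pixels
              for f in video]
    return Latents(content, motion)


def decode(latents: Latents) -> Video:
    """Stage 1 (toy): rebuild frames as content frame plus per-frame offset."""
    return [[c + m for c in latents.content_frame] for m in latents.motion]


def sample_content_frame(prompt: str, n_pixels: int,
                         rng: random.Random) -> Frame:
    """Stage 2 placeholder for the fine-tuned image diffusion model.

    The prompt is accepted but ignored by this stand-in.
    """
    return [rng.random() for _ in range(n_pixels)]


def sample_motion(content: Frame, prompt: str, n_frames: int,
                  rng: random.Random) -> List[float]:
    """Stage 2 placeholder for the lightweight motion diffusion model.

    Conditioning on the content frame is mimicked by scaling the sampled
    offsets with the frame's mean intensity.
    """
    scale = sum(content) / len(content)
    return [scale * rng.uniform(-1.0, 1.0) for _ in range(n_frames)]


def generate_video(prompt: str, n_frames: int = 4, n_pixels: int = 6,
                   seed: int = 0) -> Video:
    """Sample a content frame, then motion given that frame, then decode."""
    rng = random.Random(seed)
    content = sample_content_frame(prompt, n_pixels, rng)
    motion = sample_motion(content, prompt, n_frames, rng)
    return decode(Latents(content, motion))
```

Calling `generate_video("a cat walking")` returns a list of `n_frames` frames of `n_pixels` values each. In this sketch the two sampling stubs are the only parts a real system would swap for trained diffusion models, while `encode` and `decode` stand in for the autoencoder.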

Patent Owner(s)

Patent Owner         Address
NVIDIA CORPORATION   2788 SAN TOMAS EXPRESSWAY SANTA CLARA CA 95051

Inventor(s)

Inventor Name            Address           # of Filed Patents   Total Citations
Anandkumar, Animashree   Pasadena, US      18                   106
Huang, De-An             Cupertino, US     9                    10
Li, Boyi                 Berkeley, US      5                    1
Nie, Weili               Sunnyvale, US     14                   21
Yu, Sihyun               Gyeonggi-do, KR   1                    0
