VDT: General-purpose Video Diffusion Transformers via Mask Modeling
This work introduces the Video Diffusion Transformer (VDT), which pioneers the use of transformers in diffusion-based video generation. It features transformer blocks with modularized temporal and spatial attention modules that leverage the rich spatial-temporal representations inherent in transformers.
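To make the idea of separate temporal and spatial attention modules concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the VDT implementation: the names `attend`, `temporal_attention`, `spatial_attention`, and `block` are hypothetical, attention is single-head with no learned projections, and the temporal-then-spatial order is assumed. Residual connections, normalization, and the feed-forward layer are left out.

```python
import math


def attend(tokens):
    """Single-head self-attention over a list of D-dim vectors.

    Assumption: no learned query/key/value projections; each token acts
    as its own query, key, and value.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w / total * v[i] for w, v in zip(weights, tokens)) for i in range(d)]
        )
    return out


def temporal_attention(video):
    """video[t][n] is the token at frame t, spatial position n.

    Each spatial position attends only across frames.
    """
    num_frames, num_positions = len(video), len(video[0])
    out = [[None] * num_positions for _ in range(num_frames)]
    for n in range(num_positions):
        mixed = attend([video[t][n] for t in range(num_frames)])
        for t in range(num_frames):
            out[t][n] = mixed[t]
    return out


def spatial_attention(video):
    """Each frame attends only across its own spatial positions."""
    return [attend(frame) for frame in video]


def block(video):
    """Hypothetical block: temporal attention, then spatial attention."""
    return spatial_attention(temporal_attention(video))


# Usage: 2 frames, 3 spatial tokens per frame, 2 channels per token.
video = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]],
]
out = block(video)
print(len(out), len(out[0]), len(out[0][0]))
```

Factorizing attention this way means each call sees a sequence of length T or N rather than T×N, which is the usual motivation for splitting the two modules.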
VDT: General-purpose Video Diffusion Transformers via Mask Modeling - OpenReview
Our VDT showcases strong video generation potential and, through our unified spatial-temporal mask modeling mechanism, extends seamlessly to a broader array of video generation tasks and performs well on them, without requiring modifications to the underlying architecture.
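One way a single mask can cover several tasks without touching the architecture is sketched below. The snippet above does not spell out the formulation, so this is an assumption: the model input is composed element-wise as I = F·(1 − M) + C·M, where F is the noisy video, C the conditioning video, and M a binary mask marking known entries. The helper names `compose_input`, `frame_mask`, and `region_mask` are hypothetical, and each token is reduced to a single scalar for brevity.

```python
def compose_input(noisy, condition, mask):
    """Element-wise I = F*(1-M) + C*M over a [frame][position] grid.

    mask == 1 keeps the conditioning value (known content);
    mask == 0 keeps the noisy value (content to be generated).
    """
    return [
        [f * (1 - m) + c * m for f, c, m in zip(f_row, c_row, m_row)]
        for f_row, c_row, m_row in zip(noisy, condition, mask)
    ]


def frame_mask(num_frames, num_positions, known_frames):
    """Mask whole frames as known, e.g. {0} for prediction from the first frame."""
    return [
        [1 if t in known_frames else 0 for _ in range(num_positions)]
        for t in range(num_frames)
    ]


def region_mask(num_frames, num_positions, known_positions):
    """Mask the same spatial positions as known in every frame (completion)."""
    return [
        [1 if n in known_positions else 0 for n in range(num_positions)]
        for _ in range(num_frames)
    ]


# Usage: 4 frames, 2 spatial positions each.
T, N = 4, 2
noisy = [[9.0] * N for _ in range(T)]                      # stand-in for noise
condition = [[float(10 * t + n) for n in range(N)] for t in range(T)]

tasks = {
    "unconditional": frame_mask(T, N, set()),              # nothing known
    "prediction": frame_mask(T, N, {0}),                   # first frame known
    "interpolation": frame_mask(T, N, {0, T - 1}),         # endpoints known
    "completion": region_mask(T, N, {0}),                  # one region known
}
for name, mask in tasks.items():
    print(name, compose_input(noisy, condition, mask))
```

Only the mask changes between tasks; the composed input has the same shape in every case, so the same network can consume it.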