We explore a new class of diffusion models based on the transformer architecture. We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that ...
Abstract: Transformer, an attention-based encoder–decoder model, has already revolutionized the field of natural language processing (NLP). Inspired by such significant achievements, some pioneering ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results