Perceiver: General Perception with Iterative Attention (Google DeepMind Research Paper Explained)
Yannic Kilcher
Mar 22, 2021
55,610 views
Fastformer: Additive Attention Can Be All You Need (Machine Learning Research Paper Explained)
MLP-Mixer: An all-MLP Architecture for Vision (Machine Learning Research Paper Explained)
Stanford CS25: V1 I DeepMind's Perceiver and Perceiver IO: new data family architecture
LambdaNetworks: Modeling long-range Interactions without Attention (Paper Explained)
Decoder-Only Transformers, ChatGPTs specific Transformer, Clearly Explained!!!
Decision Transformer: Reinforcement Learning via Sequence Modeling (Research Paper Explained)
Rethinking Attention with Performers (Paper Explained)
Attention in transformers, visually explained | Chapter 6, Deep Learning
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Paper Explained)
What are Transformer Models and how do they work?
FNet: Mixing Tokens with Fourier Transforms (Machine Learning Research Paper Explained)
DDPM - Diffusion Models Beat GANs on Image Synthesis (Machine Learning Research Paper Explained)
∞-former: Infinite Memory Transformer (aka Infty-Former / Infinity-Former, Research Paper Explained)
Cross Attention | Method Explanation | Math Explained
Pretrained Transformers as Universal Computation Engines (Machine Learning Research Paper Explained)
FlashAttention - Tri Dao | Stanford MLSys #67
DeepMind Flamingo explained - 32 images are enough
DINO: Emerging Properties in Self-Supervised Vision Transformers (Facebook AI Research Explained)
Swin Transformer paper animated and explained
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (ML Research Paper Explained)