Transformers explained | The architecture behind LLMs
AI Coffee Break with Letitia
Jan 21, 2024
24,762 views