Reading 12: Self-Attention and Transformers
Readings
• Slides
• FCA - Section 8.5.4
• PML - Section 15.4
• D2L - Sections 11.1 through 11.6
• Attention (HdM)
Videos
• Panopto
• Harvard CS50 (35:40 - 54:15)
• Transformers, the tech behind LLMs (3Blue1Brown)
• Attention in transformers, step-by-step (3Blue1Brown)
• Attention (StatQuest)
• Matrix Math (StatQuest)
• Transformer Neural Networks, ChatGPT's foundation, Clearly Explained!!! (StatQuest)
Notebooks
• Simple Self-Attention
• Attention (rasbt)
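The linked notebooks are the primary reference; as a quick plain-text companion, below is a minimal sketch of single-head scaled dot-product self-attention in NumPy. The names (self_attention, W_q, W_k, W_v) are illustrative placeholders, not taken from the notebooks above, and the random matrices stand in for learned projections.

import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating, for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # Single-head scaled dot-product self-attention (no masking).
    # X: (seq_len, d_model) input embeddings
    # W_q, W_k, W_v: (d_model, d_k) projection matrices (illustrative, random here)
    Q = X @ W_q                               # queries: (seq_len, d_k)
    K = X @ W_k                               # keys:    (seq_len, d_k)
    V = X @ W_v                               # values:  (seq_len, d_k)
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (seq_len, seq_len)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V                        # (seq_len, d_k)

# Toy usage: 4 tokens, 8-dim embeddings, 8-dim head.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 8)

The division by sqrt(d_k) is the scaling from Vaswani et al.: it keeps the dot products from growing with the key dimension, which would otherwise push the softmax into regions with very small gradients.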
Blogposts
• Attention (Lilian Weng)
• Annotated Transformer (Harvard)
• Illustrated Transformer (Alammar)
• Transformers (Lilian Weng)
• Programming Self-Attention (Raschka)
Papers
• Attention Is All You Need (Vaswani et al.)
• Transformer Family Tree (Amatriain et al.)
• BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al.)
• Improving Language Understanding by Generative Pre-Training (Radford et al.)
Quiz