Transformer
Core Papers and Code
Key Concepts
- QKV computation in Self-Attention
- The role of Scaled Dot-Product
- Principles of Multi-Head Attention
- Tokenization and Tokenizers
- Word Embedding
- Positional Encoding
- Attention Mechanism
- Feed Forward Network
- Masking
- Layer Normalization
- Decoding Techniques
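The concepts listed above are the building blocks of the Transformer. As a quick orientation before the deep-dive material, the scaled dot-product attention step (the core of QKV computation) can be sketched in NumPy. This is a minimal illustration under the standard formulation Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, not code from any paper or repository linked on this page:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    # similarity of each query to each key, scaled by sqrt(d_k)
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)
    if mask is not None:
        # positions where mask is False receive ~zero attention weight
        scores = np.where(mask, scores, -1e9)
    # numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V

# toy example: 3 tokens, head dimension d_k = 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))

out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)

# causal (look-ahead) mask, as used in the decoder
causal_mask = np.tril(np.ones((3, 3), dtype=bool))
masked_out = scaled_dot_product_attention(Q, K, V, mask=causal_mask)
```

Multi-head attention runs this same computation in parallel on several learned projections of Q, K, and V, then concatenates the results.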
Deep Dive
- Transformer paper paragraph-by-paragraph reading [Paper Reading]
Attention Mechanism Learning Resources
- [HD bilingual subtitles] Andrew Ng explains Transformer working principles in detail (2025)
- Mastering the Attention Mechanism thoroughly
Contributors
Mira190 — 2 contributions · last 2025/09/13
github-actions[bot] — 1 contribution · last 2026/05/11
longsizhuo — 1 contribution · last 2026/05/06
Involution Hell © 2026 by Community, under CC BY-NC-SA 4.0