A deep dive into the 2017 Transformer paper by Vaswani et al. ("Attention Is All You Need"), which dispensed with recurrence entirely, built its architecture around self-attention, and laid the groundwork for GPT, BERT, and every modern LLM.
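Before diving in, the paper's core operation is worth seeing in miniature. Scaled dot-product attention computes softmax(QKᵀ/√d_k)·V; in self-attention, the queries, keys, and values all come from the same sequence. A minimal NumPy sketch (the function name and toy shapes are illustrative, not from the paper):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Similarity of each query with every key, scaled by sqrt(d_k)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors
    return weights @ V

# Toy self-attention: 3 tokens, model dimension 4, so Q = K = V = X
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(X, X, X)
print(out.shape)  # (3, 4): one contextualized vector per token
```

In the full Transformer, Q, K, and V are separate learned linear projections of the input, and this operation runs in parallel across multiple heads.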