Tag: deep-learning
All the articles with the tag "deep-learning".
-
From Code Points to Subwords: Building a Byte-Level BPE Tokenizer
A walk through tokenization as taught in Stanford's CS336, from Unicode primitives to a full byte-level BPE tokenizer with encode/decode. Covers why UTF-8 wins, how BPE, originally a compression algorithm, carries over to language modeling, and the data structures that make the merge loop tractable on a 10 GB corpus.
-
From MLE to Variational Inference
A comprehensive exploration of the mathematical progression from maximum likelihood estimation through latent variable models to advanced variational inference techniques, including practical VAE implementations and extensions.
-
From Exponential Complexity to Chain Rules: Understanding Autoregressive Generative Models
A deep dive into the fundamentals of generative modeling, exploring how chain rule factorization solves the curse of dimensionality and enables everything from Bayesian networks to modern Transformers.