Tag: deep-learning

All the articles with the tag "deep-learning".

From Code Points to Subwords: Building a Byte-Level BPE Tokenizer

20 Apr, 2026

A walk through tokenization as taught in Stanford's CS336, from Unicode primitives to a full byte-level BPE tokenizer with encode/decode. Covers why UTF-8 wins, how BPE as compression generalizes to language modeling, and the data structures that make the merge loop tractable on a 10GB corpus.

From MLE to Variational Inference

10 Aug, 2025

A comprehensive exploration of the mathematical progression from maximum likelihood estimation through latent variable models to advanced variational inference techniques, including practical VAE implementations and extensions.

From Exponential Complexity to Chain Rules: Understanding Autoregressive Generative Models

6 Aug, 2025

A deep dive into the fundamentals of generative modeling, exploring how chain rule factorization solves the curse of dimensionality and enables everything from Bayesian networks to modern Transformers.
