Learn Code Camp

Training Loop Explained: Batches, Epochs, Iterations, and Convergence

Once you understand neurons, activations, loss functions, and backpropagation, the next thing to understand is the training loop. This is the repetitive engine of deep learning. At a high level, training is boring in the best possible way. It is the same four steps repeated over and over: make a prediction measure the error compute gradients update the weights The interesting part is not the loop itself. The interesting part is how concepts like batch size, epoch, iteration, and convergence affect the behavior of that loop in practice. ...

MSE vs Cross-Entropy: Which Loss Function Should You Use?

Loss functions answer one basic question: How wrong is the model right now? Without a loss function, a neural network has no way to measure its own mistakes, and without that measurement, gradient-based training has nothing to optimize. Two of the most important losses in machine learning are: Mean Squared Error (MSE) Cross-Entropy They are both common. They are both differentiable. But they solve different kinds of problems, and using the wrong one makes training harder than it needs to be. ...

Multi-Layer Perceptron Explained: Dense Networks from First Principles

A multi-layer perceptron (MLP) is one of the simplest and most important neural network architectures. It is not flashy. It is not state of the art for language or vision by itself. But if you do not understand MLPs, a lot of modern deep learning stays blurry. MLPs teach the core structure of neural networks: inputs become vectors layers apply learned linear transforms activations add nonlinearity deeper layers build more useful internal representations They also still matter in practice. Even transformers contain MLP blocks. Recommendation systems, tabular models, and many small classifiers still use dense networks directly. ...

Forward Pass Explained: From a Single Neuron to Matrix Form

The forward pass is the part of a neural network that actually produces a prediction. You feed inputs into the model, the model applies a sequence of mathematical operations, and an output comes out the other side. That sounds trivial, but it is one of the most important ideas in deep learning because everything else depends on it: the loss compares the forward-pass output to the target backpropagation differentiates through the forward pass training is just repeating the forward pass and improving it The easiest way to understand the forward pass is to start with a single neuron and then scale it up into a full layer written in matrix form. ...

Backpropagation Explained Visually: How Neural Networks Actually Learn

Backpropagation is the core algorithm that makes neural networks trainable. The forward pass tells the model what prediction it currently makes. Backpropagation tells the model how each weight contributed to the error so the optimizer can update those weights in the right direction. People often hear that backpropagation is “just the chain rule,” which is true but not especially helpful. The useful mental model is this: the forward pass computes values the backward pass computes sensitivities each node only needs its own local derivative the full gradient is built by multiplying those local derivatives along the path If that sounds abstract, it becomes much clearer once you look at one neuron first and then scale up. ...

Source Maps Explained: How They Work and Why They Sometimes Leak Source Code

Most developers only think about source maps when DevTools magically shows the original TypeScript instead of unreadable bundled JavaScript. That convenience hides an important fact: A source map is not just “debug metadata.” It is a translation table between generated code and original source code. And depending on how it is emitted, it can contain the original source itself. That is why source maps sit at the intersection of: debugging build tooling browser DevTools error reporting systems like Sentry security and accidental code exposure If you have ever wondered how a minified file can still produce readable stack traces, or how a published .map file can expose a package’s real TypeScript source, this is the mental model you want. ...

How Adblock Extensions Work and How to Customize Their Behavior

When people think about adblock extensions, they usually imagine something simple: “The extension sees an ad and hides it.” That is only part of the story. Tools like uBlock Origin are better understood as content blockers, not just ad blockers. They do block ads, but they also block: trackers popups malware domains anti-blocker scripts other unwanted page behavior Modern blockers such as uBlock Origin mostly work by applying rules to: ...

RoPE Explained: The Positional Encoding Trick Behind Modern Language Models

When people talk about transformers, they usually focus on attention, scale, or training data. But one smaller design choice has an outsized effect on model quality: How does the model know where each token appears in the sequence? That question matters because transformers do not understand order by default. Without positional information, a sequence starts to look more like an unordered set of tokens than a structured sentence, paragraph, or program. ...

GPT-2 XL architecture diagram showing token embeddings, positional embeddings, 48 transformer blocks, 25 attention heads, and the output layer

Understanding LLM Architecture: Layers, Transformer Blocks, and Attention Heads

Large Language Models (LLMs) such as GPT-2, GPT-3, LLaMA, and BERT are built on top of the Transformer architecture. That architecture changed natural language processing by replacing recurrence with attention, which lets models process sequences more efficiently and capture long-range relationships more directly. If you are trying to understand what terms like layer, transformer block, and attention head actually mean, the easiest way is to follow the path a sentence takes through a GPT-style model. ...

How Much Do LLMs Hallucinate in Document Q&A? Key Lessons from a 172B-Token Study

If you are building a RAG system, internal knowledge assistant, or document search chatbot, one question matters more than almost anything else: When the answer is supposed to come from the provided documents, how often does the model still make things up? That is exactly what the March 9, 2026 paper “How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms” tries to measure. ...