CAP theorem cover illustration: a triangle showing consistency, availability, and partition tolerance, with a note that partitions force a trade-off between consistency and availability. When a partition happens, you usually choose CP or AP.

CAP Theorem Explained: Consistency vs Availability in Distributed Systems

The CAP theorem is one of the most important ideas in distributed systems because it explains why “just make it always correct and always online” is not a realistic requirement once multiple nodes and unreliable networks enter the picture. In simple terms, CAP says that when a network partition happens, a distributed system can prioritize either consistency or availability, but not both at the same time. Partition tolerance is not a feature you casually add or remove. If your system runs across multiple machines, partitions are a fact of life, so the real design choice is usually CP vs AP. ...
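The CP vs AP choice described above can be illustrated with a toy model. This is a minimal sketch, not taken from the article: the `Replica` class, its `mode` flag, and the `make_pair` helper are hypothetical names, and real systems replicate over a network rather than by direct assignment.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised by a CP replica that cannot reach its peer."""


@dataclass
class Replica:
    mode: str                      # "CP" or "AP" (hypothetical policy flag)
    data: dict = field(default_factory=dict)
    peer: "Replica | None" = None
    partitioned: bool = False      # True when the link to the peer is down

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse rather than let replicas diverge.
                raise Unavailable("cannot replicate during partition")
            # Availability first: accept locally; replicas may now disagree.
            self.data[key] = value
            return
        # Healthy network: apply the write on both replicas.
        self.data[key] = value
        if self.peer is not None:
            self.peer.data[key] = value


def make_pair(mode):
    """Build two replicas that point at each other."""
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    return a, b
```

With `mode="AP"`, a write during a partition succeeds but the two replicas hold different values afterwards. With `mode="CP"`, the same write raises `Unavailable` and both replicas stay in agreement.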

March 9, 2026 · 4 min · Nitin
Figure: How Attention Evolved, from sequence-to-sequence alignment to long-context decoder efficiency. Timeline: 2014 Bahdanau (additive attention); 2015 Luong (dot-product styles); 2017 Transformer (multi-head self-attn); 2019-2023 sparse, local, linear, MQA, GQA; 2024-2025 DeepSeek (MLA focus). The trend is consistent: keep the expressive power of attention, then remove its biggest bottlenecks.

Attention Mechanisms Explained: Self-Attention, Cross-Attention, Sparse Attention, MQA, GQA, and DeepSeek MLA

Attention is the idea that made modern transformers practical and powerful. Instead of compressing an entire input into one fixed vector, a model can decide, token by token, which earlier pieces of information matter most right now. That sounds simple, but there are many different kinds of attention mechanisms, and they exist because models face different constraints: some need strong alignment between an encoder and a decoder; some need to generate text one token at a time without looking ahead; some need to handle very long documents; and some need to reduce GPU memory traffic at inference time. This article walks through the main families of attention, shows where they fit, and explains why newer variants such as DeepSeek’s multi-head latent attention (MLA) matter. ...
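To make "deciding which earlier pieces of information matter" concrete, here is a minimal sketch of standard scaled dot-product attention with an optional causal mask (the "without looking ahead" case). It uses plain Python lists for readability. The function names are illustrative and not taken from the article, and real implementations use batched tensor operations and learned projections.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                # Hide future positions so token i only sees tokens 0..i.
                scores.append(float("-inf"))
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d))
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs
```

With `causal=True`, the first position can only attend to itself, so its output equals its own value vector. Later positions mix in everything before them.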

March 9, 2026 · 14 min · Nitin

How to Search AWS CloudWatch Logs Effectively

When people say they want to “search in CloudWatch”, what they usually need is CloudWatch Logs Insights. It is much more useful than manually opening individual log streams because you can search across log groups, combine conditions, sort by timestamp, and limit results quickly. That said, AWS also has the basic log search interface inside a log group. If you select all streams and search there, the syntax is different. It uses filter patterns, not Logs Insights query language. ...

March 7, 2026 · 7 min · Nitin
Stripe hosted checkout flow using webhook-first fulfillment and order status polling

Integrating Stripe Payment with Checkout Flow (Webhooks + Polling)

If you use a hosted payment page like Stripe Checkout, the safest architecture is: (1) create an internal order before redirecting; (2) redirect the user to the hosted checkout URL; (3) use webhooks as the source of truth for fulfillment; (4) let the frontend poll order status after return. This avoids race conditions and ensures users still get entitlements even if they close the tab before the success page loads.

Why This Pattern Works

Hosted checkout redirects are excellent for UX and compliance, but redirects are not guaranteed delivery signals. ...
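The webhook-first pattern above can be sketched without any payment SDK. This is a simplified illustration under stated assumptions: the in-memory `ORDERS` store, the function names, and the event shape (`id`, `type`, `order_id`) are hypothetical, and a real handler would verify the webhook signature and use a database. Only the event type string `checkout.session.completed` is an actual Stripe event name.

```python
import uuid

ORDERS = {}            # order_id -> "pending" or "paid"; stand-in for a database
SEEN_EVENTS = set()    # webhook event ids already processed


def create_order():
    """Step 1: record the order before redirecting to hosted checkout."""
    order_id = str(uuid.uuid4())
    ORDERS[order_id] = "pending"
    return order_id


def handle_webhook(event):
    """Step 3: fulfil from the webhook, ignoring duplicate deliveries.

    `event` is a simplified, hypothetical payload with the keys
    "id", "type", and "order_id".
    """
    if event["id"] in SEEN_EVENTS:
        return  # already handled; webhooks can be delivered more than once
    SEEN_EVENTS.add(event["id"])
    if event["type"] == "checkout.session.completed" and event["order_id"] in ORDERS:
        ORDERS[event["order_id"]] = "paid"


def order_status(order_id):
    """Step 4: what the frontend polls after returning from checkout."""
    return ORDERS.get(order_id, "unknown")
```

Because fulfillment happens only in `handle_webhook`, the order becomes `paid` whether or not the user ever reaches the success page, and a repeated delivery of the same event is a no-op.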

March 3, 2026 · 2 min · Nitin

How We Added a Developer Tools Section in Hugo (Client-Side Only)

We recently added a dedicated Developer Tools section to Learn Code Camp and shipped multiple utility tools in one go. The goal was simple: Client-side only. Your data stays in your browser on this page. That requirement shaped every implementation choice.

What We Added

We added a new /tools section with these live tools: JSON Formatter + Validator, Base64 Encode/Decode, URL Encode/Decode, UUID Generator (v4), Unix Timestamp Converter, JWT Decoder, Regex Tester, Text Diff Checker, and Hash Generator (SHA-256, MD5).

Why Client-Side Only?

For utility tools, people often paste sensitive payloads: tokens, configs, logs, API responses, and JSON with private fields. ...

February 24, 2026 · 3 min · Nitin

How Much GPU VRAM Do You Need to Run Large Language Models?

If you’re planning to run open-weight LLMs locally or in production, one of the first questions is: How much GPU VRAM do I actually need? The answer depends on three major components: model weights, KV cache (context memory), and runtime overhead. Let’s break each one down clearly and practically.

1️⃣ Model Weights: The Base Memory Cost

The largest fixed memory cost comes from the model weights.

Simple Formula

Weights (GB) ≈ Parameters (in billions) × (bits per weight / 8) ...
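As a quick worked example of the weights formula above, the sketch below evaluates it for a few configurations. The function name and the example model sizes are illustrative assumptions. The result covers weights only and excludes the KV cache and runtime overhead.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Weights (GB) ~= parameters (in billions) * (bits per weight / 8)."""
    return params_billions * bits_per_weight / 8


# Illustrative sizes (not from the article):
print(weight_memory_gb(7, 16))   # 7B parameters at 16-bit  -> 14.0
print(weight_memory_gb(7, 4))    # 7B parameters at 4-bit   -> 3.5
print(weight_memory_gb(70, 8))   # 70B parameters at 8-bit  -> 70.0
```

Halving the bits per weight halves the weight memory, which is why quantization is usually the first lever for fitting a model on a smaller GPU.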

February 16, 2026 · 4 min · Nitin
Migrating WordPress to Hugo with Cloudflare Pages

How to Migrate WordPress to Hugo with Decap CMS and Cloudflare Pages (Free Hosting)

Why Migrate from WordPress?

WordPress is powerful, but for a technical blog that mostly serves static content, it comes with unnecessary overhead: hosting costs, plugin updates, security patches, and slower page loads. Static site generators like Hugo offer a simpler, faster, and cheaper alternative. Here’s what we migrated to: Hugo (blazing fast static site generator); PaperMod (clean, minimal theme perfect for tech blogs); Decap CMS (web-based content management with GitHub backend); Cloudflare Pages (free hosting with global CDN); and Google AdSense (preserved auto ads from the WordPress site). The result? A site that builds in under 1 second, costs $0/month to host, and is served from Cloudflare’s global edge network. ...

February 8, 2026 · 5 min · Nitin

Agentic Vision in Gemini 3 Flash: Turning “Seeing” into an Active Investigation

Frontier vision models have gotten really good at understanding images — but they’ve also had a consistent weakness: They still often treat an image like a single static glance. So if the answer depends on something tiny (a serial number, a distant street sign, a gauge reading, a small UI label), the model might miss it… and then it has to guess. Google’s new capability called Agentic Vision, launched with Gemini 3 Flash, is a major step toward fixing that. ...

January 29, 2026 · 5 min · Nitin

Understanding LLM Inference Basics: Prefill and Decode, TTFT, and ITL

Large language models (LLMs) like GPT-4, Llama, or Grok generate text by running inference — the phase where a trained model produces outputs from a given input prompt. While training is resource-intensive and done once, inference happens every time a user sends a query. Understanding the mechanics of inference is key to grasping why some models feel “fast” while others lag, and why certain optimizations matter. At a high level, modern LLM inference (for autoregressive transformer-based models) splits into two distinct phases: prefill and decode. These phases behave very differently in terms of computation and directly affect two critical user-facing metrics: Time to First Token (TTFT) and Inter-Token Latency (ITL). ...
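The two metrics named above can be computed from token arrival timestamps. This is a minimal sketch assuming ITL is reported as the mean gap between consecutive output tokens, which is one common convention. The function name and the timestamps in the usage note are hypothetical.

```python
def latency_metrics(request_time, token_times):
    """Return (TTFT, mean ITL) in the same time unit as the inputs.

    request_time: when the request was sent.
    token_times:  arrival time of each generated token, in order.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")
    # TTFT covers queueing plus the prefill phase, up to the first token.
    ttft = token_times[0] - request_time
    # ITL is measured during decode, between consecutive tokens.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    itl = sum(gaps) / len(gaps) if gaps else 0.0
    return ttft, itl
```

For example, `latency_metrics(0.0, [0.5, 0.75, 1.0, 1.25])` returns a TTFT of 0.5 seconds and a mean ITL of 0.25 seconds. A long prompt mainly raises the first number, while decode speed determines the second.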

December 21, 2025 · 5 min · Nitin

Analysis of the OpenAI Home Directory

Recently, someone shared a screenshot on x.com showing how to download the OpenAI home directory. I tried it, and it works. In this post, we will try to understand exactly what this home directory contains.

Embedded post: "working with GPT-5.2 thinking with gpt 5.2, i got error zip file not found. https://t.co/c1zTfBlWb9 pic.twitter.com/85tEv28MuJ" (Nitin Kalra, @nkalra0123, December 13, 2025, https://twitter.com/nkalra0123/status/1999771366397231386?ref_src=twsrc%5Etfw)

Let’s analyse the contents.

Inside the OpenAI home directory

oai/ Folder: Slides, Docs, PDFs, and Spreadsheets Tooling

This folder is a small toolkit for working with common “office” artifacts – PowerPoint decks, DOCX files, PDFs, and spreadsheets. It combines a few Python utilities with a set of practical guides that describe the preferred tools and a quality-check workflow (render → visually inspect → iterate). ...

December 13, 2025 · 5 min · Nitin