Bits-per-Byte (BPB): a tokenizer-agnostic way to measure LLMs

Karpathy recently released the nanochat repo, which contains code for training the best ChatGPT under $100. While skimming the high-level code, I came across bits per byte instead of the typical cross-entropy loss. I found it interesting, so I decided to dig in. TL;DR: Bits per byte (BPB) is just cross-entropy measured per byte. We divide the total cross-entropy (in nats) by the number of bytes and by ln(2) to convert to bits. Because it's per byte, BPB is tokenizer-agnostic and lets you compare models fairly even when they use different vocabularies and rules....
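To make the TL;DR concrete, here is a minimal sketch of the calculation. It assumes the per-token cross-entropy losses are given in nats and that bytes means the UTF-8 length of the evaluated text. The function name `bits_per_byte` and the loss values are made up for illustration; this is not taken from the nanochat code.

```python
import math

def bits_per_byte(token_losses_nats, text):
    """Total cross-entropy in nats, divided by the UTF-8 byte count and ln(2)."""
    total_nats = sum(token_losses_nats)
    n_bytes = len(text.encode("utf-8"))
    if n_bytes == 0:
        raise ValueError("text must contain at least one byte")
    return total_nats / (n_bytes * math.log(2))

# Hypothetical losses: the same text scored under two different tokenizations.
text = "hello world"
coarse = [3.8, 3.8]            # 2 large tokens, 3.8 nats each
fine = [1.9, 1.9, 1.9, 1.9]    # 4 small tokens, 1.9 nats each

print(sum(coarse) / len(coarse), sum(fine) / len(fine))  # per-token loss differs
print(bits_per_byte(coarse, text), bits_per_byte(fine, text))  # BPB matches
```

The mean per-token loss depends on how many tokens the tokenizer produces, while BPB depends only on the total loss and the byte count, which is why it can be compared across tokenizers.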

October 15, 2025 | Estimated Reading Time: 4 min |  Author: Dipkumar Patel

GPT-5 Router - Inevitable Future of Chat Interfaces

"OpenAI GPT-5 Router is like Apple removing headphone jack. It sucks but everyone will follow it." (immortal, @immortal_0698, August 14, 2025) What is the GPT-5 Router? The GPT-5 router picks the right model for each request in real time. In plain English: easy stuff goes to the small model; complex stuff goes to the big brain. The goal is simple: better answers per dollar and per millisecond by mixing models instead of forcing a single static choice....

August 13, 2025 | Estimated Reading Time: 4 min |  Author: Dipkumar Patel

AWS Bedrock - Converse API - A single endpoint for all models?

Amazon Bedrock is a fully managed service that makes high-performing foundation models (FMs) from leading AI startups and Amazon available for your use through a unified API. You can choose from a wide range of foundation models to find the model that is best suited for your use case. Amazon Bedrock also offers a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI. With Amazon Bedrock, you can easily experiment with and evaluate top foundation models for your use cases, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources....

June 13, 2024 | Estimated Reading Time: 4 min |  Author: Dipkumar Patel