Reduce LLM inference cost for AI character chat
TypoMonster Chat is an LLM orchestration layer built for AI character chat. Cut inference costs with intelligent routing, caching, and analytics, all through a drop-in SDK replacement.
Get started in two steps
Issue an API key
Create a project and generate an API key from the dashboard. Each key tracks usage and costs independently.
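Once a key exists, a common pattern is to read it from an environment variable and fail early if it is missing, rather than hard-coding it. The following is a minimal sketch: the helper name requireApiKey is hypothetical and not part of any SDK, and TYPOMONSTER_API_KEY is simply the variable name used in the examples on this page.

```typescript
// Hypothetical helper (not part of any SDK): read an API key from an
// environment-like record and fail fast if it is missing or blank.
function requireApiKey(
  env: Record<string, string | undefined>,
  name: string = "TYPOMONSTER_API_KEY",
): string {
  const value = (env[name] ?? "").trim();
  if (value.length === 0) {
    throw new Error(`Missing ${name}: generate a key in the dashboard and export it.`);
  }
  return value;
}
```

In a Node.js app this would typically be called once at startup, for example `const apiKey = requireApiKey(process.env);`, so a misconfigured deployment fails before the first request is sent.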
Drop in the ai-proxy SDK
Replace your existing AI SDK with @typochat-sdk/core. Same interface, lower costs, and no code rewrite: only the import and the API key change.
// Before: calling Google directly
import { generateText } from "ai";
import { createGoogleGenerativeAI } from "@ai-sdk/google";

const google = createGoogleGenerativeAI({
  apiKey: process.env.GOOGLE_API_KEY,
});

const { text } = await generateText({
  model: google("gemini-3.1-pro-preview"),
  prompt: "Hello!",
});

// After: the same call routed through TypoMonster Chat
import { generateText } from "ai";
import { createProxyGoogle } from "@typochat-sdk/google";

const google = createProxyGoogle({
  apiKey: process.env.TYPOMONSTER_API_KEY,
});

const { text } = await generateText({
  model: google("gemini-3.1-pro-preview"),
  prompt: "Hello!",
});

Features
Realtime Analytics
Monitor token usage, latency, and costs across all providers in real time. Spot anomalies and optimize spend from a single dashboard.
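To make the per-key cost tracking concrete, here is a rough sketch of how token counts could be turned into cost and latency summaries. Everything in it is an assumption for illustration: the UsageRecord shape, the Price fields, and the summarizeByKey function are hypothetical, and the prices passed in are placeholders rather than real provider rates. It does not describe how the dashboard itself is implemented.

```typescript
// Illustrative only: the record shape and prices are assumptions,
// not data returned by any real API.
interface UsageRecord {
  apiKeyId: string;     // which project key made the request
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

interface Price {
  inputPerMillion: number;  // USD per 1M input tokens (placeholder value)
  outputPerMillion: number; // USD per 1M output tokens (placeholder value)
}

interface KeySummary {
  requests: number;
  costUsd: number;
  avgLatencyMs: number;
}

// Group usage records by API key, summing cost and keeping a running
// mean of latency for each key.
function summarizeByKey(records: UsageRecord[], price: Price): Map<string, KeySummary> {
  const summaries = new Map<string, KeySummary>();
  for (const r of records) {
    const cost =
      (r.inputTokens * price.inputPerMillion +
        r.outputTokens * price.outputPerMillion) / 1e6;
    const s = summaries.get(r.apiKeyId) ?? { requests: 0, costUsd: 0, avgLatencyMs: 0 };
    s.avgLatencyMs = (s.avgLatencyMs * s.requests + r.latencyMs) / (s.requests + 1);
    s.requests += 1;
    s.costUsd += cost;
    summaries.set(r.apiKeyId, s);
  }
  return summaries;
}
```

Keeping the summary keyed by API key mirrors the idea that each key tracks usage and costs independently; a spike in one key's cost or latency then stands out without being averaged away by the others.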
Playground
Try it without writing any code. Test prompts against multiple models side-by-side, tweak parameters, and see how our system works before integrating.
Developer Friendly
Works with the tools you already use. Drop in our Google proxy SDK, or use Google Vertex through Express Mode with createProxyVertex and the Vercel AI SDK.
import { createProxyGoogle } from "@typochat-sdk/google";

const google = createProxyGoogle({
  apiKey: process.env.TYPOMONSTER_API_KEY, // express mode
});