Reduce LLM inference cost for AI character chat
TypoMonster Chat is an LLM orchestration layer built for AI character chat. Cut inference costs with intelligent routing, caching, and analytics — all through a drop-in SDK replacement.
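Caching is one of the cost levers mentioned above. As a minimal sketch of the idea (illustrative only, not TypoMonster's actual implementation), completed responses can be keyed by model and prompt so that a repeated request never reaches the provider:

```typescript
// Illustrative in-memory response cache. A production service would
// presumably do this server-side with TTLs and eviction; this sketch
// only shows why repeated prompts stop costing inference tokens.
type Generate = (model: string, prompt: string) => Promise<string>;

function withCache(generate: Generate): Generate & { hits: () => number } {
  const cache = new Map<string, string>();
  let hits = 0;
  const cached = async (model: string, prompt: string) => {
    const key = `${model}\u0000${prompt}`;
    const found = cache.get(key);
    if (found !== undefined) {
      hits++;
      return found; // served from cache: zero inference cost
    }
    const text = await generate(model, prompt);
    cache.set(key, text);
    return text;
  };
  return Object.assign(cached, { hits: () => hits });
}
```

Wrapping a generate function this way means the second identical call returns instantly from the cache while the underlying model is invoked only once.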
Get started in two steps
Issue an API key
Create a project and generate an API key from the dashboard. Each key tracks usage and costs independently.
Drop in the ai-proxy SDK
Replace your existing AI SDK with @ai-proxy/core. Same interface, lower costs — no code rewrite needed.
Before (calling Google directly through the Vercel AI SDK):

import { generateText } from "ai";
import { createGoogleGenerativeAI } from "@ai-sdk/google";

const google = createGoogleGenerativeAI({
  apiKey: process.env.GOOGLE_API_KEY,
});

const { text } = await generateText({
  model: google("gemini-3.1-pro"),
  prompt: "Hello!",
});

After (same call, routed through ai-proxy):

import { generateText } from "ai";
import { createProxyGoogle } from "@ai-proxy/google";

const google = createProxyGoogle({
  apiKey: process.env.TYPOMONSTER_API_KEY,
});

const { text } = await generateText({
  model: google("gemini-3.1-pro"),
  prompt: "Hello!",
});

Features
Realtime Analytics
Monitor token usage, latency, and costs across all providers in real time. Spot anomalies and optimize spend from a single dashboard.
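The arithmetic behind a cost dashboard can be sketched as tokens multiplied by per-token rates. The prices below are made-up numbers for the example, not real provider rates:

```typescript
// Hypothetical per-million-token prices in USD. Real rates vary by
// provider and model; these figures are illustrative only.
const PRICES: Record<string, { input: number; output: number }> = {
  "gemini-3.1-pro": { input: 1.25, output: 5.0 },
};

// Estimate the cost of a single request from its token counts.
function estimateCostUSD(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const p = PRICES[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

// A 2,000-token prompt with a 500-token reply:
// (2000 * 1.25 + 500 * 5.0) / 1e6 = 0.005
console.log(estimateCostUSD("gemini-3.1-pro", 2000, 500)); // 0.005
```

Summing this per request, tagged by project key, is what lets spend be broken down per provider and per model.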
Playground
Try it without writing any code. Test prompts against multiple models side-by-side, tweak parameters, and see how our system works before integrating.
Developer Friendly
Works with the tools you already use. Drop in our proxy SDK, use the OpenAI-compatible API, call via cURL, or integrate with the Vercel AI SDK — your choice.
import { createProxyGoogle } from "@ai-proxy/google";

const google = createProxyGoogle({
  apiKey: process.env.TYPOMONSTER_API_KEY,
});
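For the OpenAI-compatible path, requests follow the standard chat-completions payload shape. The endpoint URL below is a placeholder, not a documented TypoMonster address; the sketch just shows the body you would send:

```typescript
// Build an OpenAI-style chat-completions payload. The shape
// (model, messages[{role, content}]) follows the OpenAI Chat
// Completions API; the endpoint URL is a placeholder.
function buildChatRequest(model: string, userMessage: string) {
  return {
    model,
    messages: [{ role: "user" as const, content: userMessage }],
  };
}

const body = buildChatRequest("gemini-3.1-pro", "Hello!");

// Send with fetch; replace the placeholder URL with the endpoint
// shown in your dashboard.
// await fetch("https://api.example.com/v1/chat/completions", {
//   method: "POST",
//   headers: {
//     "Content-Type": "application/json",
//     Authorization: `Bearer ${process.env.TYPOMONSTER_API_KEY}`,
//   },
//   body: JSON.stringify(body),
// });
console.log(JSON.stringify(body));
```

The same payload works from cURL or any OpenAI client that lets you override the base URL, since only the host and the API key change.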