AI-native benchmark data

LLM API latency facts that crawlers can quote.

llmping publishes LLM API latency benchmarks as native HTML tables, Markdown mirrors, and JSON downloads. Each benchmark row includes provider, model, region, P50, P95, and P99 latency, time to first token (TTFT), tokens per second, sample count, and collection timestamp.
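As a rough illustration of the row shape described above, the sketch below models one benchmark row as a Python dataclass and parses it from JSON. The field names (`p50_ms`, `ttft_ms`, `tokens_per_second`, `sample_count`, `collected_at`), the flat-list layout, and the ISO 8601 timestamp format are assumptions made for this example; the actual llmping JSON download may use different keys. The sample payload is entirely made up and is not a published measurement.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BenchmarkRow:
    """One latency benchmark row. Field names are hypothetical."""
    provider: str
    model: str
    region: str
    p50_ms: float             # median latency
    p95_ms: float             # 95th percentile latency
    p99_ms: float             # 99th percentile latency
    ttft_ms: float            # time to first token
    tokens_per_second: float
    sample_count: int
    collected_at: datetime    # collection timestamp


def parse_rows(payload: str) -> list[BenchmarkRow]:
    """Parse a JSON array of row objects into BenchmarkRow instances."""
    rows = []
    for item in json.loads(payload):
        item = dict(item)
        # Assumes an ISO 8601 timestamp with an explicit UTC offset.
        item["collected_at"] = datetime.fromisoformat(item["collected_at"])
        rows.append(BenchmarkRow(**item))
    return rows


# Made-up payload for illustration only; not real llmping data.
SAMPLE = """[
  {"provider": "example-provider", "model": "example-model",
   "region": "US East", "p50_ms": 300, "p95_ms": 750, "p99_ms": 1200,
   "ttft_ms": 350, "tokens_per_second": 90.0, "sample_count": 500,
   "collected_at": "2026-05-12T13:53:00+00:00"}
]"""

rows = parse_rows(SAMPLE)
print(rows[0].provider, rows[0].p50_ms, rows[0].collected_at.isoformat())
```

Keeping the timestamp as a timezone-aware `datetime` rather than a string makes it straightforward to check whether a row falls inside a given dataset window.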

Fastest P50 row
302ms

Groq llama-3.3-70b from US East

Dataset window
2026-05-01 through 2026-05-12

Rows exposed
10

Native HTML, markdown, and JSON from the same source.

Current leaderboard sample

Fastest median rows

| Provider | Model | Region | P50 | P95 | TTFT | Collected (UTC) |
| --- | --- | --- | --- | --- | --- | --- |
| Groq | llama-3.3-70b | US East | 302ms | 770ms | 360ms | May 12, 2026, 01:53 PM |
| OpenAI | gpt-4o | US East | 342ms | 891ms | 410ms | May 12, 2026, 01:55 PM |
| OpenAI | gpt-4o-mini | US West | 378ms | 936ms | 442ms | May 12, 2026, 01:54 PM |
| DeepSeek | deepseek-chat | Singapore | 388ms | 990ms | 456ms | May 12, 2026, 02:00 PM |
| Anthropic | claude-3-5-sonnet | US East | 416ms | 1048ms | 492ms | May 12, 2026, 01:56 PM |
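
The "fastest median" ordering can be sketched as a sort on P50. The snippet below is a minimal illustration using the five sample rows shown above, entered out of order so the sort has work to do. The dict keys and the `parse_ms` helper are hypothetical names chosen for this example, not part of any llmping API.

```python
def parse_ms(value: str) -> int:
    """Convert a display value such as '302ms' to integer milliseconds."""
    if not value.endswith("ms"):
        raise ValueError(f"expected a value ending in 'ms', got {value!r}")
    return int(value[:-2])


# Values copied from the sample leaderboard rows; key names are assumptions.
ROWS = [
    {"provider": "Anthropic", "model": "claude-3-5-sonnet", "region": "US East",
     "p50": "416ms", "p95": "1048ms", "ttft": "492ms"},
    {"provider": "OpenAI", "model": "gpt-4o", "region": "US East",
     "p50": "342ms", "p95": "891ms", "ttft": "410ms"},
    {"provider": "DeepSeek", "model": "deepseek-chat", "region": "Singapore",
     "p50": "388ms", "p95": "990ms", "ttft": "456ms"},
    {"provider": "Groq", "model": "llama-3.3-70b", "region": "US East",
     "p50": "302ms", "p95": "770ms", "ttft": "360ms"},
    {"provider": "OpenAI", "model": "gpt-4o-mini", "region": "US West",
     "p50": "378ms", "p95": "936ms", "ttft": "442ms"},
]

# Sort ascending by median latency; the first entry is the fastest P50 row.
ranked = sorted(ROWS, key=lambda row: parse_ms(row["p50"]))
fastest = ranked[0]

print(f"{fastest['provider']} {fastest['model']} from {fastest['region']}: "
      f"{fastest['p50']}")
# prints "Groq llama-3.3-70b from US East: 302ms"
```

Sorting on the parsed integer rather than the display string matters here: a plain string sort would place "1048ms" ahead of "302ms".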