
55 points by mrqjr | 1 comment

I recently built a small open-source tool to benchmark different LLM API endpoints — including OpenAI, Claude, and self-hosted models (like llama.cpp).

It runs a configurable number of test requests and reports two key metrics:
• First-token latency (ms): how long it takes for the first token to arrive
• Output speed (tokens/sec): how fast tokens stream once generation starts
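For context on how these two numbers can be collected, here is a minimal sketch (not the tool's own code) of a single streaming request against an OpenAI-compatible chat endpoint; the base URL, API key, and model name are placeholders, and counting streamed chunks is only a rough proxy for tokens:

    import json
    import time

    import requests

    def benchmark_endpoint(base_url, api_key, model, prompt):
        """Measure first-token latency and output speed for one streaming request."""
        headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
        payload = {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,
        }

        start = time.perf_counter()
        first_token_at = None
        chunks = 0  # streamed content chunks, a rough proxy for tokens

        with requests.post(f"{base_url}/chat/completions", headers=headers,
                           json=payload, stream=True, timeout=120) as resp:
            resp.raise_for_status()
            for line in resp.iter_lines():
                if not line or not line.startswith(b"data: "):
                    continue  # skip keep-alives and non-data SSE lines
                data = line[len(b"data: "):]
                if data == b"[DONE]":
                    break
                choices = json.loads(data).get("choices") or []
                if choices and choices[0].get("delta", {}).get("content"):
                    if first_token_at is None:
                        first_token_at = time.perf_counter()  # first content seen
                    chunks += 1

        end = time.perf_counter()
        ttft_ms = (first_token_at - start) * 1000 if first_token_at else None
        tps = chunks / (end - first_token_at) if first_token_at and end > first_token_at else 0.0
        return ttft_ms, tps

    if __name__ == "__main__":
        ttft, tps = benchmark_endpoint("https://api.openai.com/v1", "sk-...", "gpt-4o-mini",
                                       "Say hello in one sentence.")
        print(f"first-token latency: {ttft:.0f} ms, output speed: {tps:.1f} tokens/sec")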

Demo: https://llmapitest.com/
Code: https://github.com/qjr87/llm-api-test

The goal is to provide a simple, visual, and reproducible way to evaluate performance across different LLM providers, including the growing number of third-party “proxy” or “cheap LLM API” services.

It supports:
• OpenAI-compatible APIs (official and proxies)
• Claude (via the Anthropic API)
• Local endpoints (custom/self-hosted)

You can also self-host it with docker-compose. The config is clean; adding a new provider only requires a simple plugin-style addition.
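I don't know the repo's actual config schema, but to make "plugin-style addition" concrete, here is a hypothetical sketch of what a provider descriptor could look like; the class and field names are invented, while the endpoint paths and auth headers match the real provider APIs:

    from dataclasses import dataclass, field

    @dataclass
    class Provider:
        """Hypothetical provider descriptor (illustrative, not the repo's actual schema)."""
        name: str
        base_url: str
        chat_path: str                    # path of the streaming chat endpoint
        auth_header: str                  # header that carries the API key
        auth_prefix: str = ""             # e.g. "Bearer " for OpenAI-style auth
        extra_headers: dict = field(default_factory=dict)

    # OpenAI-compatible APIs (official or proxies) share the same request shape.
    openai_style = Provider(
        name="openai",
        base_url="https://api.openai.com/v1",
        chat_path="/chat/completions",
        auth_header="Authorization",
        auth_prefix="Bearer ",
    )

    # Anthropic's Messages API uses a different path, auth header, and a version header.
    claude = Provider(
        name="claude",
        base_url="https://api.anthropic.com/v1",
        chat_path="/messages",
        auth_header="x-api-key",
        extra_headers={"anthropic-version": "2023-06-01"},
    )

    # A local llama.cpp server exposes an OpenAI-compatible endpoint, typically without auth.
    local = Provider(
        name="llama.cpp",
        base_url="http://localhost:8080/v1",
        chat_path="/chat/completions",
        auth_header="Authorization",
    )

Under this kind of scheme, supporting a new proxy or self-hosted service would come down to adding one more descriptor rather than touching the benchmarking code.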

Would love feedback, PRs, or even test reports from APIs you’re using. Especially interested in how some lesser-known services compare.

mdhb | No.44415167
In what universe does a post created by a new account with zero comments and a grand total of 2 votes over the course of 2 hours end up on the front page?
replies(5): >>44415292 >>44415964 >>44417201 >>44417318 >>44418605
bdangubic | No.44417318
I am polishing up my blog about some FORTRAN code I wrote last week in hopes of the same :)