ONNX models can be loaded and executed in the browser with transformers.js: https://github.com/huggingface/transformers.js/
You can even build and statically host indices like HNSW for embeddings.
I put together a little open-source demo for this here: https://jasonjmcghee.github.io/portable-hnsw/ (it's a prototype / hacked-together approximation of HNSW, but you could implement the real thing)
Long story short: represent the indices as queryable Parquet files and use DuckDB to query them.
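To make the Parquet + DuckDB idea a bit more concrete, here's a rough TypeScript sketch. It is not the code from the demo. It assumes a hypothetical layout with one row per graph node (columns `id`, `vector`, `neighbors`), builds the SQL that a DuckDB client could run via `read_parquet`, and does a plain greedy walk over a single graph layer, which is a simplification of real HNSW. The names `NodeRow`, `nodesQuery`, `greedySearch`, and `fetchNodes` are made up for illustration, and `fetchNodes` is synchronous here only to keep the sketch short; a real lookup against a hosted file would be async.

```typescript
// Hypothetical row shape: one Parquet row per graph node (assumed schema).
interface NodeRow {
  id: number;
  vector: number[];
  neighbors: number[];
}

// Build the SQL a DuckDB client could run against a statically hosted file.
// `read_parquet` is a DuckDB table function; the column names are assumptions.
function nodesQuery(parquetUrl: string, ids: number[]): string {
  if (ids.length === 0) throw new Error("need at least one id");
  const safeUrl = parquetUrl.replace(/'/g, "''"); // escape single quotes
  const idList = ids.map((id) => Math.trunc(id)).join(", ");
  return `SELECT id, vector, neighbors FROM read_parquet('${safeUrl}') WHERE id IN (${idList})`;
}

// Cosine distance: 0 means same direction, larger means less similar.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy walk over one graph layer: keep moving to whichever neighbor is
// closer to the query, and stop when no neighbor improves on the current node.
// `fetchNodes` stands in for "run nodesQuery(...) and decode the rows".
function greedySearch(
  query: number[],
  entryId: number,
  fetchNodes: (ids: number[]) => NodeRow[],
): NodeRow {
  let [current] = fetchNodes([entryId]);
  if (!current) throw new Error(`entry node ${entryId} not found`);
  let best = cosineDistance(query, current.vector);

  while (true) {
    if (current.neighbors.length === 0) return current;
    let improved = false;
    for (const candidate of fetchNodes(current.neighbors)) {
      const d = cosineDistance(query, candidate.vector);
      if (d < best) {
        best = d;
        current = candidate;
        improved = true;
      }
    }
    if (!improved) return current;
  }
}
```

Each hop only needs the rows for a handful of neighbor ids, which is why a columnar file you can query selectively is a decent fit for static hosting.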
Depending on how you host it, it's either free or nearly free. I used GitHub Pages, so it's free. Cloudflare R2 would only charge for what you store (very cheap, no egress fees).