VLM Run is a first-of-its-kind API dedicated to running Vision Language Models on Documents, Images, and Video. We're building a stack from the bottom up for 'visual' applications of language models, which we believe will make up more than 90% of inference needs over the next 5 years.
Hybrid, based in the Bay Area, CA
Looking for experience in any of the following:
* ML Domains: Vision Language Models, LLMs, Temporal/Video Models
* Model Training, Evaluation, and Versioning Platforms: Weights & Biases (W&B), Hugging Face
* Infra: Python, PyTorch, Pydantic, CUDA, torch.compile
* DevOps: GitHub CI, Docker, Conda, API billing and monitoring