Most active commenters
  • threeseed(3)

←back to thread

138 points FrasiertheLion | 17 comments | | HN request time: 1.524s | source | bottom

Hello HN! We’re Tanya, Sacha, Jules and Nate from Tinfoil: https://tinfoil.sh. We host models and AI workloads on the cloud while guaranteeing zero data access and retention. This lets us run open-source LLMs like Llama, or Deepseek R1 on cloud GPUs without you having to trust us—or any cloud provider—with private data.

Since AI performs better the more context you give it, we think solving AI privacy will unlock more valuable AI applications, just how TLS on the Internet enabled e-commerce to flourish knowing that your credit card info wouldn't be stolen by someone sniffing internet packets.

We come from backgrounds in cryptography, security, and infrastructure. Jules did his PhD in trusted hardware and confidential computing at MIT, and worked with NVIDIA and Microsoft Research on the same, Sacha did his PhD in privacy-preserving cryptography at MIT, Nate worked on privacy tech like Tor, and I (Tanya) was on Cloudflare's cryptography team. We were unsatisfied with band-aid techniques like PII redaction (which is actually undesirable in some cases like AI personal assistants) or “pinky promise” security through legal contracts like DPAs. We wanted a real solution that replaced trust with provable security.

Running models locally or on-prem is an option, but can be expensive and inconvenient. Fully Homomorphic Encryption (FHE) is not practical for LLM inference for the foreseeable future. The next best option is using secure enclaves: a secure environment on the chip that no other software running on the host machine can access. This lets us perform LLM inference in the cloud while being able to prove that no one, not even Tinfoil or the cloud provider, can access the data. And because these security mechanisms are implemented in hardware, there is minimal performance overhead.

Even though we (Tinfoil) control the host machine, we do not have any visibility into the data processed inside of the enclave. At a high level, a secure enclave is a set of cores that are reserved, isolated, and locked down to create a sectioned off area. Everything that comes out of the enclave is encrypted: memory and network traffic, but also peripheral (PCIe) traffic to other devices such as the GPU. These encryptions are performed using secret keys that are generated inside the enclave during setup, which never leave its boundaries. Additionally, a “hardware root of trust” baked into the chip lets clients check security claims and verify that all security mechanisms are in place.

Up until recently, secure enclaves were only available on CPUs. But NVIDIA confidential computing recently added these hardware-based capabilities to their latest GPUs, making it possible to run GPU-based workloads in a secure enclave.

Here’s how it works in a nutshell:

1. We publish the code that should run inside the secure enclave to Github, as well as a hash of the compiled binary to a transparency log called Sigstore

2. Before sending data to the enclave, the client fetches a signed document from the enclave which includes a hash of the running code signed by the CPU manufacturer. It then verifies the signature with the hardware manufacturer to prove the hardware is genuine. Then the client fetches a hash of the source code from a transparency log (Sigstore) and checks that the hash equals the one we got from the enclave. This lets the client get verifiable proof that the enclave is running the exact code we claim.

3. With the assurance that the enclave environment is what we expect, the client sends its data to the enclave, which travels encrypted (TLS) and is only decrypted inside the enclave.

4. Processing happens entirely within this protected environment. Even an attacker that controls the host machine can’t access this data. We believe making end-to-end verifiability a “first class citizen” is key. Secure enclaves have traditionally been used to remove trust from the cloud provider, not necessarily from the application provider. This is evidenced by confidential VM technologies such as Azure Confidential VM allowing ssh access by the host into the confidential VM. Our goal is to provably remove trust both from ourselves, aka the application provider, as well as the cloud provider.

We encourage you to be skeptical of our privacy claims. Verifiability is our answer. It’s not just us saying it’s private; the hardware and cryptography let you check. Here’s a guide that walks you through the verification process: https://docs.tinfoil.sh/verification/attestation-architectur....

People are using us for analyzing sensitive docs, building copilots for proprietary code, and processing user data in agentic AI applications without the privacy risks that previously blocked cloud AI adoption.

We’re excited to share Tinfoil with HN!

* Try the chat (https://tinfoil.sh/chat): It verifies attestation with an in-browser check. Free, limited messages, $20/month for unlimited messages and additional models

* Use the API (https://tinfoil.sh/inference): OpenAI API compatible interface. $2 / 1M tokens

* Take your existing Docker image and make it end to end confidential by deploying on Tinfoil. Here's a demo of how you could use Tinfoil to run a deepfake detection service that could run securely on people's private videos: https://www.youtube.com/watch?v=_8hLmqoutyk. Note: This feature is not currently self-serve.

* Reach out to us at contact@tinfoil.sh if you want to run a different model or want to deploy a custom application, or if you just want to learn more!

Let us know what you think, we’d love to hear about your experiences and ideas in this space!

1. Etheryte ◴[] No.43997573[source]
How large do you wager your moat to be? Confidential computing is something all major cloud providers either have or are about to have and from there it's a very small step to offer LLM-s under the same umbrella. First mover advantage is of course considerable, but I can't help but feel that this market will very quickly be swallowed by the hyperscalers.
replies(6): >>43997649 #>>43997710 #>>43997740 #>>43997910 #>>44000189 #>>44004425 #
2. ◴[] No.43997649[source]
3. itsafarqueue ◴[] No.43997710[source]
Being gobbled by the hyperscalers may well be the plan. Reasonable bet.
replies(1): >>44000908 #
4. 3s ◴[] No.43997740[source]
Confidential computing as a technology will become (and should be) commoditized, so the value add comes down to security and UX. We don’t want to be a confidential computing company, we want to use the right tool for the job of building private & verifiable AI. If that becomes FHE in a few years, then we will use that. We are starting with easy-to-use inference, but our goal of having any AI application be provably private
5. ATechGuy ◴[] No.43997910[source]
This. Big tech providers already offer confidential inference today.
replies(2): >>43998018 #>>43998096 #
6. julesdrean ◴[] No.43998018[source]
Yes Azure has! They have very different trust assumptions though. We wrote about this here https://tinfoil.sh/blog/2025-01-30-how-do-we-compare
7. mnahkies ◴[] No.43998096[source]
Last I checked it was only Azure offering the Nvidia specific confidential compute extensions, I'm likely out of date - a quick Google was inconclusive.

Have GCP and AWS started offering this for GPUs?

replies(1): >>43998338 #
8. candiddevmike ◴[] No.43998338{3}[source]
GCP, yes: https://cloud.google.com/confidential-computing/confidential...
replies(1): >>43998503 #
9. julesdrean ◴[] No.43998503{4}[source]
Azure and GCP offer Confidential VMs which removes trust from the cloud providers. We’re trying to also remove trust in the service provider (aka ourselves). One example is that when you use Azure or GCP, by default, the service operator can SSH into the VM. We cannot SSH into our inference server and you can check that’s true.
replies(1): >>44000232 #
10. threeseed ◴[] No.44000189[source]
Cloud providers aren't going to care too much about this.

I have worked for many enterprise companies e.g. banks who are trialling AI and none of them have any use for something like this. Because the entire foundation of the IT industry is based on trusting the privacy and security policies of Azure, AWS and GCP. And in the decades since they've been around not heard of a single example of them breaking this.

The proposition here is to tell a company that they can trust Azure with their banking websites, identity services and data engineering workloads but not for their model services. It just doesn't make any sense. And instead I should trust a YC startup who statistically is going to be gone in a year and will likely have their own unique set of security and privacy issues.

Also you have the issue of smaller sized open source models e.g. DeepSeek R1 lagging far behind the bigger ones and so you're giving me some unnecessary privacy attestation at the expense of a model that will give me far better accuracy and performance.

replies(1): >>44004831 #
11. threeseed ◴[] No.44000232{5}[source]
But nobody wants you as a service provider. Everyone wants to have Gemini, OpenAI etc which are significantly better than the far smaller and less capable model you will be able to afford to host.

And you make this claim that the cloud provider can SSH into the VM but (a) nobody serious exposes SSH ports in Production and (b) there is no documented evidence of this ever happening.

replies(1): >>44000398 #
12. FrasiertheLion ◴[] No.44000398{6}[source]
We're not competing with Gemini or OpenAI or the big cloud providers. For instance, Google is partnering with NVIDIA to ship Gemini on-prem to regulated industries in a CC environment to protect their model weights as well as for additional data privacy on-prem: https://blogs.nvidia.com/blog/google-cloud-next-agentic-ai-r...

We're simply trying to bring similar capabilities to other companies. Inference is just our first product.

>cloud provider can SSH into the VM

The point we were making was that CC was traditionally used to remove trust from cloud providers, but not the application provider. We are further removing trust from ourselves (as the application provider), and we can enable our customers (who could be other startups or neoclouds) to remove trust from themselves and prove that to their customers.

replies(1): >>44000513 #
13. threeseed ◴[] No.44000513{7}[source]
You are providing the illusion of trust though.

There are a multitude of components between my app and your service. You have secured one of them arguably the least important. But you can't provide any guarantees over say your API server that my requests are going through. Or your networking stack which someone e.g. a government could MITM.

replies(1): >>44000642 #
14. osigurdson ◴[] No.44000642{8}[source]
I don't know anything about "secure enclaves" but I assume that this part is sorted out. It should be possible to use http with it I imagine. If not, yeah it is totally dumb from a conceptual standpoint.
15. kevinis ◴[] No.44000908[source]
GCP has confidential VMs with H100 GPUs; I'm not sure if Google would be interested. And they get huge discount buying GPUs in bulk. The trade-off between cost and privacy is obvious for most users imo.
16. trebligdivad ◴[] No.44004425[source]
I suspect Nvidia have done a lot of the heavy lifting to make this work; but it's not that trivial to wire the CPU and GPU confidential compute together.
17. Terretta ◴[] No.44004831[source]
> Cloud providers aren't going to care too much about this. ... [E]nterprise companies e.g. banks ... and none of them have any use for something like this.

As former CTO of world's largest bank and cloud architect at world's largest hedge fund, this is exactly opposite of my experience with both regulated finance enterprises and the CSPs vying to serve them.

The entire foundation of the IT industry is based on trusting the privacy and security policies of Azure, AWS and GCP. And in the decades since they've been around not heard of a single example of them breaking this.

On the contrary, many global banks design for the assumption the "CSP is hostile". What happened to Coinbase's customers the past few months shows why your vendor's insider threat is your threat and your customers' threat.

Granted, this annoys CSPs who wish regulators would just let banks "adopt" the CSP's controls and call it a day.

Unfortunately for CSP sales teams — certainly this could change with recent regulator policy changes — the regulator wins. Until very recently, only one CSP offered controls sufficient to assure your own data privacy beyond a CSP's pinky-swears. AWS Nitro Enclaves can provide a key component in that assurance, using deployment models such as tinfoil.