
Optimizing Datalog for the GPU

(danglingpointers.substack.com)
98 points by blakepelton | 4 comments
ux266478 No.45812302
Curious, why use CUDA and HIP? These frameworks are rather opinionated about kernel design; they seem suboptimal for implementing a language runtime when SPIR-V is right there, particularly in the case of Datalog.
replies(4): >>45812644 #>>45812676 #>>45812962 #>>45814792 #
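
For context on what a Datalog runtime actually asks the GPU to do: the workhorse of semi-naive evaluation is a relational join over freshly derived facts. The sketch below is a minimal, hypothetical CUDA kernel (the `probe_join_count` name, the flat sorted-array layout, and the counting-only pass are illustrative assumptions, not the article's implementation); it shows the kind of irregular, data-dependent work a Datalog engine has to express regardless of whether it targets CUDA, HIP, or SPIR-V.

```cuda
// Minimal sketch (not the article's code) of one join step from semi-naive
// Datalog evaluation of a rule like T(x, z) :- R(x, y), S(y, z).
// Assumptions: S is a flat array of (first, second) pairs sorted by 'first',
// and each thread probes it with one tuple from the delta of R, counting
// matches. A real engine would also materialize and deduplicate the output.
#include <cstdio>
#include <cuda_runtime.h>

struct Pair { int first, second; };

__device__ int lower_bound_first(const Pair* rel, int n, int key) {
    // First index whose 'first' column is >= key (plain binary search).
    int lo = 0, hi = n;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (rel[mid].first < key) lo = mid + 1; else hi = mid;
    }
    return lo;
}

__global__ void probe_join_count(const Pair* S, int nS,
                                 const Pair* deltaR, int nDelta,
                                 int* counts) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nDelta) return;
    int y = deltaR[i].second;                       // join key from R(x, y)
    int j = lower_bound_first(S, nS, y);
    int c = 0;
    while (j < nS && S[j].first == y) { ++c; ++j; } // walk the matching run in S
    counts[i] = c;                                  // prefix-sum these to size the output
}

int main() {
    // Tiny example: deltaR = {(0,1),(2,1)}, S = {(1,5),(1,6),(3,7)}.
    Pair hR[] = {{0,1},{2,1}};
    Pair hS[] = {{1,5},{1,6},{3,7}};
    Pair *dR, *dS; int *dCnt, hCnt[2];
    cudaMalloc(&dR, sizeof hR); cudaMalloc(&dS, sizeof hS); cudaMalloc(&dCnt, sizeof hCnt);
    cudaMemcpy(dR, hR, sizeof hR, cudaMemcpyHostToDevice);
    cudaMemcpy(dS, hS, sizeof hS, cudaMemcpyHostToDevice);
    probe_join_count<<<1, 32>>>(dS, 3, dR, 2, dCnt);
    cudaMemcpy(hCnt, dCnt, sizeof hCnt, cudaMemcpyDeviceToHost);
    printf("matches: %d %d\n", hCnt[0], hCnt[1]);   // expect: 2 2
    return 0;
}
```

A full engine would follow this counting pass with a prefix sum, a second pass to write the joined tuples, and deduplication against the already-derived facts, all of which are also expressible in any of the IRs being debated here.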
1. embedding-shape No.45812644
Why is CUDA suboptimal compared to SPIR-V? I don't know the internals well enough to understand whether it's supposed to be obvious why one is better than the other.

I'm currently learning CUDA for ML purposes, so I'm happy to get more educated :)

replies(1): >>45812899 #
2. jb1991 No.45812899
It just depends on how the GPU manufacturer handles code written in different languages: what level of API access, what level of abstraction, and how the source is compiled, i.e. how well it gets optimized. On an Apple GPU, for example, you'll see benchmarks where OpenCL and Metal vary depending on the task.
replies(2): >>45812945 #>>45816345 #
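
To make "level of API access" concrete within a single vendor stack (Apple's OpenCL-vs-Metal split is analogous), here is a hedged sketch contrasting CUDA's high-level runtime API with its lower-level driver API, which loads code generated at run time, the shape of interface a language runtime that emits its own kernels usually needs. The file name `scale.ptx` and the surrounding setup are hypothetical, and error checking is omitted.

```cuda
#include <cstdio>
#include <cuda.h>           // driver API
#include <cuda_runtime.h>   // runtime API

// (a) Runtime API: the kernel is compiled into the binary by nvcc and launched
// with the <<<...>>> syntax. extern "C" keeps the symbol name unmangled.
extern "C" __global__ void scale(float* x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}
// Usage: scale<<<(n + 255) / 256, 256>>>(d_x, 2.0f, n);

// (b) Driver API: load a module produced at run time (PTX here; SPIR-V plays
// this role under Vulkan/OpenCL) and launch it explicitly. "scale.ptx" is a
// hypothetical file, e.g. the output of `nvcc -ptx` or a JIT code generator.
void launch_from_ptx(CUdeviceptr d_x, float s, int n) {
    cuInit(0);
    CUdevice dev;   cuDeviceGet(&dev, 0);
    CUcontext ctx;  cuCtxCreate(&ctx, 0, dev);
    CUmodule mod;   cuModuleLoad(&mod, "scale.ptx");
    CUfunction fn;  cuModuleGetFunction(&fn, mod, "scale");
    void* args[] = { &d_x, &s, &n };
    cuLaunchKernel(fn, (n + 255) / 256, 1, 1,   // grid
                       256, 1, 1,               // block
                       0, nullptr, args, nullptr);
    cuCtxSynchronize();
}
```

The driver-API path in (b) is roughly the layer a Datalog compiler generating its own GPU code would sit on top of, whichever IR it chooses to emit.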
3. embedding-shape No.45812945
Right, but that'd depend a lot on the context, task, hardware and so on.

What the parent said seemed more absolute than relative, almost positing that there is no point in using CUDA (since it's "suboptimal") and that people should obviously use SPIR-V. I was curious about the specifics of that.

4. sigbottle No.45816345
I mean, NVIDIA exposes some pretty low-level primitives, and you can always fiddle with the PTX, as DeepSeek did.
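
As a concrete (and deliberately tiny) illustration of "fiddling with the PTX" without leaving CUDA C++, inline PTX can be embedded with `asm volatile`. The lane-id read below is a standard example from NVIDIA's inline-PTX documentation, not anything specific to what DeepSeek did.

```cuda
// Illustration only: inline PTX inside a CUDA kernel.
// Reads the warp lane id directly from the %laneid special register.
#include <cstdio>
#include <cuda_runtime.h>

__device__ unsigned lane_id() {
    unsigned id;
    asm volatile("mov.u32 %0, %%laneid;" : "=r"(id));
    return id;
}

__global__ void show_lanes() {
    printf("thread %d -> lane %u\n", threadIdx.x, lane_id());
}

int main() {
    show_lanes<<<1, 8>>>();   // first 8 lanes of one warp
    cudaDeviceSynchronize();
    return 0;
}
```

The same mechanism is how code reaches instructions or special registers that the C++ level doesn't expose directly, short of rewriting the generated PTX itself.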