
Optimizing Datalog for the GPU

(danglingpointers.substack.com)
98 points by blakepelton | 1 comment
ux266478 No.45812302
Curious: why use CUDA and HIP? These frameworks are rather opinionated about kernel design, and they seem suboptimal for implementing a language runtime when SPIR-V is right there, particularly in the case of Datalog.
lmeyerov No.45812962
(I've been a big fan of this work for years now.)

From the nearby perspective of building GFQL, an embeddable OSS GPU graph dataframe query language that sits somewhere between Cypher and DuckDB/pandas/Spark, at an even higher level on top of pandas, cudf, etc.:
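For a flavor of what that higher-level surface looks like, here is a small GFQL-style sketch based on my reading of the public pygraphistry docs; treat the exact imports and operator names as illustrative assumptions rather than a definitive API:

    # Hypothetical two-hop GFQL-style query (details assumed from public docs).
    import pandas as pd
    import graphistry
    from graphistry import n, e_forward

    edges_df = pd.DataFrame({"src": ["a", "a", "b"], "dst": ["b", "c", "c"]})
    nodes_df = pd.DataFrame({"id": ["a", "b", "c"],
                             "type": ["account", "ip", "ip"]})

    g = graphistry.edges(edges_df, "src", "dst").nodes(nodes_df, "id")
    g2 = g.chain([
        n({"type": "account"}),  # start at account-typed nodes
        e_forward(),             # follow outgoing edges one hop
        n(),                     # land on any node
    ])
    print(g2._nodes)  # the matched subgraph's nodes, as a dataframe

The point of the comment above is that the same query shape can run on pandas frames on CPU or cudf frames on GPU, without the query author changing anything.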

It's nice using higher-level languages with rich libraries underneath: we get to focus on the foundational algorithm and data ecosystem problems while still achieving crazy numbers.

cudf gives us optimized GPU joins, so jumping from cheap personal CPU or GPU boxes to 80GB server GPUs, with deep 2B-edge whole-graph queries running in about a second, has been nice and essentially free of extra work on our side :)

We want our focus on getting regular graph operations fully data parallel in the way we want while staying easy for users, and on figuring out areas like bigger-than-memory execution and data lakes, so we defer lower-level efforts to when a Rust etc. rewrite is more clearly merited. I do see value in starting low-level when the target value and workload are obvious (e.g., building vector indexes / DBs), but when breaking new ground at every point, there is value in going where you can roll & extend faster.
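To make the join-centric point concrete, here is a minimal sketch of one way a whole-graph traversal can be expressed as dataframe joins so cudf's GPU join engine does the heavy lifting. This is my own illustration under assumed column names and a toy graph, not GFQL's actual internals:

    # Minimal sketch: BFS-style reachability as GPU dataframe joins.
    # Illustrative only -- not GFQL's actual internals.
    import cudf

    # Toy edge list; in practice this could be billions of rows on an 80GB GPU.
    edges = cudf.DataFrame({
        "src": [0, 0, 1, 2, 3],
        "dst": [1, 2, 3, 3, 4],
    })

    frontier = cudf.DataFrame({"node": [0]})  # start from node 0
    visited = frontier

    # Iterate one-hop joins to a fixed point: each step is one
    # data-parallel hash join, which cudf already optimizes.
    while len(frontier) > 0:
        hop = frontier.merge(edges, left_on="node", right_on="src")
        reached = hop[["dst"]].rename(columns={"dst": "node"}).drop_duplicates()
        frontier = reached[~reached["node"].isin(visited["node"])]
        visited = cudf.concat([visited, frontier])

    print(visited)  # every node reachable from node 0

Each iteration is one big join plus a dedup, which is exactly the shape of work GPU dataframe engines are good at, and the same pattern scales from a laptop to a server GPU without code changes.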