←back to thread

Getting AI to write good SQL

(cloud.google.com)
478 points richards | 1 comments | | HN request time: 0.207s | source
Show context
mritchie712 ◴[] No.44010031[source]
the short answer: use a semantic layer.

It's the cleanest way to give the right context and the best place to pull a human in the loop.

A human can validate and create all important metrics (e.g. what does "monthly active users" really mean) then an LLM can use that metric definition whenever asked for MAU.

With a semantic layer, you get the added benefit of writing queries in JSON instead of raw SQL. LLM's are much more consistent at writing a small JSON vs. hundreds of lines of SQL.

We[0] use cube[1] for this. It's the best open source semantic layer, but there's a couple closed source options too.

My last company wrote a post on this in 2021[2]. Looks like the acquirer stopped paying for the blog hosting, but the HN post is still up.

0 - https://www.definite.app/

1 - https://cube.dev/

2 - https://news.ycombinator.com/item?id=25930190

replies(7): >>44010358 #>>44011108 #>>44011775 #>>44011802 #>>44012638 #>>44013043 #>>44013772 #
tclancy ◴[] No.44011775[source]
Mother of God. I can write JSON instead of a language designed for querying. What is the advantage? If I’m going to move up an abstraction layer, why not give me natural language? Lots of things turn a limited natural language grammar into SQL for you. What is JSON going to {do: for: {me}}?
replies(2): >>44012372 #>>44013758 #
Spivak ◴[] No.44013758[source]
I find it funny people are making fun of this while every ORM builds up an object representing the query and then compiles it to SQL. SQL but as a data structure you can manipulate has thousands of implementations because it solves a real problem. This time it's because LLMs have an easier time outputting complex JSON than SQL itself.
replies(2): >>44014898 #>>44021201 #
1. ses1984 ◴[] No.44014898[source]
Any idea why that is?