←back to thread

Getting AI to write good SQL

(cloud.google.com)
478 points richards | 1 comments | | HN request time: 0.2s | source
Show context
mritchie712 ◴[] No.44010031[source]
the short answer: use a semantic layer.

It's the cleanest way to give the right context and the best place to pull a human in the loop.

A human can validate and create all important metrics (e.g. what does "monthly active users" really mean) then an LLM can use that metric definition whenever asked for MAU.

With a semantic layer, you get the added benefit of writing queries in JSON instead of raw SQL. LLM's are much more consistent at writing a small JSON vs. hundreds of lines of SQL.

We[0] use cube[1] for this. It's the best open source semantic layer, but there's a couple closed source options too.

My last company wrote a post on this in 2021[2]. Looks like the acquirer stopped paying for the blog hosting, but the HN post is still up.

0 - https://www.definite.app/

1 - https://cube.dev/

2 - https://news.ycombinator.com/item?id=25930190

replies(7): >>44010358 #>>44011108 #>>44011775 #>>44011802 #>>44012638 #>>44013043 #>>44013772 #
1. jinjin2 ◴[] No.44013043[source]
I agree that using a semantic layer is the best way to get better precision. It is almost like a cheatsheet for the AI.

But I would never use one that forced me to express my queries in JSON. The best implementations integrate right into the database so they become an integral part of regular your SQL queries, and as such also available to all your tools.

In my experience, from using the Exasol Semantic Layer, it can be a totally seamless experience.