Getting AI to write good SQL

(cloud.google.com)
477 points richards | 6 comments
tango12 ◴[] No.44010584[source]
What’s the eventual goal of text to sql?

Is it to build a copilot for a data analyst or to get business insight without going through an analyst?

If it’s the latter - then imho no amount of text to sql sophistication will solve the problem because it’s impossible for a non analyst to understand if the sql is correct or sufficient.

These don’t seem like text2sql problems:

> Why did we hit only 80% of our daily ecommerce transaction yesterday?

> Why is customer acquisition cost trending up?

> Why was the campaign in NYC worse than the same in SF?

replies(5): >>44010646 #>>44010660 #>>44010746 #>>44010772 #>>44011353 #
1. richardw ◴[] No.44010772[source]
Any algo that a human would follow can be built and tested. If you have 10 analysts you have 10 different skill levels, with differing understanding of the database and business context. So automation gives you a platform to achieve a floor of skill and knowledge. The humans can now be “at least this good or better”. A new analyst instantly gets better, faster.

I assume a useful goal would be to guide development of the system in coordination with experts, test it, have the AI explain all trade-offs and potential bugs, sense-check it against expected results, etc.

Taste is hard to automate. Real insight is hard to automate. But a domain expert who isn’t an “analyst” can go extremely far with well designed automation and a sense of what rational results should look like. Obviously the state of the art isn’t perfect but you asked about goals, so those would be my goals.

replies(1): >>44010901 #
2. layer8 ◴[] No.44010901[source]
But “text to sql” isn’t an algorithm.
replies(1): >>44011080 #
3. richardw ◴[] No.44011080[source]
The processes people want the SQL for are likely filled with algos. If an exec wants info in a known domain, set up a text-to-SQL system with lots of context and testing to generate queries. If they think they have something good, get an expert to test and productionise it.

“Thank you for your request. Can you walk me through the steps you’d use to do this manually? What things would you watch out for? What kind of number ranges are reasonable? I can propose an algorithm and you tell me if that’s correct. The admins have set up guidelines on how to reason about customer and purchase data. Is the following consistent with your expectations?”
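
A minimal sketch of the "what kind of number ranges are reasonable?" step, assuming the generated query returns a single number. The `sanity_check` helper, the `orders` table, and the expected range are all hypothetical, and Python's built-in `sqlite3` stands in for whatever database is actually in use:

```python
import sqlite3


def sanity_check(conn, sql, expected_range):
    """Run a generated query that returns one number and compare it to a range.

    Hypothetical helper: expected_range is the (low, high) interval a domain
    expert said would be reasonable for this question.
    """
    low, high = expected_range
    (value,) = conn.execute(sql).fetchone()
    return value, low <= value <= high


# Toy data standing in for real purchase data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(20.0,), (35.0,), (45.0,)],
)

# Pretend this string came back from the text-to-SQL system.
generated_sql = "SELECT SUM(amount) FROM orders"

value, ok = sanity_check(conn, generated_sql, expected_range=(50, 500))
print(value, "plausible" if ok else "outside the range the expert expects")
```

A range check like this only flags implausible results. It does not show that the SQL answers the question that was asked.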

replies(1): >>44011142 #
4. layer8 ◴[] No.44011142{3}[source]
This is the same fallacy as low-code/no-code. If you have to check a precise algorithm, you’re effectively coding, and you need a language with the same precision as a programming language.
replies(1): >>44012160 #
5. richardw ◴[] No.44012160{4}[source]
Only if you want a production-ready output. To get execs able to self-feed enough, this works fine. Look, you don’t see value until it’s perfect. Good, other people do. I see your fallacy and raise you a false dichotomy.
replies(1): >>44014347 #
6. layer8 ◴[] No.44014347{5}[source]
The problem I see is how you verify that the result of your text-to-SQL is really what you were asking for, without understanding the SQL (or “the algorithm”). It boils down to having to know what you are doing, and with the present state of the art in AI we can’t have confidence in that.