Getting AI to write good SQL

(cloud.google.com)


tango12 ◴[16 May 25 23:05 UTC] No.44010584[source]▶

What’s the eventual goal of text to sql?

Is it to build a copilot for a data analyst or to get business insight without going through an analyst?

If it’s the latter, then imho no amount of text-to-SQL sophistication will solve the problem, because a non-analyst can’t tell whether the SQL is correct or sufficient.

These don’t seem like text2sql problems:

> Why did we hit only 80% of our daily ecommerce transaction yesterday?

> Why is customer acquisition cost trending up?

> Why was the campaign in NYC worse than the same in SF?

replies(5): >>44010646 #>>44010660 #>>44010746 #>>44010772 #>>44011353 #

1. richardw ◴[16 May 25 23:40 UTC] No.44010772[source]▶

>>44010584 #

Any algo that a human would follow can be built and tested. If you have 10 analysts you have 10 different skill levels, with differing understanding of the database and business context. So automation gives you a platform to achieve a floor of skill and knowledge. The humans can now be “at least this good or better”. A new analyst instantly gets better, faster.

I assume a useful goal would be to guide development of the system in coordination with experts, test it, and have the AI explain all trade-offs and potential bugs, sense-check it against expected results, etc.

Taste is hard to automate. Real insight is hard to automate. But a domain expert who isn’t an “analyst” can go extremely far with well designed automation and a sense of what rational results should look like. Obviously the state of the art isn’t perfect but you asked about goals, so those would be my goals.

replies(1): >>44010901 #

2. layer8 ◴[17 May 25 00:05 UTC] No.44010901[source]▶

>>44010772 (TP) #

But “text to sql” isn’t an algorithm.

replies(1): >>44011080 #

3. richardw ◴[17 May 25 00:35 UTC] No.44011080[source]▶

>>44010901 #

The processes people want the SQL for are likely filled with algos. Say an exec wants info in a known domain: set up a text-to-SQL system with lots of context and testing to generate queries. If they think they have something good, get an expert to test and productionise it.

“Thank you for your request. Can you walk me through the steps you’d use to do this manually? What things would you watch out for? What kind of number ranges are reasonable? I can propose an algorithm and you tell me if that’s correct. The admins have set up guidelines on how to reason about customer and purchase data. Is the following consistent with your expectations?”
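The "what kind of number ranges are reasonable?" step can be made mechanical. A minimal sketch, assuming the generated query returns a single number and that a domain expert has supplied plausible bounds; the `orders` table, the bounds, and the helper name `run_with_sanity_check` are all hypothetical:

```python
import sqlite3

def run_with_sanity_check(conn, sql, low, high):
    """Run a query that should return one number and compare it to expert-supplied bounds."""
    row = conn.execute(sql).fetchone()
    if row is None or row[0] is None:
        return None, "no result"
    value = row[0]
    if not (low <= value <= high):
        return value, "outside expected range"
    return value, "ok"

# Hypothetical fixture data standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(20.0,), (35.5,), (44.5,)])

# Stand-ins for model-generated SQL; the bounds come from a human, not the model.
plausible = run_with_sanity_check(conn, "SELECT SUM(amount) FROM orders", low=50, high=500)
suspicious = run_with_sanity_check(conn, "SELECT COUNT(*) FROM orders", low=50, high=500)
```

A range check like this only flags implausible answers. It says nothing about whether a plausible-looking number came from the right query.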

replies(1): >>44011142 #

4. layer8 ◴[17 May 25 00:47 UTC] No.44011142{3}[source]▶

>>44011080 #

This is the same fallacy as low-code/no-code. If you have to check a precise algorithm, you’re effectively coding, and you need a language with the same precision as a programming language.

replies(1): >>44012160 #

5. richardw ◴[17 May 25 05:06 UTC] No.44012160{4}[source]▶

>>44011142 #

Only if you want production-ready output. For getting execs to the point where they can self-serve, this works fine. Look, you don’t see value until it’s perfect. Good, other people do. I see your fallacy and raise you a false dichotomy.

replies(1): >>44014347 #

6. layer8 ◴[17 May 25 13:53 UTC] No.44014347{5}[source]▶

>>44012160 #

The problem I see is: how do you verify that the result of your text-to-SQL is really what you were asking for, without understanding the SQL (or “the algorithm”)? It boils down to this: you have to know what you are doing, and with the present state of the art in AI we can’t have confidence in that.

replies(1): >>44085139 #

7. richardw ◴[25 May 25 02:38 UTC] No.44085139{6}[source]▶

>>44014347 #

I’m assuming exploratory work from the exec, not something they make decisions with or put into production. If you need something you can trust, you typically need a lot of checks, including multiple humans.

I play a weird part at work near AI. I use it all the time but I’m the first person to warn everyone that it’s absolutely not trustworthy. No matter your prompt, the data, the guidelines built into it, the output is fundamentally flaky. But I use it while knowing that and working around it. Making the process reliable is a big part of my focus, and usually that means minimising the part the LLM plays. Checks and balances live where things are predictable.
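One way to picture "checks and balances live where things are predictable" is a deterministic guard around whatever SQL the model emits. A rough sketch, assuming SQLite and Python's standard `sqlite3` module; the allow-list, the `orders` table, and the helper `run_generated_sql` are illustrative assumptions, not a description of any real setup:

```python
import sqlite3

# Only reading is allowed: selecting, reading columns, calling SQL functions.
ALLOWED_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ, sqlite3.SQLITE_FUNCTION}

def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Deny every action that is not on the read-only allow-list."""
    return sqlite3.SQLITE_OK if action in ALLOWED_ACTIONS else sqlite3.SQLITE_DENY

def run_generated_sql(conn, sql):
    """Return rows for a permitted query, or None if the statement was rejected."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None

# Hypothetical fixture data, loaded and committed before the guard is installed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(20.0,), (35.5,), (44.5,)])
conn.commit()
conn.set_authorizer(read_only_authorizer)

rows = run_generated_sql(conn, "SELECT COUNT(*) FROM orders")   # read: allowed
blocked = run_generated_sql(conn, "DELETE FROM orders")         # write: rejected
```

The guard doesn't make the query correct. It only bounds the damage a wrong one can do, and that part doesn't depend on the LLM at all.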
