Getting AI to write good SQL

1. tango12 ◴[16 May 25 23:05 UTC] No.44010584[source]▶

What’s the eventual goal of text to sql?

Is it to build a copilot for a data analyst or to get business insight without going through an analyst?

If it’s the latter, then imho no amount of text-to-SQL sophistication will solve the problem, because it’s impossible for a non-analyst to judge whether the SQL is correct or sufficient.

These don’t seem like text2sql problems:

> Why did we hit only 80% of our daily ecommerce transactions yesterday?

> Why is customer acquisition cost trending up?

> Why was the campaign in NYC worse than the same in SF?

replies(5): >>44010646 #>>44010660 #>>44010746 #>>44010772 #>>44011353 #

2. cdavid ◴[16 May 25 23:15 UTC] No.44010646[source]▶

>>44010584 (TP) #

My observation is that it's the latter, but I agree the results fall short of expectations. Business teams often want last-minute changes in reporting, don't get what they want in time because of a lack of analysts, and hope that having "infinite speed" will solve the problem.

But ofc the real issue is that if your report metrics change at the last minute, you're unlikely to get a good report. That's a symptom of not thinking much about your metrics.

Also, reports and analyses generally take time because the underlying data is messy, lots of business knowledge is encoded "out of band", and the data infrastructure is poor. The smarter analytics leaders will use the AI push to invest in the foundations.

3. mynegation ◴[16 May 25 23:18 UTC] No.44010660[source]▶

>>44010584 (TP) #

To be fair, these don’t look like SQL problems either. SQL answers “what” questions, not “why” questions. The goal of text2sql is to free up analyst time by getting through the “what” much faster and, possibly, to let analysts focus on the “why” questions.

4. phillipcarter ◴[16 May 25 23:35 UTC] No.44010746[source]▶

>>44010584 (TP) #

> These don’t seem like text2sql problems:

Correct, but I would propose two things to add to your analysis:

1. Natural language text is a universal input to LLM systems

2. text2sql forms the foundation for retrieving the information that can help answer these higher-level questions

And so in my mind, the goals for text2sql might be a copilot (near-term), but the long-term is to have a good foundation for automating text2sql calls, comparing results, and pulling them into a larger workflow precisely to help answer the kinds of questions you're proposing.

There's clearly much work needed to achieve that goal.

replies(1): >>44011125 #

5. richardw ◴[16 May 25 23:40 UTC] No.44010772[source]▶

>>44010584 (TP) #

Any algo that a human would follow can be built and tested. If you have 10 analysts you have 10 different skill levels, with differing understanding of the database and business context. So automation gives you a platform to achieve a floor of skill and knowledge. The humans can now be “at least this good or better”. A new analyst instantly gets better, faster.

I assume a useful goal would be to guide development of the system in coordination with experts, test it, have the AI explain all trade-offs and potential bugs, sense-check it against expected results, etc.

Taste is hard to automate. Real insight is hard to automate. But a domain expert who isn’t an “analyst” can go extremely far with well-designed automation and a sense of what rational results should look like. Obviously the state of the art isn’t perfect, but you asked about goals, so those would be my goals.

replies(1): >>44010901 #

6. layer8 ◴[17 May 25 00:05 UTC] No.44010901[source]▶

>>44010772 #

But “text to sql” isn’t an algorithm.

replies(1): >>44011080 #

7. richardw ◴[17 May 25 00:35 UTC] No.44011080{3}[source]▶

>>44010901 #

The processes people want the SQL for are likely filled with algos. If an exec wants info in a known domain, set up a text-to-SQL system with lots of context and testing to generate queries. If they think they have something good, get an expert to test and productionise it.

“Thank you for your request. Can you walk me through the steps you’d use to do this manually? What things would you watch out for? What kind of number ranges are reasonable? I can propose an algorithm and you tell me if that’s correct. The admins have set up guidelines on how to reason about customer and purchase data. Is the following consistent with your expectations?”
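The "what kind of number ranges are reasonable?" step can be made mechanical. Here is a minimal sketch, assuming SQLite via Python's `sqlite3` module; the `orders` table, the `daily_orders` metric and its bounds are all hypothetical, standing in for whatever the experts would actually supply:

```python
import sqlite3

# Hypothetical guideline an expert might supply: plausible bounds per metric.
EXPECTED_RANGES = {"daily_orders": (50, 5000)}

def sense_check(metric, value, ranges=EXPECTED_RANGES):
    """Return True if value falls inside the expert-supplied range for metric."""
    low, high = ranges[metric]
    return low <= value <= high

def run_checked(conn, sql, metric):
    """Run a generated single-value query and report whether the result is plausible."""
    (value,) = conn.execute(sql).fetchone()
    return value, sense_check(metric, value)

# Toy data standing in for a real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, day TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, "2025-05-16") for i in range(120)],
)

# Pretend this string came back from the text-to-SQL system.
generated_sql = "SELECT COUNT(*) FROM orders WHERE day = '2025-05-16'"
value, plausible = run_checked(conn, generated_sql, "daily_orders")
```

A result outside the range doesn't prove the query is wrong, and one inside it doesn't prove it's right. It only flags the obviously off answers before anyone acts on them.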

replies(1): >>44011142 #

8. galenmarchetti ◴[17 May 25 00:44 UTC] No.44011125[source]▶

>>44010746 #

yeah I agree with this - good text2sql is essential but just one part of a larger stack that will actually get there. Seems possible tho

9. layer8 ◴[17 May 25 00:47 UTC] No.44011142{4}[source]▶

>>44011080 #

This is the same fallacy as low-code/no-code. If you have to check a precise algorithm, you’re effectively coding, and you need a language with the same precision as a programming language.

replies(1): >>44012160 #

10. ◴[17 May 25 01:31 UTC] No.44011353[source]▶

>>44010584 (TP) #

11. richardw ◴[17 May 25 05:06 UTC] No.44012160{5}[source]▶

>>44011142 #

Only if you want a production-ready output. For letting execs self-serve well enough, this works fine. Look, you don’t see value until it’s perfect. Fine, other people do. I see your fallacy and raise you a false dichotomy.

replies(1): >>44014347 #

12. layer8 ◴[17 May 25 13:53 UTC] No.44014347{6}[source]▶

>>44012160 #

The problem I see is this: how do you verify that the result of your text-to-SQL is really what you were asking for, without understanding the SQL (or “the algorithm”)? It boils down to having to know what you are doing, and with the present state of the art in AI we can’t have confidence in that.

replies(1): >>44085139 #

13. richardw ◴[25 May 25 02:38 UTC] No.44085139{7}[source]▶

>>44014347 #

I’m assuming exploratory work from the exec, not something they make decisions with or put into production. If you need something you can trust, you typically need a lot of checks, including multiple humans.

I play a weird part at work near AI. I use it all the time but I’m the first person to warn everyone that it’s absolutely not trustworthy. No matter your prompt, the data, the guidelines built into it, the output is fundamentally flaky. But I use it while knowing that and working around it. Making the process reliable is a big part of my focus, and usually that means minimising the part the LLM plays. Checks and balances live where things are predictable.
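As one illustration of keeping checks where things are predictable: rather than trusting the prompt to keep generated SQL read-only, the database itself can refuse everything else. This is a rough sketch, assuming SQLite and Python's `sqlite3` authorizer hook; the `customers` table and the `run_untrusted` helper are made up for the example:

```python
import sqlite3

# Actions a read-only query needs; everything else (INSERT, DELETE, DROP, ...) is denied.
ALLOWED = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ, sqlite3.SQLITE_FUNCTION}

def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Called by SQLite while a statement is compiled; deny anything not allow-listed."""
    return sqlite3.SQLITE_OK if action in ALLOWED else sqlite3.SQLITE_DENY

def run_untrusted(conn, sql):
    """Run generated SQL; return rows, or None if the database refused or rejected it."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None

# Set up toy data first, then lock the connection down.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, city TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "NYC"), (2, "SF")])
conn.commit()
conn.set_authorizer(read_only_authorizer)

rows = run_untrusted(conn, "SELECT city FROM customers ORDER BY id")
blocked = run_untrusted(conn, "DELETE FROM customers")
remaining = run_untrusted(conn, "SELECT COUNT(*) FROM customers")
```

The guard is deterministic and doesn't care how the SQL was produced. It says nothing about whether a permitted query answers the question that was asked, which still needs a human.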