Getting AI to write good SQL

(cloud.google.com)

501 points richards | 1 comments | 16 May 25 21:10 UTC

zeroq ◴[17 May 25 03:04 UTC] No.44011740[source]▶

Every once in a while I've been trying AI, since everyone and their mother told me to, so I comply.

My most recent endeavour was with Gemini 2.5:

  - Write me a simple todo app on cloudflare with auth0 authentication.
  - Here's a simple todo on cloudflare. We import the @auth0-cloudflare and...
  - Does that @auth0-cloudflare exist?
  - Oh, it doesn't. I can give you a walkthrough on how to set up an account on auth0. Would you like me to?
  - Yes, please.
  - Here. I'm going to write the walkthrough in a document... (proceeds to create an empty document)
  - That seems to be an empty document.
  - Oh, my bad. I'll produce it once more. (proceeds to create another empty document)
  - Seems like your md parsing library is broken, can you write it in chat instead?
  - Yes... (your gemini trial has expired, would you like to pay $100 to continue?)

replies(2): >>44012230 #>>44012287 #

karencarits ◴[17 May 25 05:45 UTC] No.44012287[source]▶

>>44011740 #

It's difficult to assess how typical your experience is. I tried your initial prompt (`Write me a simple todo app on cloudflare with auth0 authentication.` on gemini-2.5-pro-preview-05-06) and didn't get any mention of @auth0-cloudflare, although I cannot verify whether the answer works as-is:

https://pastebin.com/yfg0Zn0u

replies(1): >>44013210 #

__loam ◴[17 May 25 09:48 UTC] No.44013210[source]▶

>>44012287 #

Shocked you got a different output from the stochastic token generator.

replies(1): >>44015045 #

1. karencarits ◴[17 May 25 15:35 UTC] No.44015045[source]▶

>>44013210 #

That's not the point. While there is a temperature setting and randomness involved, you can still benchmark models and see significant differences in output between models and between generations. That is why I provided more details and the full output: to make it easier for people to assess the context of the comment I replied to.
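To make the temperature point concrete, here is a minimal sketch of temperature sampling over a toy three-token vocabulary. The logits, function names, and run count are all made up for illustration; this is not Gemini's API or any real model's decoding code, only the general idea that temperature 0 is repeatable while higher temperatures vary from run to run.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores after temperature scaling."""
    if temperature == 0:
        # Greedy decoding: always return the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def distinct_outputs(logits, temperature, runs=200, seed=0):
    """Collect the set of token indices seen over repeated sampling."""
    rng = random.Random(seed)
    return {sample_token(logits, temperature, rng) for _ in range(runs)}

# Hypothetical scores for three candidate tokens (assumed values).
toy_logits = [2.0, 1.5, 0.3]

greedy = distinct_outputs(toy_logits, temperature=0)
sampled = distinct_outputs(toy_logits, temperature=1.0)
print("temperature 0:", greedy)
print("temperature 1:", sampled)
```

Two people sending the same prompt can therefore get different text, but the underlying score distribution still comes from the model, which is why comparing outputs across models or versions remains meaningful.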

When someone uses the same tools as I do but seems to experience problems I do not have (these kinds of posts often describe how bad LLMs are or how bad Google search is), I get a bit confused. Is there A/B testing going on? Am I just lucky? Am I inattentive to these weaknesses? Is it about prompting? Or what areas we work in? Do we actually use the same tools (i.e., the same models)?
