Most active commenters
  • Izkata(5)
  • (4)
  • masklinn(3)
  • zahlman(3)

←back to thread

620 points tambourine_man | 43 comments | | HN request time: 0.879s | source | bottom
Show context
serbuvlad ◴[] No.43750075[source]
All things considered, this is pretty cool. Basically, this replaces

    db.execute("QUERY WHERE name = ?", (name,))
with

    db.execute(t"QUERY WHERE name = {name}")
Does the benefit from this syntactic sugar outweigh the added complexity of a new language feature? I think it does in this case for two reasons:

1. Allowing library developers to do whatever they want with {} expansions is a good thing, and will probably spawn some good uses.

2. Generalizing template syntax across a language, so that all libraries solve this problem in the same way, is probably a good thing.

replies(12): >>43750226 #>>43750250 #>>43750260 #>>43750279 #>>43750513 #>>43750750 #>>43752117 #>>43752173 #>>43752293 #>>43754738 #>>43756560 #>>43763190 #
1. benwilber0 ◴[] No.43752173[source]
Aren't there other benefits to server-side parameter binding besides just SQL-injection safety? For instance, using PG's extended protocol (binary) instead of just raw SQL strings. Caching parameterized prepared statements, etc.

Also:

    db.execute(t"QUERY WHERE name = {name}")
Is dangerously close to:

    db.execute(f"QUERY WHERE name = {name}")

A single character difference and now you've just made yourself trivially injectible.

I don't think this new format specifier is in any way applicable to SQL queries.

replies(12): >>43752236 #>>43752283 #>>43752331 #>>43752336 #>>43752358 #>>43752859 #>>43753280 #>>43753699 #>>43754372 #>>43754646 #>>43755330 #>>43756720 #
2. ◴[] No.43752236[source]
3. VWWHFSfQ ◴[] No.43752283[source]
> I don't think this new format specifier is in any way applicable to SQL queries.

Agree. And the mere presence of such a feature will trigger endless foot-gunning across the Python database ecosystem.

4. masklinn ◴[] No.43752331[source]
> Aren't there other benefits to server-side parameter binding besides just SQL-injection safety? For instance, using PG's extended protocol (binary) instead of just raw SQL strings. Caching parameterized prepared statements, etc.

All of which can be implemented on top of template strings.

> A single character difference and now you've just made yourself trivially injectible.

It's not just a one character difference, it's a different type. So `db.execute` can reject strings both statically and dynamically.

> I don't think

Definitely true.

> this new format specifier is in any way applicable to SQL queries.

It's literally one of PEP 750's motivations.

replies(7): >>43752391 #>>43752395 #>>43752558 #>>43752752 #>>43754441 #>>43755649 #>>43755673 #
5. MR4D ◴[] No.43752336[source]
Dang! Thanks for pointing this out.

I had to look SEVERAL times at your comment before I noticed one is an F and the other is a T.

This won’t end well. Although I like it conceptually, this few pixel difference in a letter is going to cause major problems down the road.

replies(1): >>43752552 #
6. ◴[] No.43752358[source]
7. VWWHFSfQ ◴[] No.43752391[source]
> It's literally one of PEP 750's motivations.

Python is notorious for misguided motivations. We're not "appealing to authority" here. We're free to point out when things are goofy.

8. willcipriano ◴[] No.43752395[source]

    from string.templatelib import Template

    def execute(query: Template)
Should allow for static analysis to prevent this issue if you run mypy as part of your pr process.

That would be in addition to doing any runtime checks.

replies(1): >>43752563 #
9. pphysch ◴[] No.43752552[source]
How? tstrings and fstrings are literals for completely different types.

CS has survived for decades with 1 and 1.0 being completely different types.

replies(3): >>43753130 #>>43753177 #>>43754544 #
10. woodrowbarlow ◴[] No.43752558[source]
nitpicking:

> It's not just a one character difference, it's a different type. So `db.execute` can reject strings both statically and dynamically.

in this case, that's not actually helpful because SQL statements don't need to have parameters, so db.execute will always need to accept a string.

replies(2): >>43753027 #>>43754776 #
11. benwilber0 ◴[] No.43752563{3}[source]
The first mistake we're going to see a library developer make is:

    def execute(query: Union[str, Template]):

Maybe because they want their execute function to be backwards compatible, or just because they really do want to allow either raw strings are a template string.
replies(1): >>43754807 #
12. tczMUFlmoNk ◴[] No.43752752[source]
> > I don't think

> Definitely true.

The rest of your comment is valuable, but this is just mean-spirited and unnecessary.

13. WorldMaker ◴[] No.43752859[source]
Templates are a very different duck type from strings and intentionally don't support __str__(). The SQL tool can provide a `safe_execute(Template)` that throws if passed a string and not a Template. You can imagine future libraries that only support Template and drop all functions that accept strings as truly safe query libraries.

> Caching parameterized prepared statements, etc.

Templates give you all the data you need to also build things like cacheable parameterized prepared statements. For DB engines that support named parameters you can even get the interpolation expression to auto-name parameters (get the string "name" from your example as the name of the variable filling the slot) for additional debugging/sometimes caching benefits.

14. anamexis ◴[] No.43753027{3}[source]
You can just pass it a template with no substitutions.
15. Izkata ◴[] No.43753130{3}[source]
Because they're both passed to "execute", which can't tell between the f-string and a non-interpolated query, so it just has to trust you did the right thing. Typoing the "t" as an "f" introduces SQL injection that's hard to spot.
replies(2): >>43753238 #>>43756782 #
16. Certhas ◴[] No.43753177{3}[source]
I had an extended debugging session last week that centered on 1 and 1. confusion in a library I have to use...
replies(1): >>43756918 #
17. vlovich123 ◴[] No.43753238{4}[source]
Assuming `execute` takes both. You could have `execute(template)` and `execute_interpolated(str, ...args)` but yeah if it takes both you'll have challenges discouraging plain-text interpolation.
replies(1): >>43753703 #
18. ◴[] No.43753280[source]
19. rastignack ◴[] No.43753699[source]
Quite easy to detect with a proper linter.
20. Izkata ◴[] No.43753703{5}[source]
It would have to be the other way around or be a (possibly major) breaking change. Just execute() with strings is already standard python that all the frameworks build on top of, not to mention tutorials:

https://docs.python.org/3/library/sqlite3.html

https://www.psycopg.org/docs/cursor.html

https://dev.mysql.com/doc/connector-python/en/connector-pyth...

replies(1): >>43769181 #
21. hombre_fatal ◴[] No.43754372[source]
You solve that with an execute(stmt) function that requires you to pass in a template.

In Javascript, sql`where id = ${id}` is dangerously close to normal string interpolation `where id = ${id}`, and db libs that offer a sql tag have query(stmt) fns that reject strings.

22. rangerelf ◴[] No.43754441[source]
>> I don't think >Definitely true.

I thought we left middle-school playground tactics behind.

23. MR4D ◴[] No.43754544{3}[source]
Reread my comment. It’s about noticing you have an “f” or a “t” and both are very similar characters.
replies(1): >>43754644 #
24. rocha ◴[] No.43754644{4}[source]
Yes, but you will get an error since string and templates are different types and have different interfaces.
replies(1): >>43755714 #
25. InstaPage ◴[] No.43754646[source]
t vs f going to be hard to spot.
replies(1): >>43754771 #
26. acdha ◴[] No.43754771[source]
This is true of many other things, which is why we have type checkers and linters to be perfectly rigorous rather than expecting humans to never make mistakes.
replies(1): >>43769324 #
27. masklinn ◴[] No.43754776{3}[source]
> db.execute will always need to accept a string.

No. A t-string with no placeholders is perfectly fine. You can use that even if you have no parameters.

28. masklinn ◴[] No.43754807{4}[source]
> they really do want to allow either raw strings are a template string.

I’d consider that an invalid use case:

1. You can create a template string without placeholders.

2. Even if the caller does need to pass in a string (because they’re executing from a file, or t-strings don’t support e.g. facetting) then they can just… wrap the string in a template explicitly.

29. kazinator ◴[] No.43755330[source]
But t"..." and f"..." have different types; we can make db.execute reject character strings and take only template objects.
replies(1): >>43764175 #
30. ◴[] No.43755649[source]
31. davepeck ◴[] No.43755673[source]
> Caching parameterized prepared statements, etc.

I didn’t explicitly mention this in my post but, yes, the Template type is designed with caching in mind. In particular, the .strings tuple is likely to be useful as a cache key in many cases.

32. Izkata ◴[] No.43755714{5}[source]
Click "parent" a few times and look at the code example that started this thread. It's using the same function in a way that can't distinguish whether the user intentionally used a string (including an f-string) and a t-string.
replies(1): >>43756771 #
33. zahlman ◴[] No.43756720[source]
> A single character difference and now you've just made yourself trivially injectible.

No; a single character difference and now you get a `TypeError`, which hopefully the library has made more informative by predicting this common misuse pattern.

34. zahlman ◴[] No.43756771{6}[source]
Yes, and the parent is misguided. As was pointed out in multiple replies, the library can distinguish whether an ordinary string or a t-string is passed because the t-string is not a string instance, but instead creates a separate library type. A user who mistakenly uses an f prefix instead of a t prefix will, with a properly designed library, encounter a `TypeError` at runtime (or a warning earlier, given type annotations and a checker), not SQL injection.
replies(1): >>43758123 #
35. zahlman ◴[] No.43756782{4}[source]
`execute` can tell the difference, because `t"..."` does not create the same type of object that `f"..."` does.
36. pphysch ◴[] No.43756918{4}[source]
Yeah, it's a real bummer when that happens. I wish JSON never tried to do types.
37. Izkata ◴[] No.43758123{7}[source]
In this particular instance it can't, because there are 3 ways in question here, and it can't distinguish between correct intentional usage and accidental usage of an f-string instead of a t-string:

  db.execute("SELECT foo FROM bar;")
  db.execute(f"SELECT foo FROM bar WHERE id = {foo_id};")
  db.execute(t"SELECT foo FROM bar WHERE id = {foo_id};")
The first and second look identical to execute() because all it sees is a string. But the second one is wrong, a hard-to-see typo of the third.

If f-strings didn't exist there'd be no issue because it could distinguish by type as you say. But we have an incorrect SQL-injection-prone usage here that can't be distinguished by type from the correct plain string usage.

replies(1): >>43764565 #
38. HackerThemAll ◴[] No.43764175[source]
Yeah that would be a backward compatible way to do stuff.
39. Timon3 ◴[] No.43764565{8}[source]
There is no reason to support the first or second usage. It's totally fine to always require a t-string:

    db.execute(t"SELECT foo FROM bar;")
See? No reason to accept strings, it's absolutely fine to always error if a string is passed.
replies(1): >>43765327 #
40. Izkata ◴[] No.43765327{9}[source]
My (and their) point is that's the already existing API. You're proposing a big breaking change, with how many frameworks and tutorials are built on top of that.
replies(1): >>43765451 #
41. Timon3 ◴[] No.43765451{10}[source]
It's not like this is the first time APIs have been improved. There are many tools (e.g. deprecation warnings & hints in editors, linter rules) that can help bridge the gap - even if t-strings are only used for new or refactored code, it's still a big improvement!

There's also simply no hard requirement to overload an `execute` function. We have options beyond "no templates at all" and "execute takes templates and strings", for example by introducing a separate function. Why does perfect have to be the enemy of good here?

42. dragonwriter ◴[] No.43769181{6}[source]
> It would have to be the other way around or be a (possibly major) breaking change.

If it is going to reject the currently-accepted unsafe usage, its going to be a major breaking change in any case, so I don't see the problem. I mean, if you are lamenting it can't reject the currently-accepted SQL-interpolated-via-f-string because it can't distinguish it by type from plain strings with no interpolation, you are already saying that you want a major breaking change but are upset because the particular implementation you want is not possible. So you can't turn around and dismiss an alternative solution because it would be a major breaking change, that's what was asked for!

43. PennRobotics ◴[] No.43769324{3}[source]
and syntax highlighting