Element: setHTML() method

(developer.mozilla.org)
167 points todsacerdoti | 20 comments
1. dzogchen ◴[] No.45675208[source]
Neat. I think once this is adopted by HTMX (or similar libraries) you don't need to sanitize on the server side anymore?
replies(1): >>45675272 #
2. dylan604 ◴[] No.45675272[source]
Do you honestly feel that we will ever be in a place for the server to not need to sanitize data from the client? Really? I don't. Any suggestion to me of "not needing to sanitize data from client" will immediately have me thinking the person doing the suggesting is not very good at their job, really new, or trying to scam me.

There's no reason to not sanitize data from the client, yet every reason to sanitize it.

replies(4): >>45675347 #>>45675432 #>>45675693 #>>45676358 #
3. jsmith99 ◴[] No.45675347[source]
It's arguably easier just to sanitise at display time; otherwise you have problems like double escaping.
replies(1): >>45675653 #
4. strbean ◴[] No.45675432[source]
It can be a complicated and error-prone process, mainly in scenarios where you have multiple mediums that require different sanitizers. Obviously you should do it. But in such scenarios, the best practice is to sanitize as close to the place it is used as possible. I've seen terrible codebases where they tried to apply multiple layers of sanitization on user input before storing to the DB, then reverse the unneeded layers before output. Obviously this didn't work.

Point being, if you can move sanitization even closer to where it is used, and that sanitization is actually provided by the standard library of the platform in question, that's a massive win.

replies(2): >>45675650 #>>45676516 #
5. immibis ◴[] No.45675650{3}[source]
By "sanitise" what's really meant is usually "escape". User typed their display name as <script>. You want the screen to say their display name, which is <script>. Therefore you send &lt;script&gt;. That's not their display name - that's just what you write in HTML to get their display name to appear on the screen. You shouldn't store it in the database in the display_name column.
replies(1): >>45675922 #
6. bpt3 ◴[] No.45675653{3}[source]
Easier does not mean better, which seems to be true in this case given the many, many vulnerabilities that have been exploited over the years due to a lack of input sanitization.
replies(1): >>45675823 #
7. padjo ◴[] No.45675693[source]
Sanitizing as close as possible to where the data is used is usually best; then you don’t have to keep track of what’s sanitized and what’s not sanitized for very long.

(Especially important if sanitization is not idempotent!)
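
A minimal sketch of the idempotence concern, assuming Python's standard `html.escape` stands in for "the sanitizer" (the sample string is made up for illustration). Escaping a value that has already been escaped changes it again, which is why you need to know exactly how many times each string has been processed:

```python
import html

# Hypothetical user input containing characters that HTML treats specially.
raw = "Tom & Jerry <3"

once = html.escape(raw)    # escaped for an HTML context
twice = html.escape(once)  # escaped again by a layer that lost track

print(once)   # Tom &amp; Jerry &lt;3
print(twice)  # Tom &amp;amp; Jerry &amp;lt;3

# Undoing one layer of the double-escaped value does not give back the input.
print(html.unescape(twice) == raw)  # False
```

Escaping exactly once, at the point of output, avoids having to carry that bookkeeping around.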

8. padjo ◴[] No.45675823{4}[source]
In this case easier is actually better. Sanitize a string at the point where you are going to use it. The locality makes it easy to verify that sanitization has been done correctly for the context. The alternative means you have to maintain a chain of custody for the string and ensure it is safe.
replies(1): >>45676451 #
9. strbean ◴[] No.45675922{4}[source]
Agreed. The codebase I'm thinking of was html encoding stuff before storing it, then when they needed to e.g. send an SMS, trying to remember to decode. Terrible.
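
The alternative described in this sub-thread can be sketched roughly as follows, assuming Python and its standard `html.escape`. The function names and greeting text are hypothetical, and the plain-text function only stands in for an SMS sender. The raw value is what gets stored, and each output medium applies its own encoding at the point of use:

```python
import html

# Hypothetical stored value: the display name exactly as the user typed it.
stored_display_name = "<script>"

def render_html_greeting(display_name: str) -> str:
    """Escape for the HTML context only, right where the markup is built."""
    return f"<p>Hello, {html.escape(display_name)}</p>"

def render_sms_greeting(display_name: str) -> str:
    """Plain-text medium: nothing was HTML-encoded, so nothing to decode."""
    return f"Hello, {display_name}"

print(render_html_greeting(stored_display_name))  # <p>Hello, &lt;script&gt;</p>
print(render_sms_greeting(stored_display_name))   # Hello, <script>
```

Because the database holds the original text, adding a third medium means adding one more renderer rather than another layer to remember to reverse.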
10. auxiliarymoose ◴[] No.45676358[source]
If you sanitize on the server, you are making assumptions about what is safe/unsafe for your clients. It's possible to make these assumptions correctly, but that requires keeping them in sync with all clients which is hard to do correctly.

Something that's sanitized from an HTML standpoint is not necessarily sanitized for native desktop & mobile applications, client UI frameworks, etc. For example, with Cloudflare's Cloudbleed security incident, malformed img tags sent by origin servers (which weren't by themselves unsafe in browsers) caused their edge servers to append garbage (including miscellaneous secure data) from heap memory to some requests that got indexed by search engines.

Sanitization is always the sole responsibility of the consumer of the content to make sure it presents any inbound data safely. Sometimes the "consumer" is colocated on the server (e.g. for server rendered HTML + no native/API users) but many times it's not.

replies(1): >>45676498 #
11. dylan604 ◴[] No.45676451{5}[source]
If you are using it at the client, sure, but then why is the server involved? If you are sending it to the server, you need to treat it as if it always comes from a hacker with very bad intentions. I don't care where the data comes from; my server will sanitize it for its own protection. After all, just because it left your browser "clean" does not mean it was not interfered with elsewhere upstream, TLS be damned. If we've double-encoded something, that's fine; it won't blow up the server. At the end of the day, that's what is most important. If some double decoding doesn't happen correctly on the client, then <shrugEmoji>
replies(1): >>45679264 #
12. dylan604 ◴[] No.45676498{3}[source]
> If you sanitize on the server, you are making assumptions about what is safe/unsafe for your clients.

No. I'm making decisions on what is safe for my server. I'm a back end guy, I don't really care about your front end code. I will never deem your front end code's requests as trustworthy. If the front end code cannot properly handle encoding, the back end code will do what it needs to do to not allow stupid string injection attacks. I don't know where your request has been. Just because you think it came from your code in the browser does not mean that was the last place it was altered before hitting the back end.

replies(1): >>45676709 #
13. dylan604 ◴[] No.45676516{3}[source]
You're making a bad assumption that client-side code was the last place the submitted string was altered on its way to the server. A man in the middle might have a different idea, so the server, as the last place the string can be sanitized, should always protect against that.
14. auxiliarymoose ◴[] No.45676709{4}[source]
How can user input be unsafe on the server? Are you evaluating it somehow?

User-generated content shouldn't be trusted in that way (inbound requests from client, data fields authored by users, etc.)

replies(1): >>45676826 #
15. dylan604 ◴[] No.45676826{5}[source]
Is that a serious question?

INSERT INTO table (user_name) VALUES ...

Are you one of today's 10000 on server side sanitizing of user data?

replies(2): >>45676888 #>>45676944 #
16. krapp ◴[] No.45676888{6}[source]
Are you one of today's 10000 on using parameterized queries and prepared statements?

Unless you're doing something stupid like concatenating strings into SQL queries, there's no need to "sanitize" anything going into a database. SQL injection is a solved problem.

Coming from the database and sending to the client, sure. But unless you're doing something stupid like concatenating strings into SQL statements it hasn't been necessary to "sanitize" data going into a database in ages.

Edit: I didn't realize until I reread this comment that I repeated part of it twice, but I'm keeping it in because it bears repeating.

replies(1): >>45677125 #
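
A minimal sketch of the parameterized-query approach, assuming Python's standard `sqlite3` module and an in-memory database. The `users` table and the sample names are made up for illustration; the `user_name` column follows the INSERT quoted earlier in the thread. The driver receives the value separately from the SQL text, so it is stored as data and never parsed as SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_name TEXT)")

# Hypothetical inputs: one hostile, one legitimate name with an apostrophe.
hostile_name = "Robert'); DROP TABLE users;--"
apostrophe_name = "O'Brien"

for name in (hostile_name, apostrophe_name):
    # The "?" placeholder binds the value; no string concatenation happens.
    conn.execute("INSERT INTO users (user_name) VALUES (?)", (name,))

rows = [row[0] for row in conn.execute("SELECT user_name FROM users ORDER BY rowid")]
print(rows)
```

Both values come back byte-for-byte as entered and the table still exists, with no escaping or stripping applied on the way in.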
17. auxiliarymoose ◴[] No.45676944{6}[source]
Communicating with a SQL driver by concatenating strings containing user input and then evaluating it? wat?

I'm very interested in what tech stack you are using where this is a problem.

replies(1): >>45677146 #
18. hoppp ◴[] No.45677125{7}[source]
SQL injection is solved if you use dependencies that solve it of course.

Other than SQL injection, there is command injection and log injection. File names need to be sanitized, as does any user-uploaded content that could carry XSS, and that includes images. Any incoming JSON data should be sanitized too, with extra fields removed, etc.

Log injection is a pretty nasty sort of hack that, depending on how the logs are processed, can lead to XSS or command injection.
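
As a rough illustration of the log injection case, here is a sketch in Python that assumes a line-oriented text log. The helper name `neutralize_for_log` and the log format are hypothetical, not taken from any particular logging library. The idea is to encode for the log context at the moment of writing, so that one user-supplied value cannot forge extra log lines:

```python
def neutralize_for_log(value: str) -> str:
    """Return value with backslashes doubled and non-printable characters
    (including CR and LF) rewritten as visible escape sequences."""
    parts = []
    for ch in value:
        if ch == "\\":
            parts.append("\\\\")  # keep the encoding unambiguous
        elif ch.isprintable():
            parts.append(ch)
        else:
            parts.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(parts)

# Hypothetical attack: a user name that tries to smuggle in a fake log entry.
forged = "alice\n2024-01-01 INFO admin logged in"
log_line = f"login failed for user={neutralize_for_log(forged)}"
print(log_line)
```

The forged entry stays on a single line with the newline shown as a literal `\n`. If the logs are later displayed in a web page, that viewer would still need its own HTML escaping.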

19. jfengel ◴[] No.45677146{7}[source]
People do it all the time, on any tech stack that lets you execute command strings. A lot of early databases didn't even support things like parameterized inserts.
20. padjo ◴[] No.45679264{6}[source]
Yeah as an Irish person with an apostrophe in their name this attitude is why my name routinely gets mangled or I get told my name is invalid.

You don’t escape input. You safely store it in the database and then sanitize it at the point where you’re going to use it.