But mostly I'm just happy that it's finally here. I do appreciate all the hard work people have been doing to get this live.
Emphasis mine. I do not understand this design choice. If I explicitly allow the `script` tag, why should it be stripped?
If the method were called setXSSSafeSubsetOfHTML, sure, I guess, but it feels weird for setHTML to have an impossible-to-override filter.
There's no reason to not sanitize data from the client, yet every reason to sanitize it.
Meanwhile, there's "setHTMLUnsafe()" and, of course, good old .innerHTML.
Point being, if you can move sanitization even closer to where it is used, and that sanitization is actually provided by the standard library of the platform in question, that's a massive win.
They empty their contents into the new parent when they're appended, so they can't be meaningfully appended a second time without rebuilding them.
`<template>` is meant to be reused, since you're meant to clone it in order to use it, and then you can clone it again.
While lit-html templates are already XSS-hardened because template strings aren't forgeable, we do have utilities like `unsafeHTML()` that let you treat untrusted strings as HTML, which are currently... unsafe.
With `Element.setHTML()` we can make a `safeHTML()` directive and let the developer specify sanitizer options too.
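To make that concrete, here is a minimal sketch of the idea outside of any framework. It is not lit-html's actual directive. `setSafeHTML` is a hypothetical helper name, and the `{ sanitizer: ... }` option shape and the `elements` allow-list key are my reading of the Sanitizer API draft, so treat them as assumptions. Where `setHTML()` is missing, it falls back to inserting the string as plain text rather than parsing it.

```javascript
// Hypothetical helper (not part of lit-html or any library):
// use Element.setHTML() when the browser provides it, otherwise
// fall back to plain text so untrusted markup is never parsed.
function setSafeHTML(element, untrustedHtml, sanitizerConfig) {
  if (typeof element.setHTML === "function") {
    // Assumption: the options bag takes a `sanitizer` member, per the draft spec.
    const options = sanitizerConfig ? { sanitizer: sanitizerConfig } : undefined;
    element.setHTML(untrustedHtml, options);
    return "sanitized";
  }
  // No sanitizer available: show the markup as literal text.
  element.textContent = untrustedHtml;
  return "text";
}

// Browser usage (assumed config shape):
// setSafeHTML(document.querySelector("#comment"), userInput, {
//   elements: ["p", "b", "em", "a"],
// });
```

The fallback is deliberately conservative: losing formatting in an older browser is a smaller problem than silently dropping back to `.innerHTML`.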
I think a config object in which you define script options like sanitization and other script configuration might be helpful.
After all, backward compatibility almost always needs to be ensured, and this might work. I am no spec guy, it is just an idea. React makes use of "use client/server", so this would be more central and explicit.
Something like "setSafeHTML()" would be preferable. (Since it's Mozilla, there should be a few committee meetings to come up with the appropriate name)...
Something that's sanitized from an HTML standpoint is not necessarily sanitized for native desktop & mobile applications, client UI frameworks, etc. For example, with Cloudflare's CloudBleed security incident, malformed img tags sent by origin servers (which weren't by themselves unsafe in browsers) caused their edge servers to append garbage (including miscellaneous secure data) from heap memory to some requests that got indexed by search engines.
Sanitization is always the sole responsibility of the consumer of the content to make sure it presents any inbound data safely. Sometimes the "consumer" is colocated on the server (e.g. for server rendered HTML + no native/API users) but many times it's not.
No. I'm making decisions on what is safe for my server. I'm a back end guy, I don't really care about your front end code. I will never deem your front end code's requests as trustworthy. If the front end code cannot properly handle encoding, the back end code will do what it needs to do to not allow stupid string injection attacks. I don't know where your request has been. Just because you think it came from your code in the browser does not mean that was the last place it was altered before hitting the back end.
<p>Hello <scr<script>ipt>alert(1)</scr<script>ipt> World</p>
The program outputs:
$ node .
<p>Hello <script>alert(1)</script> World</p>
{
  sanitizedHTML: '<p>Hello <script>alert(1)</script> World</p>',
  wasModified: true,
  removedElements: [],
  removedAttributes: []
}
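For anyone wondering how output like that can happen, here is a minimal sketch of the failure mode. I don't know what the posted function actually does internally; `naiveStripScriptTags` is a hypothetical stand-in that removes literal script tags in a single regex pass, which is enough to reproduce the same result.

```javascript
// Hypothetical naive sanitizer: deletes literal <script> and </script>
// tags in ONE pass. Shown only to illustrate the bypass; do not use it.
function naiveStripScriptTags(html) {
  return html.replace(/<\/?script>/gi, "");
}

const nestedInput = "<p>Hello <scr<script>ipt>alert(1)</scr<script>ipt> World</p>";

// Removing the inner "<script>" splices "<scr" + "ipt>" back together.
const nestedOutput = naiveStripScriptTags(nestedInput);
console.log(nestedOutput);
// <p>Hello <script>alert(1)</script> World</p>
```

Looping until the string stops changing closes this particular hole, but string matching is still not an HTML parser, which is the argument for a sanitizer that works on the parsed tree.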
Asking a chatbot to make a security function and then posting it for others to use without even reviewing it is not only disrespectful, but dangerous and grossly negligent. Please take this down.
Two, even if we did, DOMPurify is ~2.7x bigger than lit-html core (3.1Kb minzipped), and the unsafeHTML() directive is less than 400 bytes minzipped. It's just really big to take on a sanitizer, and which one to use is an opinion we'd have to have. And lit-html is extensible and people can already write their own safeHTML() directive that uses DOMPurify.
For us it's a lot simpler to have safe templates, an unsafe directive, and not parse things too finely in between.
A built-in API is different for us though. It's standard, stable, and should eventually be well known by all web developers. We can integrate it with no extra dependencies or code, and just adopt the standard platform options.
User-generated content shouldn't be trusted in that way (inbound requests from client, data fields authored by users, etc.)
https://ibrahimtanyalcin.github.io/Cahir/
the whole rendering uses a single fragment.
Unless you're doing something stupid like concatenating strings into SQL queries, there's no need to "sanitize" anything going into a database. SQL injection is a solved problem.
Coming from the database and sending to the client, sure. But unless you're doing something stupid like concatenating strings into SQL statements it hasn't been necessary to "sanitize" data going into a database in ages.
Edit: I didn't realize until I reread this comment that I repeated part of it twice, but I'm keeping it in because it bears repeating.
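For anyone unsure what "not concatenating" looks like in practice, here is a rough sketch. The `sql` tag below is a hypothetical helper written for this comment, not a real library API. It only shows the principle: the query text gets numbered placeholders and the user's value travels separately, so the database never parses it as SQL. The `{ text, values }` shape is what node-postgres accepts as a query config; other drivers use different placeholder styles.

```javascript
// Hypothetical tagged-template helper: turns interpolations into
// numbered placeholders ($1, $2, ...) and collects the values apart
// from the SQL text, instead of concatenating them into the string.
function sql(strings, ...values) {
  const text = strings.reduce(
    (acc, part, i) => acc + part + (i < values.length ? "$" + (i + 1) : ""),
    ""
  );
  return { text, values };
}

const studentName = "Robert'); DROP TABLE students;--";
const studentQuery = sql`SELECT id FROM students WHERE name = ${studentName}`;

console.log(studentQuery.text);
// SELECT id FROM students WHERE name = $1
console.log(studentQuery.values.length);
// 1
```

The hostile string is stored or compared as-is. Nothing was "sanitized" on the way in, and nothing needed to be.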
I'm very interested in what tech stack you are using where this is a problem.
Other than SQL injection, there is command and log injection. File names need to be sanitized, as does any user-uploaded content (for XSS), and that includes images. Any incoming JSON data should be sanitized too: extra fields removed, etc.
Log injection is a pretty nasty sort of hack that, depending on how the logs are processed, can lead to XSS or command injection.
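A minimal sketch of the log injection case, assuming a line-oriented text log. `escapeForLog` is a hypothetical helper, and it only neutralizes CR/LF so a user-supplied value cannot start a forged log line. A log viewer that renders HTML would still need output encoding on its side.

```javascript
// Hypothetical helper: replace raw CR/LF with visible escapes so one
// user-supplied value can never span more than one log line.
function escapeForLog(value) {
  return String(value).replace(/[\r\n]/g, (ch) => (ch === "\r" ? "\\r" : "\\n"));
}

// Attacker-controlled username that tries to forge a second entry.
const hostileUsername = "alice\nINFO admin logged in";
const logLine = `login failed for user="${escapeForLog(hostileUsername)}"`;

console.log(logLine);
// login failed for user="alice\nINFO admin logged in"
```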
---
Libraries can surely do the same job, but then the exact behavior would vary among a sea of those libs. Having specs defined [0] for such an interface would hopefully iron out much of these variations, as well as enabling some performance gains.
[0]: https://wicg.github.io/sanitizer-api/#dom-element-sethtml
That ship has long since sailed. Browsers are so complex that it takes quite some effort to support the various levels of 9s of the percentage of compatibility with standards, not to mention the browser makers themselves define many of the standards.
When you have 2 of something and one is safe/better and the other one is known to be problematic, you give the awkward name to the problematic one and the obvious name to the safe/better one. Noobs oughtn’t to be attempting the other one, and anyone who is mature enough to have reason to do it is mature enough to appreciate the reason behind that complexity.
I’d’ve made it a runtime error to call setHTML with an unsafe config, but JavaScript tends toward implicit reinterpretation rather than erroring out.