I'm not sure how people who haven't already hit this very issue are supposed to know about it beforehand, though; it's one of those things you don't really come across until you're hit by it.
Fun to be learning new things so often, but I had never once heard of the public suffix list.
That said, I do know the other best practices mentioned elsewhere.
Yes. For instance in circumstances exactly as described in the thread you are commenting in now and the article it refers to.
Services like Google's bad-site warning system may use it to avoid considering a whole domain harmful when only a small number of its subdomains are, where otherwise they would. It is no guarantee, of course.
And it is an incredibly valuable thing. You might not think it is, but the internet is filled with utterly dangerous, scammy, phishy, malware-ridden websites, and every day Safe Browsing (via Chrome, Firefox and Safari - yes, Safari uses Safe Browsing) keeps users safe.
If Immich didn't follow best practice, that's Google's fault? You're showing your naivety and bias here.
Google is happy to take their money and show scammy ads. Google ads are the most common vector for fake software support scams. Most people google something like "microsoft support" and end up there. Has Google ever banned their own ad domains?
Google is the last entity I would trust to be neutral here.
In the past, browsers used an algorithm which only denied setting wide-ranging cookies for top-level domains with no dots (e.g. com or org). However, this did not work for top-level domains where only third-level registrations are allowed (e.g. co.uk). In these cases, websites could set a cookie for .co.uk which would be passed onto every website registered under co.uk.
Since there was and remains no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list. This is the aim of the Public Suffix List.
(https://publicsuffix.org/learn/)
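To make that concrete, here is a rough sketch of the kind of cookie check a client builds on top of the list. The handful of suffixes below is a tiny made-up subset for illustration only, and the real algorithm also handles wildcards and exceptions that are skipped here.

```typescript
// Minimal sketch of a cookie-domain check that consults the list.
// The suffix set is a tiny hard-coded subset of the real PSL, for illustration;
// real clients load the full published list.
const publicSuffixes = new Set(["com", "org", "uk", "co.uk"]);

// Longest matching suffix wins (checked from the full host downwards).
function publicSuffix(host: string): string {
  const labels = host.toLowerCase().split(".");
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (publicSuffixes.has(candidate)) return candidate;
  }
  return labels[labels.length - 1]; // unknown TLDs treated as suffixes
}

// A Domain= cookie attribute is only acceptable if it is below the public
// suffix, i.e. an actual registrable domain or deeper.
function cookieDomainAllowed(cookieDomain: string): boolean {
  return cookieDomain !== publicSuffix(cookieDomain);
}

console.log(cookieDomainAllowed("example.co.uk")); // true  - a registrable domain
console.log(cookieDomainAllowed("co.uk"));         // false - would cover every .co.uk site
console.log(cookieDomainAllowed("com"));           // false - the old "no dots" rule catches this one
```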
So, once they realized web browsers are all inherently flawed, their solution was to maintain a static list of websites. God I hate the web. The engineering equivalent of a car made of duct tape.
Google from the '90s to 2010 is nothing like Google in 2025. There is a reason they removed "Don't be evil"... being evil and authoritarian makes more money.
Looking at you Manifest V2 ... pour one out for your homies.
They don't need to mention it because they handle it on behalf of the client. Them recommending best practices like using separate domains makes as much sense as them recommending what TLS configs to use.
>or where Apple or Google or Mozilla have a listing hosting best practices that include avoiding false positives by Safe Browsing…
Since when were those sites the go-to place to learn how to host a site? Apple doesn't offer anything related to web hosting besides "a computer that can run nginx". Google might be the place to ask if you were your aunt and "google" means "internet" to her. Mozilla is the most plausible one because they host MDN, but hosting documentation on HTML/CSS/JS doesn't necessarily mean they offer hosting advice, any more than you'd expect docs.djangoproject.com to contain hosting advice.
End of random rant.
But then you would lose platform independence, the main selling point of this atrocity.
Having all those APIs in a sandbox that mostly just works on billions of devices is pretty powerful, and a potential successor to HTML would have to beat that to be adopted.
The best thing that could happen, as far as I can see, is that a sane subset crystallizes that people predominantly use, with the rest becoming legacy, maintained only to keep it working.
I have dreamed of a fresh rewrite of the web since university (and the web was way slimmer back then), but I've become a bit more pragmatic, and I think I now understand the massive problem of solving trusted human communication better. It ain't easy in the real world.
I don't see how that solves the issue that PSL tries to fix. I was a script kiddy hosting neopets phishing pages on free cpanel servers from <random>.ripway.com back in 2007. Browsers were way less capable then.
If Amazon shut down your AWS account because those same scammers used those domains to host CP rather than phishing pages, would you accept the excuse of "how was I supposed to know?"
1. Immich hosts user content on their domain, and should thus be on the public suffix list.
2. When users host an open source self hosted project like immich, jellyfin, etc. on their own domain it gets flagged as phishing because it looks an awful lot like the publicly hosted version, but it's on a different domain, and possibly a domain that might look suspicious to someone unfamiliar with the project, because it includes the name of the software in the domain. Something like immich.example.com.
The first one is fairly straightforward to deal with, if you know about the public suffix list. I don't know of a good solution for the second though.
I get that SPAM, etc., are an issue, but, like f* google-chrome, I want to browse the web, not some carefully curated list of sites some giant tech company has chosen.
A) you shouldn't be using google-chrome at all B) Firefox should definitely not be using that list either C) if you are going to have a "safe sites" list, it should definitely be a non-profit running it, not an automated robot working for a large probably-evil company...
I was just deploying your_spotify and gave it your-spotify.<my services domain>, and there was a warning in the logs that talked about this, linking the issue:
This all just drives a need to come up with ever more tacked-on protection schemes because browsers have big targets painted on them.
It's browser beware when you do, but you can do it.
I think the giant major downside is that they've written a rootkit that runs on everything, and to try to make up for that they want to make it so only sites they allow can run.
It's not really very powerful at all if nobody can use it; at that point you are better off just not bothering with it at all.
The Internet may remain, but the Web may really be dead.
For example, if users are supposed to log in on the base account in order to access content on the subdomains, then using the public suffix list would be problematic.
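To make that concrete, here is a hedged sketch of the kind of shared-login setup that breaks. The hostnames and cookie are made up, not Immich's actual setup:

```typescript
// Sketch: a session cookie scoped to the parent domain so that
// alice.immich.cloud and bob.immich.cloud would both receive it.
// Hostnames and values are illustrative only.
import { createServer } from "node:http";

createServer((req, res) => {
  res.setHeader(
    "Set-Cookie",
    "session=abc123; Domain=immich.cloud; Path=/; Secure; HttpOnly"
  );
  // Once immich.cloud is on the public suffix list, browsers treat it like
  // co.uk: the Domain=immich.cloud attribute is refused and the cookie is
  // dropped, so cross-subdomain login via a parent-domain cookie stops working.
  res.end("ok");
}).listen(8080);
```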
WebUSB I don't use and wouldn't miss right now, but... the main potential use case is security, and it sounds somewhat reasonable:
"Use in multi-factor authentication
WebUSB in combination with special purpose devices and public identification registries can be used as key piece in an infrastructure scale solution to digital identity on the internet."
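For what it's worth, a hedged sketch of the permission model that quote relies on: WebUSB access is gated behind a user gesture and a device chooser, not ambient access for any page. The vendor ID, button id, and the MFA framing are placeholders, and the snippet assumes WebUSB type definitions are available.

```typescript
// Sketch of the WebUSB permission flow; vendorId and #connect are placeholders.
async function connectSecurityDevice(): Promise<void> {
  // Must be called from a user gesture (e.g. a click handler);
  // the browser shows a chooser and the user picks the device.
  const device = await navigator.usb.requestDevice({
    filters: [{ vendorId: 0x1234 }], // made-up vendor ID
  });
  await device.open();
  await device.selectConfiguration(1);
  await device.claimInterface(0);
  // From here a site could speak a vendor-specific protocol to the device,
  // e.g. the kind of identity/attestation exchange the quote above imagines.
  console.log(`Connected to ${device.productName ?? "unknown device"}`);
}

document.querySelector("#connect")?.addEventListener("click", () => {
  connectSecurityDevice().catch(console.error);
});
```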
The problem is that at least some of the people maintaining this list seem to be a little trigger-happy. And I definitely think Google probably isn't the best custodian of such a list, as they have obvious conflicts of interest.
But people do use it, like the both of us right now?
People also use maps, do online banking, play games, start complex interactive learning environments, collaborate in real time on documents etc.
All of that works right now.
You have sites now that let you debug microcontrollers on your browser, super cool.
Same thing but with firmware updates in the browser. Cross platform, replaced a mess of ugly broken vendor tools.
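Those in-browser debuggers and flashers typically sit on top of the Web Serial API. A minimal hedged sketch of what that looks like, assuming the Web Serial type definitions are available; this just opens a port and echoes the board's output, while real flashers layer a bootloader protocol on top:

```typescript
// Sketch of the Web Serial API those in-browser tools build on.
async function openSerialConsole(): Promise<void> {
  const port = await navigator.serial.requestPort(); // user picks the device
  await port.open({ baudRate: 115200 });

  const reader = port.readable!.getReader();
  const decoder = new TextDecoder();
  try {
    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      console.log(decoder.decode(value)); // print whatever the board sends
    }
  } finally {
    reader.releaseLock();
    await port.close();
  }
}
```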
This might be what's needed to break out of the current local optimum.
It's not even broken as the edge cases are addressed by ad-hoc solutions.
OP is complaining about global infrastructure not having a pristine design. At best it's a complaint about a desirable trait. It's hardly a reason to pull the Jr developer card and mindlessly advocate for throwing everything out and starting over.
This is the first thing I disable in Chrome, Firefox and Edge. The only safe thing they do is safely sending all my browsing history to Google or Microsoft.
This is mostly a browser security mistake but also partly a product of ICANN policy & the design of the domain system, so it's not just the web.
Also, the list isn't really that long, compared to, say, certificate transparency logs; now that's a truly mad solution.
People are reacting as if this list is some kind of overbearing way of tracking what people do on the web - it's almost the opposite of that. It's worth clarifying this is just a suffix list for user-hosted content. It's neither a list of user-hosted domains nor a list of safe websites generally - it's just suffixes for a very small specific use-case: a company providing subdomains. You can think of this as a registry of domain sub-letters.
For instance:
- GitHub.io is on the list but GitHub.com is not; GitHub.com is still considered safe
- I self-host an immich instance on my own domain name - my immich instance isn't flagged & I don't need to add anything to the list because I fully own the domain.
The specific instance is just for Immich themselves who fully own "immich.cloud" but sublet subdomains under it to users.
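A hedged sketch of what such a private-section entry actually changes for a consumer of the list. The suffix set here is a tiny made-up subset, for illustration only:

```typescript
// Illustration of the consequence of a private-section entry like github.io.
const suffixes = new Set(["com", "io", "github.io"]); // tiny subset of the real list

function registrableDomain(host: string): string {
  const labels = host.toLowerCase().split(".");
  for (let i = 0; i < labels.length - 1; i++) {
    if (suffixes.has(labels.slice(i + 1).join("."))) {
      return labels.slice(i).join("."); // one label below the matched suffix
    }
  }
  return host;
}

// Because github.io is listed, two users' pages are treated as unrelated sites:
console.log(registrableDomain("alice.github.io")); // "alice.github.io"
console.log(registrableDomain("bob.github.io"));   // "bob.github.io"

// github.com is not listed, so everything under it is one site owned by GitHub:
console.log(registrableDomain("gist.github.com")); // "github.com"
```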
> if you are going to have a "safe sites" list
This is not a safe sites list! This is not even a sites list at all - suffixes are not sites. This also isn't even a "safe" list - in fact it's really a "dangerous" list for browsers & various tooling to effectively segregate security & privacy contexts.
Google is flagging the Immich domain not because it's missing from the safe list but because it has legitimate dangers & it's missing from the dangerous list that informs web clients of said dangers so they can handle them appropriately.
I know the second issue can be a legitimate problem but I feel like the first issue is the primary problem here & the "solution" to the second issue is a remedy that's worse than the disease.
The public suffix list is a great system (despite getting serious backlash here in HN comments, mainly from people who have jumped to wildly exaggerated conclusions about what it is). Beyond that though, flagging domains for phishing for having duplicate content smells like an anti-self-host policy: sure there's phishers making clone sites, but the vast majority of sites flagged are going to be legit unless you employ a more targeted heuristic, but doing so isn't incentivised by Google's (or most company's) business model.
What malicious UGC would you even deliver over this domain? An image with scam instructions? CSAM isn't even in scope for Safe Browsing, just phishing and malware.
A centralized list like this not just for domains as a whole (e.g. co.uk) but also specific sites (e.g. s3-object-lambda.eu-west-1.amazonaws.com) is both kind of crazy in that the list will bloat a lot over the years, as well as a security risk for any platform that needs this functionality but would prefer not to leak any details publicly.
We already have the concept of a .well-known directory that you can use, when talking to a specific site. Similarly, we know how you can nest subdomains, like c.b.a.x, and it's more or less certain that you can't create a subdomain b without the involvement of a, so it should be possible to walk the chain.
Example:
c --> https://b.a.x/.well-known/public-suffix
b --> https://a.x/.well-known/public-suffix
a --> https://x/.well-known/public-suffix
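A sketch of how a client could walk that chain. To be clear, this endpoint doesn't exist today; the path, the "true" response format, and the first-match-wins semantics are all hypothetical, just illustrating the proposal above:

```typescript
// Hypothetical client-side walk of the proposed .well-known/public-suffix scheme.
async function findDeclaredSuffix(host: string): Promise<string | null> {
  const labels = host.split(".");
  // Walk from the immediate parent up toward the TLD: b.a.x, a.x, x
  for (let i = 1; i < labels.length; i++) {
    const parent = labels.slice(i).join(".");
    try {
      const res = await fetch(`https://${parent}/.well-known/public-suffix`);
      if (res.ok && (await res.text()).trim() === "true") {
        return parent; // parent declares: "things below me belong to other parties"
      }
    } catch {
      // parent unreachable: treat as "not a declared suffix" and keep walking
    }
  }
  return null;
}
```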
Maybe ship the domains with the browsers and such and leave generic sites like AWS or whatever to describe things themselves. Hell, maybe that could also have been a TXT record in DNS as well.

The fact it's used by one or more browsers in that way is a lawsuit waiting to happen.
Because they, the browsers, are pointing a finger to someone else and accusing them of criminal behavior. That is what a normal user understands this warning as.
Turns out they are wrong. And in being wrong they may well have harmed the party they pointed at, in reputation and / or sales.
It's remarkable how short-sighted this is, given that the web is so international. It's not a defense to say some third party has a list, and you're not on it so you're dangerous.
Incredible
We live in a world where whatever FAANG adopts is de facto a standard. "Accessible" these days means Google/Gmail/Facebook/Instagram/TikTok work. Everything else is usually forced to follow along.
People will adopt whatever gives them access to their daily dose of doomscrolling and then complain about rather crucial parts of their lives, like online banking, not working.
> And of course, if the new solution completely invalidates old sites, it just won't get picked up.
Old sites don't matter, only high-traffic sites riddled with dark patterns matter. That's the reality, even if it is harsh.
I appreciate the issue it tries to solve but it doesn't seem like a sane solution to me.
You remove that, and videoconferencing (for business or person to person) has to rely on downloading an app, meaning whoever is behind the website has to release for 10-15 OSes now. Some already do, but not everyone has that budget so now there's a massive moat around it.
> But do we need e.g serial port or raw USB access straight from a random website
Being able to flash an IoT device (e.g. an ESP32) from the browser is useful for a lot of people. For the "normies", there was also Stadia allowing you to flash their controller to be a generic Bluetooth/USB one on a website, using WebUSB. Without it, Google would have had to release an app for multiple OSes, or more likely, would have just left the devices as paperweights. Also, you can use FIDO/U2F keys directly now, which is pretty good.
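That FIDO/security-key path goes through WebAuthn rather than raw USB access, which is part of the point: the browser and OS talk to the key, the page never touches the device directly. A hedged minimal sketch; in a real flow the challenge comes from the server and the response is verified there:

```typescript
// Sketch of using a FIDO security key via WebAuthn (navigator.credentials).
// The challenge below is a local placeholder; real flows fetch it from the server.
async function signInWithSecurityKey(): Promise<void> {
  const assertion = await navigator.credentials.get({
    publicKey: {
      challenge: crypto.getRandomValues(new Uint8Array(32)), // placeholder
      userVerification: "preferred",
      timeout: 60_000,
    },
  });
  // The resulting assertion (a signature over the challenge) is POSTed to the
  // server, which checks it against the public key registered earlier.
  console.log("Got assertion:", assertion?.id);
}
```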
Browsers are the modern Excel, people complain that they do too much and you only need 20%. But it's a different 20% for everyone.
What do you mean, you can run whatever you want on localhost, and it's quite easy to host whatever you want for whoever you want too. Maybe the biggest modern added barrier to entry is that having TLS is strongly encouraged/even needed for some things, but this is an easily solved problem.
One of the internet's biggest sources of scams, phishing, malware, and everything you are complaining about is Google AdSense.
Google is using the list to bully out competitors, while telling you it's for keeping you safe.
_You_ are showing naivety and bias.