684 points prettyblocks | 8 comments

I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
antonok ◴[] No.42786841[source]
I've been using Llama models to identify cookie notices on websites, for the purpose of adding filter rules to block them in EasyList Cookie. Normally this is done, essentially, through manual volunteer reporting.

Most cookie notices turn out to be pretty similar, HTML/CSS-wise, and then you can grab their `innerText` and filter out false positives with a small LLM. I've found the 3B models have decent performance on this task, given enough prompt engineering. They do fall apart slightly around edge cases like less common languages or combined cookie notice + age restriction banners. 7B has a negligible false-positive rate without much extra cost. Either way these things are really fast and it's amazing to see reports streaming in during a crawl with no human effort required.

Code is at https://github.com/brave/cookiemonster. You can see the prompt at https://github.com/brave/cookiemonster/blob/main/src/text-cl....
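
For anyone who wants to see the rough shape of the filtering step described above, here is a minimal Python sketch. It is not the code or the prompt from the linked repository: the prompt wording, the function names, and the `llama3.2:3b` model tag are all assumptions made for illustration. It does assume a local Ollama server on its default port, queried through the `/api/generate` endpoint with streaming turned off.

```python
import json
import urllib.request
from typing import Callable

# Hypothetical prompt for illustration; the real one lives in the linked repo.
PROMPT_TEMPLATE = (
    "You are a classifier. Answer with exactly one word, YES or NO.\n"
    "Is the following text from a website's cookie consent notice?\n\n"
    "Text:\n{text}\n\nAnswer:"
)


def build_prompt(inner_text: str, max_chars: int = 2000) -> str:
    """Collapse whitespace and truncate the element text before prompting."""
    cleaned = " ".join(inner_text.split())
    return PROMPT_TEMPLATE.format(text=cleaned[:max_chars])


def parse_verdict(reply: str) -> bool:
    """Treat any reply that does not start with YES as a rejection."""
    return reply.strip().upper().startswith("YES")


def ask_ollama(prompt: str,
               model: str = "llama3.2:3b",  # assumed model tag
               host: str = "http://localhost:11434") -> str:
    """Send one non-streaming generation request to a local Ollama server."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["response"]


def is_cookie_notice(inner_text: str,
                     ask: Callable[[str], str] = ask_ollama) -> bool:
    """Classify candidate element text; `ask` is injectable for testing."""
    return parse_verdict(ask(build_prompt(inner_text)))
```

In a crawl you would call `is_cookie_notice(text)` on the `innerText` of each candidate element and only report the ones that come back `True`. Defaulting to "no" on any unexpected reply is a design choice that trades some recall for fewer false positives.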

1. GardenLetter27 ◴[] No.42793119[source]
It's funny that this is even necessary though - that great EU innovation at work.
2. kalaksi ◴[] No.42794055[source]
Tracking, tracking cookies, banners etc. are a choice made by the website. There are browser addons for making it simpler, though.

The transparency and consent requirements for collecting all kinds of PII (this is what the regulation is about) actually are a great innovation.

3. docmars ◴[] No.42794440[source]
I think I'd rather see cookie notices handled by a browser API with a common UI, where the default is always "No." Provide that common UI in a popover accessed in the address bar, or a side pane in the browser itself.

If a user logs in or does something requiring cookies that would otherwise prevent normal functionality, prompt them with a Permissions box if they haven't already accepted it in the usual (optional) UI.

4. kalaksi ◴[] No.42794593{3}[source]
Cookies for normal functionality don't require consent anyway.

But yes, I think just about everybody would like the UX you described. But the entities that track you don't want to make it that easy. You probably know of the do-not-track header too.
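
To make the do-not-track idea concrete, here is a small Python sketch of how a server could read the `DNT` request header (`1` means the user opts out of tracking, `0` means they consent, absent means no preference expressed). The function names and the policy in `may_set_tracking_cookies` are hypothetical, chosen to mirror the "default is always No" behaviour described above; nothing here is required by any regulation or implemented by any particular site.

```python
from typing import Mapping, Optional


def dnt_preference(headers: Mapping[str, str]) -> Optional[bool]:
    """Return True if the user opted out, False if they consented, else None."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "dnt":
            value = value.strip()
            if value == "1":
                return True
            if value == "0":
                return False
    return None


def may_set_tracking_cookies(headers: Mapping[str, str],
                             explicit_consent: bool = False) -> bool:
    """Hypothetical policy: DNT: 1 always wins, otherwise the default is no."""
    if dnt_preference(headers) is True:
        return False
    return explicit_consent
```

With this policy, a request carrying `DNT: 1` never gets tracking cookies, and everyone else only gets them after an explicit opt-in.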

5. pornel ◴[] No.42795154[source]
The legislation has been watered down by lobbying of the trillion-dollar tracking industry.

The industry knows ~nobody wants to be tracked, so they don't want tracking preferences to be easy to express. They want cookie notices to be annoying, so that people associate privacy with bureaucratic nonsense and stop demanding it.

There was the P3P spec in 2002: https://www.w3.org/TR/P3P/

It even got a decent implementation in Internet Explorer, but Google has been deliberately sending a junk P3P header to bypass it.

It has been tried again with a very simple DNT spec. Support for it (that barely existed anyway) collapsed after Microsoft decided to turn Do-Not-Track on by default in Internet Explorer 10.

6. vvillena ◴[] No.42796348[source]
Bear in mind, those arcane cookie forms are probably not compliant with EU laws. If there's not a "reject" button next to the "accept" button, the form is almost definitely not to spec.
7. YetAnotherNick ◴[] No.42797274{3}[source]
There isn't any way the EU didn't know this was possible and a better choice. There was already a DNT header that they could have regulated. It also knew the harm to the ad industry.
8. Fraaaank ◴[] No.42797544{4}[source]
There isn't any rule that requires websites to use a cookie banner. You're required to obtain explicit consent before reading/setting any cookies that aren't strictly necessary. The web came up with the cookie banner.

Google could've implemented a consent API in Chrome, but they didn't. Guess why.