
122 points azath92 | 2 comments

TLDR: Build a quick HN profile to see how little context LLMs need to personalise your feed. Rate 30 posts once, get a permanent ranked homepage you can return to.

Our goal was to build a tool that let us test a range of "personal contexts" on a very focused everyday use case of ours: reading HN!

We are exploring the use of personal context with LLMs: specifically, what kind of data, how much of it, and how much additional effort on the user's part is needed to get decent results. The test tool turned out to be a bit of fun on its own, so we re-skinned it and decided to post it here.

First time posting anything on HN, but folks at work encouraged me to drop a link. Keen for feedback, or pointers to other interesting projects thinking about bootstrapping personal context for LLM workflows!

password4321 ◴[] No.44455331[source]
As far as rating posts: user favorites are public, and you could ask for a copy+paste of a few pages of upvoted stories if someone is not using the favorites feature. The stories that have been commented on are also a pretty strong public signal.
replies(1): >>44455417 #
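For reference, a user's favorites are served as plain HTML at news.ycombinator.com/favorites?id=&lt;user&gt; (they're not in the official API). A minimal sketch of pulling titles and links out of a saved copy of that page; the markup pattern is an assumption based on HN's current `titleline` spans and could break if the site changes:

```python
import re


def parse_favorites(html: str) -> list[tuple[str, str]]:
    """Extract (title, url) pairs from an HN favorites/upvoted page.

    Assumes HN's current markup, where each story title is an <a>
    inside a <span class="titleline"> element.
    """
    pattern = re.compile(
        r'<span class="titleline"><a href="([^"]+)"[^>]*>([^<]+)</a>'
    )
    # findall yields (url, title); flip to (title, url) for readability.
    return [(title, url) for url, title in pattern.findall(html)]
```

The same parser would work on a copy-pasted page source of the upvoted-stories page, since both use the same story-row markup.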
1. azath92 ◴[] No.44455417[source]
This is an angle we honestly didn't think about (we are pretty much long-time lurkers), but accessing existing HN content is a great idea! I didn't even know there was a page of upvoted submissions :) It doesn't look like that's available via the API, but a copy-paste of some text should work just as well; all we pass to the LLM is titles and URLs to generate the profile, so it's much the same.
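Since only titles and URLs reach the LLM, the profile-generation prompt can be quite small. A minimal sketch of that kind of prompt; the wording and item shape here are our own illustration, not the actual tool's:

```python
def build_profile_prompt(rated: list[dict]) -> str:
    """Assemble a profile-generation prompt from rated stories.

    `rated` items are dicts like {"title": ..., "url": ..., "liked": bool};
    only titles and URLs are ever included.
    """
    liked = [r for r in rated if r["liked"]]
    passed = [r for r in rated if not r["liked"]]

    lines = ["Infer this reader's interests from the stories below.", "", "Liked:"]
    lines += [f"- {r['title']} ({r['url']})" for r in liked]
    lines += ["", "Passed on:"]
    lines += [f"- {r['title']} ({r['url']})" for r in passed]
    lines += ["", "Write a short interest profile usable for ranking new stories."]
    return "\n".join(lines)
```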

More generally, the next feature we want for ourselves is a way to add some generic free text and "update" the profile with it, rather than regenerating it exclusively from the 30 examples. This circles back to using the tool as a focus point for thinking about what data is enough to generate a good user profile, and what "good" means here.

replies(1): >>44456019 #
2. joseda-hg ◴[] No.44456019[source]
Given the nature of the small pool (and the way it naturally excludes/includes topics), I'd strongly prefer some way of adding more than 30 samples. Maybe keep track of each calibration set taken and compare?
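One crude way to compare profiles built from different calibration sets is top-k overlap between the story rankings each profile produces. A sketch under the assumption that the scoring/ranking itself happens elsewhere and each ranking is a list of story IDs:

```python
def rank_overlap(ranking_a: list[str], ranking_b: list[str], k: int = 10) -> float:
    """Fraction of shared stories in the top-k of two ranked lists.

    1.0 means the two calibration sets produced identical top-k picks;
    0.0 means they agree on nothing near the top.
    """
    top_a, top_b = set(ranking_a[:k]), set(ranking_b[:k])
    return len(top_a & top_b) / k
```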