←back to thread

304 points nimbusega | 1 comments | | HN request time: 0.258s | source
Show context
nimbusega ◴[] No.42067000[source]
I made this to experiment with embeddings and explore how different ways of displaying information affect your perception.

It gets the top 100 stories, sends their html to GPT-4 to extract the main content (this was not producing good enough results with html parsing) and then gets an embedding using the title and content.

Likes/dislikes are stored in local storage and compared against all stories using cosine similarity to find the most relevant stories.

It costs about $10/day to run. I was thinking of offering additional value for a small subscription. Maybe more pages of the newspaper, full story content/comments, a weekly digest or ePub export or something?

replies(4): >>42067307 #>>42067813 #>>42072116 #>>42072371 #
1. tagawa ◴[] No.42072371[source]
Nice – I like this a lot. I feel like I'd use this for slow-lane reading and the original HN site when I'm in a rush.

Regarding HTML to GPT-4, I seem to remember commenters here saying they got better results by converting the HTML to Markdown first, then sending to an LLM. Might save a bit of money too.