
564 points nimbusega | 2 comments
nimbusega ◴[] No.42067000[source]
I made this to experiment with embeddings and explore how different ways of displaying information affect your perception.

It gets the top 100 stories, sends each page's HTML to GPT-4 to extract the main content (plain HTML parsing was not producing good enough results), and then computes an embedding from the title and content.

Likes/dislikes are stored in local storage and compared against all stories using cosine similarity to find the most relevant stories.
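
A minimal sketch of the ranking step described above, in Python for illustration only (the actual site presumably does this in the browser). The scoring rule, mean similarity to liked stories minus mean similarity to disliked ones, is an assumption and is not stated in the post; `rank_stories` and the toy two-dimensional embeddings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_stories(stories, liked, disliked):
    """Return story ids sorted from most to least relevant.

    stories:  dict mapping story id -> embedding (list of floats)
    liked:    ids the reader liked
    disliked: ids the reader disliked

    Assumed score: mean similarity to liked stories
    minus mean similarity to disliked stories.
    """
    def mean_similarity(vec, ids):
        if not ids:
            return 0.0
        return sum(cosine_similarity(vec, stories[i]) for i in ids) / len(ids)

    def score(story_id):
        vec = stories[story_id]
        return mean_similarity(vec, liked) - mean_similarity(vec, disliked)

    return sorted(stories, key=score, reverse=True)


if __name__ == "__main__":
    # Toy embeddings; real ones would have hundreds of dimensions.
    stories = {
        "a": [1.0, 0.0],
        "b": [0.9, 0.1],
        "c": [0.0, 1.0],
        "d": [0.1, 0.9],
    }
    print(rank_stories(stories, liked=["a"], disliked=["c"]))
```

With "a" liked and "c" disliked, stories pointing in the same direction as "a" sort first and those close to "c" sort last.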

It costs about $10/day to run. I was thinking of offering additional value for a small subscription. Maybe more pages of the newspaper, full story content/comments, a weekly digest or ePub export or something?

replies(4): >>42067307 #>>42067813 #>>42072116 #>>42072371 #
jzombie ◴[] No.42067813[source]
> Likes/dislikes are stored in local storage and compared against all stories using cosine similarity to find the most relevant stories.

You're referring to using the embeddings for cosine similarity?

I am doing something similar with stocks. I'm taking several decades' worth of 10-Q statements for a majority of stocks, plus weighted ETF holdings, and using an autoencoder to generate embeddings that I compare with cosine similarity and Euclidean distance via Rust WASM.
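
For readers wondering why one might run both metrics: a rough Python sketch (not the Rust WASM code mentioned above, which is not shown in this thread) of how cosine and Euclidean distance can disagree on the same embeddings. The function names and the toy vectors are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; ignores vector magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: a zero vector is maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance; sensitive to vector magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest(query, embeddings, metric):
    """Return the id whose embedding is farthest from query under metric."""
    return max(embeddings, key=lambda k: metric(query, embeddings[k]))


if __name__ == "__main__":
    # Toy embeddings: "y" points the same way as "x" but is 10x longer.
    embeddings = {"x": [1.0, 1.0], "y": [10.0, 10.0], "z": [1.0, 0.0]}
    query = [2.0, 2.0]
    print("cosine farthest:   ", farthest(query, embeddings, cosine_distance))
    print("euclidean farthest:", farthest(query, embeddings, euclidean_distance))
```

Cosine distance treats "x" and "y" as equally close to the query because they share its direction, while Euclidean distance penalizes "y" for its larger magnitude. Whether magnitude carries meaning depends on how the autoencoder's embeddings are scaled.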

replies(2): >>42072665 #>>42133823 #
1. mahin ◴[] No.42133823[source]
Yes. Your project sounds cool, post it!
replies(1): >>42143009 #
2. jzombie ◴[] No.42143009[source]
I just responded to an adjacent query with the info.

https://news.ycombinator.com/threads?id=jzombie#42072665