
304 points by nimbusega | 1 comment

nimbusega
I made this to experiment with embeddings and explore how different ways of displaying information affect your perception.

It gets the top 100 stories, sends their HTML to GPT-4 to extract the main content (plain HTML parsing wasn't producing good enough results), and then computes an embedding from the title and content.
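A minimal sketch of that pipeline in Python, assuming the official OpenAI SDK and the public HN Firebase API; the model names, extraction prompt, and helper names here are illustrative, not the author's actual code:

```python
# Sketch: fetch top HN stories, extract content with GPT-4, embed title+content.
import requests
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def top_story_ids(n=100):
    # The public HN Firebase API returns up to 500 top story IDs.
    ids = requests.get("https://hacker-news.firebaseio.com/v0/topstories.json").json()
    return ids[:n]

def extract_main_content(html: str) -> str:
    # Ask the model to strip boilerplate and return only the article text.
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Extract the main article text from this HTML."},
            {"role": "user", "content": html[:50_000]},  # stay under context limits
        ],
    )
    return resp.choices[0].message.content

def embed(title: str, content: str) -> list[float]:
    resp = client.embeddings.create(
        model="text-embedding-3-small",
        input=f"{title}\n\n{content}",
    )
    return resp.data[0].embedding
```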

Likes/dislikes are stored in local storage and compared against all stories using cosine similarity to find the most relevant stories.
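The site presumably does this client-side in JavaScript against vectors cached alongside local storage; a Python sketch of one plausible scoring scheme (likes pull a story up, dislikes push it down, both assumptions on my part):

```python
# Sketch: rank stories by cosine similarity to liked/disliked story embeddings.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def score(story_vec, liked_vecs, disliked_vecs):
    # Nearest like raises the score; nearest dislike lowers it.
    pos = max((cosine(story_vec, v) for v in liked_vecs), default=0.0)
    neg = max((cosine(story_vec, v) for v in disliked_vecs), default=0.0)
    return pos - neg

# ranked = sorted(stories, key=lambda s: score(s.vec, likes, dislikes), reverse=True)
```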

It costs about $10/day to run. I was thinking of offering additional value for a small subscription. Maybe more pages of the newspaper, full story content/comments, a weekly digest or ePub export or something?

jzombie
> Likes/dislikes are stored in local storage and compared against all stories using cosine similarity to find the most relevant stories.

You're referring to using the embeddings for cosine similarity?

I am doing something similar with stocks: taking several decades' worth of 10-Q statements for a majority of stocks, along with weighted ETF holdings, and using an autoencoder to generate embeddings that I run cosine and Euclidean similarity computations on via Rust WASM.
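The commenter's implementation is Rust compiled to WASM; as a sketch of the core idea, here is a tiny PyTorch autoencoder whose bottleneck serves as the embedding (the layer sizes and feature count are assumptions, not the commenter's design):

```python
# Sketch: autoencoder over numeric 10-Q features; the bottleneck is the embedding.
import torch
import torch.nn as nn

class AE(nn.Module):
    def __init__(self, n_features: int, dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 128), nn.ReLU(),
                                     nn.Linear(128, dim))
        self.decoder = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(),
                                     nn.Linear(128, n_features))

    def forward(self, x):
        z = self.encoder(x)          # bottleneck = the embedding
        return self.decoder(z), z

model = AE(n_features=64)            # e.g. 64 numeric fields per 10-Q filing
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(batch: torch.Tensor) -> float:
    opt.zero_grad()
    recon, _ = model(batch)
    loss = loss_fn(recon, batch)     # reconstruction objective
    loss.backward()
    opt.step()
    return loss.item()

# After training, model.encoder(x) yields vectors to compare with cosine
# similarity or Euclidean distance, as in the snippet above.
```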

tiborsaas
> I am doing something similar with stocks.

How well does it work?