Interview with gwern

(www.dwarkeshpatel.com)
308 points synthmeat | 2 comments
nutanc ◴[] No.42134935[source]
I'm experimenting with creating semantic chunks of long podcasts. I got the following chunks: https://gist.github.com/nutanc/a9e6321649be5ea9806b4450b0bd6...

Dwarkesh has 18 splits. https://www.dwarkeshpatel.com/i/151435243/timestamps

I got 171, so roughly 9 to 10 distinct discussion contexts per timestamp (171 / 18 = 9.5).

replies(1): >>42140515 #
maujim ◴[] No.42140515[source]
What did you use to create the chunks?
replies(1): >>42144308 #
1. nutanc ◴[] No.42144308[source]
It's a new approach I am experimenting with.

https://gpt3experiments.substack.com/p/a-new-chunking-approa...

replies(1): >>42149835 #
2. maujim ◴[] No.42149835[source]
Interesting approach. Have you tried using existing clustering algorithms to determine the chunks instead of using distance or slope as a measure?
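
For readers unfamiliar with what "distance as a measure" could mean here, below is a minimal sketch of one common variant: start a new chunk whenever the cosine distance between consecutive sentence embeddings exceeds a threshold. This is not necessarily the method from the linked post. The function names, the `threshold` value, and the toy 2-D "embeddings" are all assumptions made up for illustration; real embeddings would come from a sentence-embedding model.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat a zero vector as maximally dissimilar.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def chunk_by_distance(embeddings, threshold=0.5):
    """Group consecutive sentence indices into chunks.

    A new chunk starts whenever the distance between a sentence and
    the one before it is greater than `threshold` (hypothetical value).
    """
    if not embeddings:
        return []
    chunks = [[0]]
    for i in range(1, len(embeddings)):
        if cosine_distance(embeddings[i - 1], embeddings[i]) > threshold:
            chunks.append([i])       # large jump: topic shift, new chunk
        else:
            chunks[-1].append(i)     # small jump: same chunk continues
    return chunks


# Toy data: two sentences pointing one way, then two pointing another.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(chunk_by_distance(toy_embeddings))
```

A clustering-based alternative, as suggested above, would group sentences by overall similarity rather than by the jump between neighbours. It would probably need a contiguity constraint so that each chunk stays a continuous span of the transcript.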