WikiTok

(wikitok.vercel.app)
1459 points Group_B | 26 comments
1. xhrpost ◴[] No.42937042[source]
Wonder what it would take to add a simple algorithm to this. Part of what makes short media apps (dangerously) addictive is that they eventually learn what you like and feed you more of that. An app like this with such an algo could help with the stickiness (and presumably get us away from the other apps at least for a little bit). "Oh this person likes science stuff, let's feed them more, oh they specifically like stuff related to quantum mechanics, let's place a summary paragraph from a related page topic in there."
replies(9): >>42937154 #>>42937413 #>>42937430 #>>42937776 #>>42938084 #>>42938362 #>>42938741 #>>42942269 #>>42945319 #
2. aizk ◴[] No.42937154[source]
On one hand, I am thinking about what a very basic algorithm would look like (maybe even just categories, which I might do) and how it might make people happy.

On the other hand, I'm not sure exactly the details of wikipedia's api TOS. Also, as it stands, this website is entirely in the frontend, and I'm enjoying just scaffolding out what I can with a more limited set of tools, so to speak.

I realize now the suffix "tok" implies a crazy ML algo that is trained on every single movement, click, tap, and pause you make, but I don't think I really want that.

replies(6): >>42937849 #>>42938418 #>>42939101 #>>42939505 #>>42941131 #>>42949615 #
3. easterncalculus ◴[] No.42937413[source]
That's what I was thinking this might have already. Maybe this could get insights from the articles linked from the ones you like too? Sort of like https://www.sixdegreesofwikipedia.com/
4. marci ◴[] No.42937430[source]
RHAAS

Rabbit-holing as a service

replies(1): >>42938131 #
5. TZubiri ◴[] No.42937776[source]
The relatedness of articles is already baked in with blue wiki links too. So it shouldn't be too hard to make something that just looks for neighbors.

Now, something that learns that if you like X you might like Y, even if they are disconnected, is closer to the dystopian ad-maximizing algorithm of TikTok et al.

6. codingdave ◴[] No.42937849[source]
It should be possible to keep this all front-end, even with some basic algorithm for the searches - just use localStorage. That keeps things simple and resolves privacy concerns, as people own their data and can delete it any time.
7. ya1sec ◴[] No.42938084[source]
it starts with sourcing - finding a massive set of interesting pages, then going through and giving them tags. planning on adding this to my web discovery app as well: https://moonjump.app/
8. belinder ◴[] No.42938131[source]
tvtropes did it first
replies(2): >>42940266 #>>42946011 #
9. mvieira38 ◴[] No.42938362[source]
This would eventually collapse to people reading articles they do not actually like (i.e. get happiness from reading), I think, maybe tragic history facts or something like that? The truth of social media harm is that it's more about humans than the algorithms themselves. Humans just tend to engage more with negative emotions. Even IRL we tend to look for intrigue and negative interactions, just look at the people who stay with toxic partners even with no financial ties, or even friend groups who turn into dysfunctional gossip fests. The only way to avoid this is by actively fighting against this tendency, and having no algorithm at all in an application helps.
10. keerthiko ◴[] No.42938418[source]
browser-store and cookies, among other tools, provide nice front-end-only persistent storage for holding things like recommendation weights/scoring matrices. maybe a simple algorithm that can evaluate down from a few bytes stored in weights might be all the more elegant.
11. aDyslecticCrow ◴[] No.42938741[source]
For each 10 seconds of reading, increment the tags on the current article as "favoured". Then, poll randomly from those tags for the next recommended article. Add some logarithms of division to prevent the tags from infinite scaling.
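
A minimal sketch of the counting and polling idea above, in Python for illustration only. The names `TagProfile`, `record_reading` and `pick_tag` are hypothetical, and the baseline weight of 1 per tag is an added assumption so that topics you have never read can still come up. Reading "poll randomly from those tags" as weighted sampling is also an interpretation, not something the comment spells out.

```python
import random
from collections import Counter


class TagProfile:
    """Hypothetical per-reader profile: a 'favoured' counter per tag."""

    def __init__(self, seconds_per_point=10):
        self.seconds_per_point = seconds_per_point
        self.counts = Counter()

    def record_reading(self, tags, seconds_read):
        """Add one point per full 10-second block to every tag on the article."""
        points = int(seconds_read // self.seconds_per_point)
        for tag in tags:
            self.counts[tag] += points

    def pick_tag(self, all_tags, rng=random):
        """Pick the next tag at random, weighted by its favoured count.

        Assumption: every tag gets a baseline weight of 1, so tags with a
        count of zero are still reachable.
        """
        all_tags = list(all_tags)
        weights = [1 + self.counts[tag] for tag in all_tags]
        return rng.choices(all_tags, weights=weights, k=1)[0]


profile = TagProfile()
profile.record_reading(["physics", "history"], seconds_read=35)  # 3 points each
profile.record_reading(["physics"], seconds_read=9)              # under 10s, 0 points
next_tag = profile.pick_tag(["physics", "history", "art"])
```

The counters here grow without limit, which is the scaling problem the last sentence of the comment is pointing at.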
replies(2): >>42938998 #>>42941537 #
12. epolanski ◴[] No.42938998[source]
Can you tell that to the YouTube reels engineers? Because their algo is a disaster where I'm only fed Sopranos and NBA content. I don't hate it, but god, I have so many subscriptions (civil aviation, personal finance, etc.) that I never ever see on my feed.
13. layman51 ◴[] No.42939101[source]
About the “Tok” suffix, I also think that while it has the algorithm connotations, it also has been used a lot to describe communities that have formed on TikTok. For example, BookTok (where some bookstores have started to pay attention to how people on TikTok can make some books popular again seemingly on a whim) or WitchTok.
replies(1): >>42946442 #
14. aizk ◴[] No.42939505[source]
Update - I chatted with some devs at wikipedia, and they confirmed I'm not hitting their servers hard, which is great.
replies(1): >>42939612 #
15. Aeolun ◴[] No.42939612{3}[source]
Compared to default wikipedia traffic this should be a drop in a bucket right?
replies(2): >>42940207 #>>42943855 #
16. aizk ◴[] No.42940207{4}[source]
Probably? I have no frame of reference, I've never done giant distributed systems before. I just noticed that an earlier version had some slowdowns, but I think I was just improperly fetching the images ahead of time.
17. aizk ◴[] No.42940266{3}[source]
Oh I just looked, sadly tv tropes doesn't have an API. I'd love to work off their data but that would be a bit more involved.
18. valec ◴[] No.42941131[source]
keep user profiles maybe with cookies or by encouraging sign-ups and then use NMF

https://en.wikipedia.org/wiki/Non-negative_matrix_factorizat...

19. istjohn ◴[] No.42941537[source]
Do you mind expanding on the last sentence?
replies(1): >>42983165 #
20. tbossanova ◴[] No.42942269[source]
I would prefer at least an option to keep it on random mode. Both for the occasional exposure to cool stuff and to make it less rabbit-holey.
21. ◴[] No.42943855{4}[source]
22. hummuscience ◴[] No.42945319[source]
Since it's text, especially text with links to other articles, there is no need for tags.

If I had a clue how to do this (sorry, just a neuroscientist), I would probably create "communities" of pages on a network graph and weight the traversal across the graph network based on pages that the person liked (or spent X time on before).
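
A rough sketch of the weighted-traversal part of this idea, in Python. It does not do real community detection: as a stand-in, a neighbouring page scores higher when it links to pages the reader liked. The toy `LINKS` graph, the `next_page` function and the `boost` value are all hypothetical.

```python
import random

# Hypothetical toy link graph: page title -> titles it links to.
LINKS = {
    "Quantum mechanics": ["Wave function", "Niels Bohr", "Probability"],
    "Wave function": ["Quantum mechanics", "Probability"],
    "Niels Bohr": ["Quantum mechanics", "Copenhagen"],
    "Probability": ["Wave function"],
    "Copenhagen": ["Niels Bohr"],
}


def next_page(current, links, liked, rng=random, boost=5.0):
    """Step from `current` to one of its linked pages, chosen at random.

    `liked` maps a title to a like count (or a time-spent score). A
    neighbour's weight is 1 plus `boost` times the total liked score of the
    pages that neighbour links to. `boost` is an assumed tuning knob.
    """
    neighbours = links.get(current, [])
    if not neighbours:
        return None
    weights = [
        1.0 + boost * sum(liked.get(target, 0) for target in links.get(title, []))
        for title in neighbours
    ]
    return rng.choices(neighbours, weights=weights, k=1)[0]


liked = {"Probability": 3}
page = next_page("Quantum mechanics", LINKS, liked)
```

With these toy numbers, "Wave function" gets weight 16 while the other two neighbours get weight 1, so the walk drifts toward the area around what was liked while still allowing detours.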

23. marci ◴[] No.42946011{3}[source]
Where? I thought it was just the wikipedia of tv tropes.
24. l3x4ur1n ◴[] No.42946442{3}[source]
StickTok, where people show cool sticks they found in nature!
25. bawolff ◴[] No.42949615[source]
> On the other hand, I'm not sure exactly the details of wikipedia's api TOS

https://www.mediawiki.org/wiki/API:Etiquette

You are basically allowed to do whatever you like as long as it doesn't cause an operational issue, you don't have too many requests in flight at one time, and you put contact info in the User-Agent or Api-User-Agent header. (Adding a unique Api-User-Agent header is probably the most important requirement, since if it does cause problems it lets the operations team easily see what is happening.)

I think the WikiTok thing is exactly the sort of thing Wikimedia folks hope people will use the API to create.
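
For illustration, here is roughly what an identifying request could look like from a Python script. This is a sketch, not WikiTok's actual code: the `USER_AGENT` string and the `build_random_extract_request` helper are made up, and a script can set `User-Agent` directly, whereas browser code would send `Api-User-Agent` as described above. The query parameters (`generator=random`, `prop=extracts`) are standard MediaWiki API options.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_URL = "https://en.wikipedia.org/w/api.php"

# Hypothetical identity string: tool name, version, and a way to reach you.
USER_AGENT = "ExampleFeed/0.1 (https://example.org/contact; feed@example.org)"


def build_random_extract_request(limit=5):
    """Build (but do not send) a request for intro text of random articles."""
    params = {
        "action": "query",
        "format": "json",
        "generator": "random",
        "grnnamespace": 0,   # main article namespace only
        "grnlimit": limit,   # a small batch per request
        "prop": "extracts",
        "exintro": 1,        # only the lead section
        "explaintext": 1,    # plain text instead of HTML
    }
    url = f"{API_URL}?{urlencode(params)}"
    return Request(url, headers={"User-Agent": USER_AGENT})


request = build_random_extract_request()
```

Sending these one at a time with `urllib.request.urlopen(request)`, rather than firing many in parallel, is one simple way to stay under the "not too many requests in flight" guideline.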

26. aDyslecticCrow ◴[] No.42983165{3}[source]
I misspelt the last sentence a bit. I meant division "or" logarithm.

Basically, we have an unbounded counter that is gonna start breaking things. So we need to normalize it to a percentage score (by dividing it by the total favoured count across all tags), or pass it through a logarithm to bound it.

This approach only works if all content is accurately tagged, which works basically nowhere on the internet except Wikipedia.
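
To make the two options concrete, here is a small Python sketch with made-up counts. The function names `normalize` and `log_damp` are hypothetical. Note that dividing by the total gives scores that always sum to 1, while the logarithm slows the growth rather than strictly capping it.

```python
import math


def normalize(counts):
    """Divide each count by the total, so scores are fractions summing to 1."""
    total = sum(counts.values())
    if total == 0:
        return {tag: 0.0 for tag in counts}
    return {tag: n / total for tag, n in counts.items()}


def log_damp(counts):
    """Pass each count through log(1 + n), so large counts grow very slowly."""
    return {tag: math.log1p(n) for tag, n in counts.items()}


# Made-up favoured counts for one reader.
counts = {"physics": 90, "history": 10}
shares = normalize(counts)   # physics gets 0.9 of the weight, history 0.1
damped = log_damp(counts)    # the 9:1 gap shrinks to under 2:1
```

Which one fits better depends on the goal: the percentage keeps the reader's strongest interest dominant, while the logarithm flattens the gap so smaller interests keep showing up.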