(www.thealgorithmicbridge.com)

1005 points vinhnx | 3 comments | 12 Apr 25 03:58 UTC | HN request time: 0.767s | source

1. paradite ◴[12 Apr 25 08:50 UTC] No.43662612[source]▶

The author mentioned AlphaGo and Alpha Zero without mentioning OpenAI gym and OpenAI Five.

Those products show OpenAI was innovating and leading in RL at that stage around 2017 to 2019.

replies(1): >>43675355 #

2. bitpush ◴[13 Apr 25 19:51 UTC] No.43675355[source]▶

This is the first I'm hearing about it.

replies(1): >>43676937 #

3. paradite ◴[14 Apr 25 00:24 UTC] No.43676937[source]▶

I forgot to mention that OpenAI also invented PPO, which is the default algorithm that everyone uses for RL since 2017: https://en.wikipedia.org/wiki/Proximal_policy_optimization

DeepSeek's GRPO is also just a minor variant of PPO.

Google is winning on every AI front