Here’s a few problems I foresee:
1. People get lazy when presented with four choices they had no hand in creating, and they don’t look over the four and just click one, ignoring the others. Why? Because they have ten more of these on the go at once, diminishing their overall focus.
2. Automated tests, end-to-end sim., linting, etc—tools already exist and work at scale. They should be robust and THOROUGHLY reviewed by both AI and humans ideally.
3. AI is good for code reviews and “another set of eyes” but man it makes serious mistakes sometimes.
An anecdote for (1), when ChatGPT tries to A/B test me with two answers, it’s incredibly burdensome for me to read twice virtually the same thing with minimal differences.
Code reviewing four things that do almost the same thing is more of a burden than writing the same thing once myself.