AI should only run as fast as we can catch up

> He would just spot-check the correctness of AI’s work and quickly spin up local deployments to verify it’s indeed working.

I'm not sure exactly how he got the project done, but "spot-check" and "quickly spin up local deployments to verify" make me somewhat uncomfortable.

For me, it's either unit tests that hit 100% coverage or, when unit tests are inapplicable, a line-by-line, letter-by-letter verification. Otherwise, your "spot-check" doesn't mean shit to me.
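
To make the difference concrete, here is a minimal sketch in Python. The `clamp` function and both checks are hypothetical, made up purely for illustration and not taken from the project quoted above. The spot-check runs one happy-path input, while the full check exercises every branch.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the inclusive range [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    if value < low:
        return low
    if value > high:
        return high
    return value


def spot_check() -> bool:
    # A spot-check: one happy-path input; the three early exits never run.
    return clamp(5, 0, 10) == 5


def full_branch_check() -> bool:
    # One case per branch, so every line of clamp executes at least once.
    try:
        clamp(5, 10, 0)
    except ValueError:
        raised = True
    else:
        raised = False
    return (
        raised
        and clamp(-3, 0, 10) == 0
        and clamp(42, 0, 10) == 10
        and clamp(5, 0, 10) == 5
    )
```

Both checks pass on this code, which is exactly the problem: a passing spot-check says nothing about the `raise` or the two clamping returns, because it never executes them.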