AlphaGo got better by playing against itself. I wonder if the pathway forward here is essentially to do the same with coding. Feed it some arbitrary SRS (software requirements specification) documents and have it attempt to build them, including tests with full code coverage. Have it also take on the roles of QA, stakeholders, red-team security researchers, and users, all aggressively trying to find edge cases and point out everything wrong with the application. Have it keep iterating and learning from the findings. Keep feeding it novel SRSs until the number of attempts/iterations needed to get a quality product out the other side drops to some acceptable number.
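To make the loop concrete, here is a rough toy simulation in Python of what I have in mind. Nothing here is a real training setup: `ToyModel`, `build_from_srs`, `train`, the single `skill` number, and the "three latent complaints per role" are all made-up stand-ins, assumed purely for illustration. The only point is the control flow: attempt, get critiqued by every role, learn, repeat, and stop feeding new SRSs once the average iteration count is low enough.

```python
import random
from dataclasses import dataclass

# Critic roles named in the comment above.
ROLES = ("qa", "stakeholder", "red_team", "user")


@dataclass
class ToyModel:
    """Hypothetical stand-in for the coding model."""
    skill: float = 0.2           # chance of resolving any one defect per attempt
    learning_rate: float = 0.01  # assumed skill gain per reported finding

    def attempt(self, open_defects, rng):
        """Return the defects still present after one development attempt."""
        return [d for d in open_defects if rng.random() > self.skill]

    def learn(self, findings):
        """Nudge skill upward for every finding the critics reported."""
        self.skill = min(0.99, self.skill + self.learning_rate * len(findings))


def build_from_srs(model, srs_id, rng, max_iterations=50):
    """Iterate on one SRS until no role reports a problem; return iterations used."""
    # Assumption: every role starts with three latent complaints per SRS.
    open_defects = [(srs_id, role, n) for role in ROLES for n in range(3)]
    for iteration in range(1, max_iterations + 1):
        open_defects = model.attempt(open_defects, rng)
        findings = open_defects  # critics report everything still wrong
        if not findings:
            return iteration
        model.learn(findings)
    return max_iterations


def train(target_iterations=2, window=5, max_specs=500, seed=0):
    """Feed new SRSs until the rolling average iteration count hits the target."""
    rng = random.Random(seed)
    model = ToyModel()
    history = []
    for srs_id in range(max_specs):
        history.append(build_from_srs(model, srs_id, rng))
        recent = history[-window:]
        if len(recent) == window and sum(recent) / window <= target_iterations:
            break
    return history


if __name__ == "__main__":
    # Iterations needed per SRS, in the order the specs were fed in.
    print(train())
```

In a real version the hard parts are exactly what this toy hides: what "learn from the findings" means for the model, and how to keep the critic roles honest when they are played by the same model.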