For this to "work", you need a metric showing that AIs perform as well, or nearly as well, with the compressed documentation as with the uncompressed documentation across a wide range of tasks.
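To make the kind of metric I mean concrete, here is a rough sketch. Everything in it is hypothetical (the `TaskResult` shape, the `retention` and `regressions` helpers, the sample data); it just assumes you can run the same task suite once with the full docs and once with the compressed docs and record pass/fail for each.

```typescript
// Hypothetical record of one task run under both documentation variants.
interface TaskResult {
  task: string;
  passedWithFullDocs: boolean;
  passedWithCompressedDocs: boolean;
}

// Share of the full-docs pass count that the compressed docs retain.
// Returns NaN when nothing passed with the full docs (no baseline).
function retention(results: TaskResult[]): number {
  const full = results.filter(r => r.passedWithFullDocs).length;
  const compressed = results.filter(r => r.passedWithCompressedDocs).length;
  return full === 0 ? NaN : compressed / full;
}

// Tasks that worked with the full docs but broke with the compressed ones.
function regressions(results: TaskResult[]): string[] {
  return results
    .filter(r => r.passedWithFullDocs && !r.passedWithCompressedDocs)
    .map(r => r.task);
}

// Made-up sample data, for illustration only.
const sample: TaskResult[] = [
  { task: "routing", passedWithFullDocs: true, passedWithCompressedDocs: true },
  { task: "storage", passedWithFullDocs: true, passedWithCompressedDocs: true },
  { task: "alarms", passedWithFullDocs: true, passedWithCompressedDocs: false },
  { task: "websockets", passedWithFullDocs: true, passedWithCompressedDocs: true },
  { task: "exotic-edge-case", passedWithFullDocs: false, passedWithCompressedDocs: false },
];

console.log(retention(sample), regressions(sample));
```

The list of regressions matters as much as the ratio: a single number can look fine while hiding exactly the sort of gotcha that breaks a design.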
To cherry-pick a tiny example: this wouldn't capture the fact that a Cloudflare Durable Object can only have one alarm at a time, and each new set overwrites the old one. The model will happily architect something around a single object, expecting to be able to set a bunch of alarms on it. Maybe I'm wrong and this tool would capture that correctly in a description. But this is just one small example.
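Here is a minimal sketch of the gotcha and one possible way around it. To keep it self-contained, `SingleAlarmSlot` is an in-memory stand-in I made up that models only the "one alarm, each set overwrites" behavior; it is not the Cloudflare API. `AlarmMultiplexer` and `Job` are likewise hypothetical names, not something from the docs.

```typescript
// In-memory stand-in for a Durable Object's single alarm slot.
// NOT the Cloudflare API: it only models "one alarm, each set overwrites".
class SingleAlarmSlot {
  private scheduledAt: number | null = null;

  setAlarm(timeMs: number): void {
    this.scheduledAt = timeMs; // silently replaces any earlier alarm
  }

  deleteAlarm(): void {
    this.scheduledAt = null;
  }

  getAlarm(): number | null {
    return this.scheduledAt;
  }
}

// Hypothetical job record kept in our own bookkeeping.
interface Job {
  id: string;
  dueAt: number;
}

// Hypothetical workaround: track every pending job ourselves and keep the
// single alarm pointed at the earliest one.
class AlarmMultiplexer {
  private jobs: Job[] = [];
  private slot: SingleAlarmSlot;

  constructor(slot: SingleAlarmSlot) {
    this.slot = slot;
  }

  schedule(job: Job): void {
    this.jobs.push(job);
    this.rearm();
  }

  // Call this when the single alarm fires: returns the ids of the due jobs,
  // drops them from the pending list, then re-arms for the next one.
  fire(nowMs: number): string[] {
    const due = this.jobs.filter(j => j.dueAt <= nowMs).map(j => j.id);
    this.jobs = this.jobs.filter(j => j.dueAt > nowMs);
    this.rearm();
    return due;
  }

  private rearm(): void {
    if (this.jobs.length === 0) {
      this.slot.deleteAlarm();
      return;
    }
    this.slot.setAlarm(Math.min(...this.jobs.map(j => j.dueAt)));
  }
}

const slot = new SingleAlarmSlot();
const mux = new AlarmMultiplexer(slot);
mux.schedule({ id: "a", dueAt: 5000 });
mux.schedule({ id: "b", dueAt: 1000 });
// The slot now points at 1000, the earliest pending job.
```

Calling `setAlarm` twice directly on the slot would leave only the second time in place, which is the design mistake a model is likely to make if the compressed docs drop this detail.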
For much of a framework or library, maybe this works. But I feel like, for this to be most effective, the proposed spec possibly needs an update to include a little more context.
I hope this matures and works well. And there's nothing stopping me from filling in gaps with additional docs, so I'll be giving it a shot.