You're really bringing up false dichotomies all the time.
Nobody ever argued that it was about the programming languages or equal implementations, but about project stewardship and diverging code bases. They influence each other, it's not only about one or the other.
I don't know where you get your weird 100% bug compatibility idea from; that's literally not how anything is handled anywhere.
This is also orthogonal to specs: you can have specs that completely dictate behaviour (like CORBA) or that are super loose in what they allow (ANSI C).
There are not only reference documents but also reference implementations. As projects grow it's ok to diverge from them and find common ground in other documents like specs.
Sometimes they cover reasonable behaviour so well that they can work as an alternative to a specification, like SQLite and SQLJet (https://sqljet.com/). That doesn't mean that they'll never change; SQLite regularly has bugs discovered and fixed. If the SQLite devs don't even adhere to your assumed "aLL bUgS aND BEhAViOUrs aRE SAcrEd AnD MUsT Be KEpT InDeFInitElY" philosophy, why would anybody else?
You talk as if there were some kind of weird, rigid, black-and-white process involved with these complex projects: either a good base implementation with no spec ever in the future and 100% backwards compatibility, or waterfall spec development followed by implementations that asymptotically approach the spec.
Where there's a will there's a way. These projects and documents are all about people and the ways they collaborate and work. It's not as rigid as you make it out to be.
>When I come across a bug how do I know whether it's a bug or something someone forgot to document properly? What if that bug isn't present in another implementation?
You do what you currently do. You go to a place where the people that steward the project reside and you ask.
Why and how do you think specs get revised? They contain ambiguities, bugs, and unspecified behaviour. Somebody stumbles upon it, and asks a question.
>What about when someone comes along and suggests something that would work really well but it turns out that nobody else with a different code design could reasonably implement it?
You'd do what you currently do. You talk about it, and in the end you might even write it down somewhere, in a spec, in an rfc, in a piece of documentation.
You seem to think that SQLite would stay the reference implementation forever, which is simply not true. It's a good starting point, yeah. But WebKit didn't stay the reference implementation either, nor did Netscape.
Don't be so rigid.