As a result, you'll never get 100% consistent outputs or behavior (the way you can, at least in principle, with a traditional algorithm or hand-written business logic). And that has borne out in my usage across every model I've worked with.
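To make the contrast concrete, here's a minimal sketch (the function names and the toy probability distribution are illustrative, not any real model's API): deterministic business logic returns the same answer every time, while LLM-style generation samples each next token from a probability distribution, so two identical prompts can diverge at any step whenever the temperature is above zero.

```python
import random

def traditional_logic(order_total: float) -> float:
    # Classic business logic: the same input always yields the same output.
    return order_total * 0.9 if order_total > 100 else order_total

def llm_style_next_token(probabilities: dict[str, float], temperature: float = 0.8) -> str:
    # LLM-style decoding: sample the next token from a distribution.
    # Higher temperature flattens the weights and increases variability.
    weights = [p ** (1.0 / temperature) for p in probabilities.values()]
    return random.choices(list(probabilities.keys()), weights=weights)[0]

probs = {"refactor": 0.5, "rewrite": 0.3, "delete": 0.2}
print(traditional_logic(120.0))                           # always 108.0
print([llm_style_next_token(probs) for _ in range(5)])    # varies run to run
```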
There's also an upper-bound problem with context: every LLM eventually hits a point where it "loses focus" and develops a sort of LLM ADD. That's when hallucinations and random, unrequested changes start creeping in, and a previously productive chat spirals to the point where you have to start over.