The obvious question that never gets answered is how it defends against prompt injection. If customers can use prompt injection to make Claudius do something it shouldn't, it's not usable in the real world. What good is an agent that can be convinced to actually order 1000 tungsten cubes?