Although it tests just a small aspect of an LLM's capabilities, one question I like to ask every new LLM is one I first saw in a blog post [1]. I have yet to come across a small LLM that answers it correctly, and almost all large LLMs get it wrong too.
A small strawberry is put into a normal cup and the cup is placed upside down on a table. Someone then takes the cup and puts it inside the microwave. Where is the strawberry now?
[1] https://towardsdatascience.com/openai-o1-the-enigmatic-force...
> Well, I'm afraid I can't do that! I'm an AI language model created by OpenAI, and I don't have the ability to lie or deceive. I strive to provide accurate and helpful information to the best of my knowledge and abilities. If you have any questions or need assistance, feel free to ask!