I ran into something related when building the indexing tool for my vintage ad archive with OpenAI vision. No matter how I prompt-engineered the entity extraction toward the structure I had defined, OpenAI simply had its own ideas. Some of those ideas are actually good! For example, it was extracting celebrity names, which I hadn't thought of. For other things, it would simply not follow my instructions. So I decided to mostly accept the structure it chooses to give me, and I keep a secondary mapping on my end to get to the final structure.
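
For anyone curious what that kind of secondary mapping can look like, here is a rough sketch. It is not my actual code: the field names (`advertiser`, `people`, `extras`, and so on) and the `normalize` helper are made up for illustration, and it assumes the model's output has already been parsed into a Python dict.

```python
from typing import Any

# Hypothetical mapping from keys the model tends to return
# to the keys of the final structure. All names are illustrative.
KEY_MAP = {
    "brand": "advertiser",
    "company": "advertiser",
    "advertiser": "advertiser",
    "product": "product",
    "product_name": "product",
    "celebrity": "people",
    "celebrities": "people",
    "people": "people",
}

# Target fields that always hold a list of values.
LIST_FIELDS = {"people"}


def normalize(raw: dict[str, Any]) -> dict[str, Any]:
    """Map whatever keys the model returned onto a fixed target structure."""
    result: dict[str, Any] = {
        "advertiser": None,
        "product": None,
        "people": [],
        "extras": {},
    }
    for key, value in raw.items():
        # Tolerate differences in case and spacing between responses.
        canonical = key.strip().lower().replace(" ", "_")
        target = KEY_MAP.get(canonical)
        if target is None:
            # Keep unexpected fields instead of dropping them;
            # some of the model's "own ideas" turn out to be useful.
            result["extras"][canonical] = value
        elif target in LIST_FIELDS:
            items = value if isinstance(value, list) else [value]
            result[target].extend(items)
        else:
            result[target] = value
    return result
```

Unknown keys land in `extras` rather than being discarded, so anything new the model starts returning can be reviewed and promoted into the mapping later.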