Would love to contribute. I've made a fork and will try to raise a PR if contributions are welcome.
Question: how are you testing this? Testing on dummy data is a bit too easy. These models, even 4o, falter on anything really domain-specific. For example, I work with supply chain data, and our column names only make sense to me and my team; they wouldn't make any sense to an LLM unless it somehow knows what those columns mean.
replies(1):