https://www.gbif.org/news/6aw2VFiEHYlqb48w86uKSf/chatipt-sys...
It's still in beta.
Press release:
Rukaya Johaadien's chatbot provides conversation-style support to students and researchers who hold biodiversity data but are first-time or infrequent data publishers. Its prompts guide users as it cleans and standardizes spreadsheets, creates basic metadata, and publishes well-structured datasets on GBIF.org as a Darwin Core Archive.
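To give a rough idea of what "cleaning and standardizing" a spreadsheet can involve, here is a minimal Python sketch that renames messy column headers to Darwin Core terms and trims stray whitespace. It is not ChatIPT's actual code: the `HEADER_TO_DWC` mapping, the `standardize` function, and the sample data are hypothetical and exist only for illustration. The Darwin Core term names themselves (`scientificName`, `eventDate`, and so on) are real.

```python
import csv
import io

# Hypothetical mapping from informal spreadsheet headers to Darwin Core terms.
# A real tool would need a much larger, context-aware mapping.
HEADER_TO_DWC = {
    "species": "scientificName",
    "date": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "collector": "recordedBy",
}

def standardize(raw_csv: str) -> list[dict[str, str]]:
    """Rename known columns to Darwin Core terms and strip whitespace.

    Columns with no known mapping are dropped in this sketch.
    """
    reader = csv.DictReader(io.StringIO(raw_csv))
    rows = []
    for row in reader:
        clean = {}
        for header, value in row.items():
            if header is None:
                # Extra cells beyond the header row; ignore them.
                continue
            term = HEADER_TO_DWC.get(header.strip().lower())
            if term is None:
                continue
            clean[term] = (value or "").strip()
        rows.append(clean)
    return rows

# Sample input with inconsistent spacing and one unmapped column.
raw = (
    "Species , Date,Lat,Lon,Notes\n"
    "Parus major,2023-05-01, 59.91 ,10.75,seen at feeder\n"
)
records = standardize(raw)
print(records)
```

Dropping unmapped columns keeps the sketch short; a real workflow would more likely ask the user what an unrecognized column means before discarding it.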
To date, publishing high-quality data from PhD and master's projects and other small-scale biodiversity research studies has been difficult to do at scale. Standardizing data typically requires knowledge of programming languages and data management techniques, as well as familiarity with specialist software.
Meanwhile, gaining access to an existing instance of the Integrated Publishing Toolkit (IPT), the GBIF network's workhorse application for data sharing, can test a novice's patience, because instances are run by node staff with limited time and resources. Training does little to overcome such logistical barriers, or others like language, when occasional users forget the precise steps and details from one year to the next.
"Data standardization is hard, and biologists don't become biologists because they like coding or Excel, so a lot of potentially valuable data falls by the wayside," said Johaadien. "Recognizing that large language models have gotten really good at generating code and working with data, I built an automated tool to guide non-technical users through routine questions and process their messy data as much as possible, then publish it quickly and automatically to GBIF."