- auto-detect and convert column types
- save each table as a parquet file in a folder
- then autogenerate a SQLAlchemy table/metadata file in Python for all tables, with sensible defaults for column types (e.g. 2x the longest string in a column for varchar)
- build the db and all tables
- load the data from the files into the tables (rough sketches of each step follow)
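
A minimal sketch of the first two steps with pandas (the folder names are made up, and writing parquet also needs pyarrow or fastparquet installed):

```python
from pathlib import Path

import pandas as pd

raw_dir = Path("raw_csvs")       # hypothetical source folder
parquet_dir = Path("parquet")    # hypothetical destination folder
parquet_dir.mkdir(exist_ok=True)

for csv_path in raw_dir.glob("*.csv"):
    df = pd.read_csv(csv_path)
    # convert_dtypes() upgrades inferred object columns to the best
    # available types (nullable ints, booleans, strings, ...)
    df = df.convert_dtypes()
    df.to_parquet(parquet_dir / f"{csv_path.stem}.parquet", index=False)
```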
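The codegen step boils down to a dtype-to-SQLAlchemy mapping plus the 2x-varchar heuristic. This is just an illustration of the idea: the function names are mine, and the real version would emit a Python definitions file rather than build the Table objects in memory.

```python
from pathlib import Path

import pandas as pd
import sqlalchemy as sa
from pandas.api import types as ptypes


def column_type(series: pd.Series) -> sa.types.TypeEngine:
    """Map a pandas column to a SQLAlchemy type."""
    if ptypes.is_bool_dtype(series):
        return sa.Boolean()
    if ptypes.is_integer_dtype(series):
        return sa.Integer()
    if ptypes.is_float_dtype(series):
        return sa.Float()
    if ptypes.is_datetime64_any_dtype(series):
        return sa.DateTime()
    # sensible default: varchar sized at 2x the longest string in the column
    lengths = series.dropna().astype(str).str.len()
    longest = int(lengths.max()) if len(lengths) else 0
    return sa.String(max(2 * longest, 1))


def table_from_parquet(path: Path, metadata: sa.MetaData) -> sa.Table:
    """One Table per parquet file, named after the file."""
    df = pd.read_parquet(path)
    cols = [sa.Column(name, column_type(df[name])) for name in df.columns]
    return sa.Table(path.stem, metadata, *cols)
```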
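And the last two steps, continuing the sketch above (it reuses `table_from_parquet`; the sqlite URL is just for dev):

```python
from pathlib import Path

import pandas as pd
import sqlalchemy as sa

engine = sa.create_engine("sqlite:///dev.db")  # swap the URL for prod
metadata = sa.MetaData()

tables = {
    p: table_from_parquet(p, metadata)
    for p in sorted(Path("parquet").glob("*.parquet"))
}
metadata.create_all(engine)  # build the db and all tables

for path, table in tables.items():
    # bulk-load each parquet file into its table
    pd.read_parquet(path).to_sql(
        table.name, engine, if_exists="append", index=False
    )
```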
This makes it really easy to bootstrap the entire db from a folder of parquet files for testing with SQLite, and then easy to move to prod on Postgres/SQL Server/etc. Before I go to prod I still have to add constraints, keys, and indexes, but that doesn't take too long, and for dev/testing the data's not too big, so performance doesn't really suffer from the missing keys/constraints. Then we can use something like Alembic on the big SQLAlchemy table-definitions file to do db migrations.
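
For the migration piece, wiring Alembic to the generated definitions file is mostly a matter of pointing `target_metadata` at it in `env.py`. A sketch, with an assumed module path:

```python
# alembic/env.py (excerpt): point autogenerate at the generated metadata.
# The module path `myapp.tables` is an assumption, not from the post.
from myapp.tables import metadata

target_metadata = metadata

# then, from the shell:
#   alembic revision --autogenerate -m "add keys, constraints, indexes"
#   alembic upgrade head
```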
It's kind of like https://github.com/agronholm/sqlacodegen, but solving the inverse problem: sqlacodegen reads an existing database and generates the SQLAlchemy model code, whereas this generates the database from the data files.
Basically it bootstraps the db and schemas and gets me like 95% of the way there. My quality of life is better with it.