245 points gatesn
the_mitsuhiko No.41840459
> One of the unique attributes of the (in-progress) Vortex file format is that it encodes the physical layout of the data within the file's footer. This allows the file format to be effectively self-describing and to evolve without breaking changes to the file format specification.

That is quite interesting. One general challenge with Parquet and Arrow in the OTel / observability ecosystem is that the shape of span data is not known up front: spans carry arbitrary attributes, and those attributes can change. To the best of my knowledge, no particularly great solution exists today for encoding this. I wonder to what degree this system could be "abused" for that.
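
To make the shape problem concrete, here is a minimal sketch in plain Python. The span data and both helper functions are hypothetical and made up for illustration; nothing here uses a Parquet, Arrow, or Vortex API. It just contrasts two common workarounds: one column per attribute key (the schema grows with every new key and is mostly nulls) versus a fixed key/value layout (stable schema, but value types are lost).

```python
# Hypothetical spans: attribute keys differ per span and can change over time.
spans = [
    {"name": "GET /users", "attributes": {"http.method": "GET", "http.status_code": 200}},
    {"name": "db.query", "attributes": {"db.system": "postgresql", "db.rows": 42}},
]


def to_wide_columns(spans):
    """One column per attribute key; the schema is the union of all keys seen."""
    keys = sorted({k for s in spans for k in s["attributes"]})
    # Spans that lack a key get None, so columns become sparse as keys multiply.
    return {k: [s["attributes"].get(k) for s in spans] for k in keys}


def to_key_value_rows(spans):
    """Fixed schema (span_index, key, value_as_string); value types are lost."""
    return [
        (i, k, str(v))
        for i, s in enumerate(spans)
        for k, v in sorted(s["attributes"].items())
    ]


wide = to_wide_columns(spans)
kv = to_key_value_rows(spans)
```

Neither layout is great: the wide one changes schema whenever a new attribute shows up, and the key/value one pushes typing and filtering work onto the reader.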

agoose77 No.41848634
For fun, the ROOT file format used in high energy physics has this kind of feature: https://root.cern.ch/root/SchemaEvolution.pdf

It's also a very old format, so not without its warts :)

amadio No.41851621
There is also a new format being developed for Run 4, RNTuple:

- https://indico.fnal.gov/event/23628/contributions/240607/

- https://indico.cern.ch/event/1338689/contributions/6077632/