
213 points shcheklein | 3 comments
causal No.41891218
Is this useful for large binaries?
replies(4): >>41891754 >>41892002 >>41892136 >>41895103
1. natsucks No.41892002
Would appreciate a good answer to this question. I deal with large medical imaging data (DICOM), and I cannot tell whether it's feasible or worth the effort.
replies(2): >>41892510 >>41895148
2. thangngoc89 No.41892510
It's very much feasible. I'm currently using DVC for DICOM; the repo has grown to about 5 TB of small .dcm files (less than 100 KB each). We use an NFS-mounted NAS for development, but DVC's cache needs to be on local NVMe storage, otherwise performance is terrible.
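The reason the cache location matters is that DVC deduplicates files by content hash: each file is stored once in the cache under its digest, and the working copy links back to it, so the cache sees many small random accesses that NVMe handles far better than NFS. A minimal stdlib sketch of that content-addressed layout (function name and paths are illustrative, not DVC's actual implementation):

```python
import hashlib
from pathlib import Path

def add_to_cache(src: Path, cache_dir: Path) -> str:
    """Store a copy of src under its content hash, DVC-cache style.

    The file lands at cache_dir/<first 2 hex chars>/<rest>, so identical
    files (common with small .dcm slices) are stored only once.
    """
    data = src.read_bytes()
    digest = hashlib.md5(data).hexdigest()  # DVC historically keys its cache on MD5
    dest = cache_dir / digest[:2] / digest[2:]
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():  # dedup: skip the write if this content is already cached
        dest.write_bytes(data)
    return digest
```

Calling this twice on the same file writes the content once and returns the same digest both times, which is why a repo of millions of near-duplicate DICOM slices stays manageable.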
3. tomnicholas1 No.41895148
You should look at Icechunk. Your imaging data is structured (it's a multidimensional array), so it should be possible to represent it as "Virtual Zarr". Then you could commit it to an Icechunk store.

https://earthmover.io/blog/icechunk
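The "multidimensional array" framing above is what makes Zarr-style storage work: the volume is split into independently addressable chunks, and a versioned store like Icechunk only has to track which chunk keys changed between commits. A toy NumPy sketch of that chunk layout (the key format mimics Zarr's `z.y.x` convention; the function name is illustrative, not Zarr's or Icechunk's actual API):

```python
import numpy as np

def to_chunks(volume: np.ndarray) -> dict[str, bytes]:
    """Split a 3D volume into one chunk per slice, keyed Zarr-style.

    Each key "i.0.0" addresses the chunk starting at slice i; a versioned
    chunk store can then commit or update slices independently.
    """
    return {f"{i}.0.0": volume[i].tobytes() for i in range(volume.shape[0])}
```

Editing one slice of the scan then dirties exactly one key, so a commit touches a few kilobytes instead of rewriting the whole multi-gigabyte volume.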