213 points shcheklein | 2 comments
jerednel ◴[] No.41889752[source]
It's not super clear to me how this interacts with data. If I am using ADLS to store Delta tables and I cannot pull prod data to my local machine, can I still use this? Is there a point if I can just look at the Delta log to switch between past versions?
replies(1): >>41889814 #
riedel ◴[] No.41889814[source]
DVC is (at least as I use it) pretty much just Git LFS with multiple backends (I guess it is actually a simpler git-annex). It also has some rather MLOps-specific stuff. It is handy if you do versioned model training with changing data on S3.
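
To make the "Git LFS with multiple backends" comparison concrete, here is a rough Python sketch of the general pointer-file idea: the large file goes into a content-addressed store, and only a tiny pointer file would be committed to git. This is an illustration under stated assumptions, not DVC's actual implementation or on-disk format. The names `add` and `checkout`, the `.ptr` suffix, the JSON pointer layout, and the use of SHA-256 are all hypothetical choices, and the "backend" is just a local directory standing in for something like S3.

```python
import hashlib
import json
import shutil
from pathlib import Path


def add(data_file: Path, store: Path) -> Path:
    """Copy data_file into a content-addressed store and write a pointer file.

    Only the returned pointer file would be tracked in git (hypothetical format).
    """
    digest = hashlib.sha256(data_file.read_bytes()).hexdigest()
    # Store layout is an assumption: first two hex chars as a subdirectory.
    target = store / digest[:2] / digest[2:]
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        shutil.copyfile(data_file, target)
    pointer = data_file.with_name(data_file.name + ".ptr")
    pointer.write_text(json.dumps({"sha256": digest, "path": data_file.name}))
    return pointer


def checkout(pointer: Path, store: Path) -> Path:
    """Restore the data file that a pointer file refers to."""
    meta = json.loads(pointer.read_text())
    digest = meta["sha256"]
    out = pointer.with_name(meta["path"])
    shutil.copyfile(store / digest[:2] / digest[2:], out)
    return out
```

Switching to an older data version then amounts to checking out an older pointer file from git and calling `checkout` on it; every version of the data stays in the store, keyed by its hash.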
replies(3): >>41890760 #>>41890767 #>>41890837 #
1. haensi ◴[] No.41890767[source]
There’s another thread from October 2022 on that topic.

https://news.ycombinator.com/item?id=33047634

What makes DVC especially useful for MLOps? Aren’t MLflow or W&B solving that in a way that’s open source (the former) or that just increases the speed and scale massively (the latter)?

Disclaimer: I work at W&B.

replies(1): >>41891199 #
2. riedel ◴[] No.41891199[source]
DVC is much more basic (feels more Unix-style) and integrates really well with any simple CI/CD scripting with git versioning, without the need to set up any additional servers.

And it is not either/or. People actually combine MLflow and DVC [0]

[0] https://data-ai.theodo.com/blog-technique/dvc-pipeline-runs-...