213 points shcheklein | 1 comments
dmpetrov No.41890616
Hi there! Maintainer and author here. Excited to see DVC on the front page!

Happy to answer any questions about DVC and our sister project DataChain https://github.com/iterative/datachain, which does data versioning with slightly different assumptions: no file copying and built-in data transformations.

replies(3): >>41890932 #>>41896923 #>>41897005 #
ajoseps No.41890932
If the data files are all just text files, what are the differences between using DVC and using plain Git?
replies(3): >>41891059 #>>41891080 #>>41893500 #
dmpetrov No.41891059
In this case, you need DVC if:

1. Files are too large for Git and Git LFS.

2. You prefer using S3/GCS/Azure as storage.

3. You need to track transformations/pipelines on the files: cleaning up a text file, training a model, etc.

Otherwise, vanilla Git may be sufficient.
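
To make point 1 a bit more concrete, here is a rough Python sketch of the general idea behind keeping large files out of Git: the file's bytes go into a content-addressed store, and only a tiny pointer (hash, size, name) is left for Git to track. This is not DVC's actual code. The names `add_to_store` and `demo`, the store layout, and the pointer fields are all hypothetical and chosen only for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def add_to_store(data_file: Path, store_dir: Path) -> dict:
    """Copy data_file into a content-addressed store and return a small pointer.

    Hypothetical helper for illustration only; not part of DVC.
    """
    # Hash the file in 1 MiB chunks so large files are never fully loaded in memory.
    hasher = hashlib.md5()
    with data_file.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
    digest = hasher.hexdigest()

    # Assumed layout: first two hex characters as a directory, the rest as the file name.
    target = store_dir / digest[:2] / digest[2:]
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():  # identical content is stored only once
        shutil.copyfile(data_file, target)

    # This small dict stands in for what would be committed to Git instead of the data.
    return {"path": data_file.name, "md5": digest, "size": data_file.stat().st_size}


def demo() -> dict:
    """Create a throwaway text file, store it, and return its pointer."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "corpus.txt"
        data.write_bytes(b"hello\n" * 1000)
        return add_to_store(data, root / "store")


if __name__ == "__main__":
    print(demo())
```

The pointer stays a few dozen bytes no matter how large the data file is, which is why this kind of split works where committing the file directly to Git would not. In a real setup the store directory would typically be synced to remote storage such as S3/GCS/Azure.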