←back to thread

764 points bertman | 2 comments | | HN request time: 0s | source
Show context
imcritic ◴[] No.43484638[source]
I don't get how someone achieves reproducibility of builds: what about files metadata like creation/modification timestamps? Do they forge them? Or are these data treated as not important enough (like it 2 files with different metadata but identical contents should have the same checksum when hashed)?
replies(10): >>43484658 #>>43484661 #>>43484682 #>>43484689 #>>43484705 #>>43484760 #>>43485346 #>>43485379 #>>43486079 #>>43488794 #
purkka ◴[] No.43484689[source]
Generally, yes: https://reproducible-builds.org/docs/timestamps/

Since the build is reproducible, it should not matter when it was built. If you want to trace a build back to its source, there are much better ways than a timestamp.

replies(1): >>43485622 #
ryandrake ◴[] No.43485622[source]
C compilers offer __DATE__ and __TIME__ macros, which expand to string constants that describe the date and time that the preprocessor was invoked. Any code using these would have different strings each time it was built, and would need to be modified. I can't think of a good reason for them to be used in an actual production program, but for whatever reason, they exist.
replies(4): >>43485670 #>>43486042 #>>43487552 #>>43488994 #
mananaysiempre ◴[] No.43486042[source]
And that’s why GCC (among others) accepts SOURCE_DATE_EPOCH from the environment, and also has -Wdate-time. As for using __DATE__ or __TIME__ in code, I suspect that was more helpful in the age before ubiquitous source control and build IDs.
replies(1): >>43488470 #
cperciva ◴[] No.43488470{3}[source]
Source control only helps you if everything is committed. If you're, say, working on changes to the FreeBSD boot loader, you're probably not committing those changes every time you test something but it's very useful to know "this is the version I built ten minutes ago" vs "I just booted yesterday's version because I forgot to install the new code after I built it".
replies(5): >>43489912 #>>43490489 #>>43492048 #>>43492803 #>>43492948 #
1. jrockway ◴[] No.43492048{4}[source]
Versions built into the code are nice. I think the correct answer is to commit before the build proper starts (automatically, without changing your HEAD ref) and put that in there. Then you can check version control for the date information, but if someone else happens to add the same bytes to the same base commit, they also have the same version that you do. (Similarly, you can always make the date "XXXXXXXXXXXXXXXXXXXXXX" or something, and just replace the bytes with the actual date after the build as you deploy it.)

What I actually did at $LAST_JOB for dev tooling was to build in <commit sha> + <git diff | sha256> which is probably not amazingly reproducible, but at least you can ask "is the code I have right now what's running" which is all I needed.

Finally, there is probably enough flexibility in most build systems to pick between "reuse a cache artifact even if it has the wrong stamping metadata", "don't add any real information", and "spend an extra 45 cpu minutes on each build because I want $time baked into a module included by every other source file". I have successfully done all 3 with Bazel, for example.

replies(1): >>43500244 #
2. ◴[] No.43500244[source]