
764 points bertman | 1 comments | | HN request time: 0.211s | source
imcritic ◴[] No.43484638[source]
I don't get how someone achieves reproducibility of builds: what about file metadata like creation/modification timestamps? Do they forge them? Or is this data treated as not important enough (as in, 2 files with different metadata but identical contents should have the same checksum when hashed)?
replies(10): >>43484658 #>>43484661 #>>43484682 #>>43484689 #>>43484705 #>>43484760 #>>43485346 #>>43485379 #>>43486079 #>>43488794 #
o11c ◴[] No.43484682[source]
Timestamps are the easiest part - you just set everything according to the chosen epoch.

The hard things involve things like unstable hash orderings, non-sorted filesystem listing, parallel execution, address-space randomization, ...
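
To make the timestamp and listing-order points concrete, here is a minimal sketch, not any particular build system's method. It packs a directory into a tar archive with entries in sorted order and every mtime set to one chosen epoch. The function name `deterministic_tar` is hypothetical; the fixed mode and owner values are assumptions made for illustration.

```python
import io
import os
import tarfile


def deterministic_tar(src_dir: str, epoch: int) -> bytes:
    """Pack src_dir so the bytes depend only on relative paths, contents and epoch."""
    paths = []
    for root, dirs, files in os.walk(src_dir):
        dirs.sort()  # os.walk order follows the filesystem, so pin it
        for name in files:
            paths.append(os.path.join(root, name))

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for path in sorted(paths):
            rel = os.path.relpath(path, src_dir).replace(os.sep, "/")
            info = tarfile.TarInfo(name=rel)
            info.size = os.path.getsize(path)
            info.mtime = epoch            # same timestamp for every entry
            info.mode = 0o644             # assumed fixed mode, ignores the real one
            info.uid = info.gid = 0       # drop builder-specific ownership
            info.uname = info.gname = ""
            with open(path, "rb") as fh:
                tar.addfile(info, fh)
    return buf.getvalue()
```

In practice the epoch would usually come from the `SOURCE_DATE_EPOCH` environment variable, e.g. `int(os.environ.get("SOURCE_DATE_EPOCH", "0"))`. This sketch skips symlinks, empty directories and executable bits, which a real tool would have to handle.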

replies(1): >>43485157 #
koolba ◴[] No.43485157[source]
ASLR shouldn’t be an issue unless you intend to capture the entire memory state of the application. It’s an intermediate representation in memory, not an output of any given step of a build.

Annoying edge cases do come up around internal object serialization, for example needing to sort JSON keys in config files.
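
A small sketch of the JSON key case, assuming the config is written from Python: without `sort_keys=True`, `json.dumps` emits keys in dict insertion order, so two builds that populate the same config in a different order produce different bytes. The helper name `canonical_json` is made up for the example.

```python
import json


def canonical_json(obj) -> str:
    """Serialize with sorted keys and fixed separators so equal data gives equal text."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)


first = {"name": "demo", "debug": False}
second = {"debug": False, "name": "demo"}  # same data, different insertion order

print(json.dumps(first) == json.dumps(second))          # insertion order leaks through
print(canonical_json(first) == canonical_json(second))  # order no longer matters
```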

replies(3): >>43485872 #>>43488447 #>>43489756 #
sodality2 ◴[] No.43485872[source]
Let’s say a compiler is doing something in a multi-threaded manner - isn’t it possible that ASLR would affect the ordering of certain events which could change the compiled output? Sure you could just set threads to 1 but there are probably some more edge cases in there I haven’t thought of.
replies(1): >>43486161 #
1. zamadatix ◴[] No.43486161[source]
I think you'd need the compiler to guarantee serialization order of such operations regardless of whether you use ASLR. Otherwise you're just hoping thread scheduling, core clocking, thread memory access, and many other things are the same between every system trying to do a reproducible build. Even setting threads to 1 may not solve that problem class if asynchronous functions/syscalls come into play.
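
As a rough illustration of "guarantee the order regardless of scheduling", here is a sketch in which work runs in parallel but results are collected in a fixed, sorted input order rather than completion order. `compile_unit` and `build` are hypothetical stand-ins, not any real compiler's API. The sketch only covers output ordering; it does nothing about shared mutable state inside the workers.

```python
from concurrent.futures import ThreadPoolExecutor


def compile_unit(name: str) -> str:
    # Hypothetical stand-in for per-unit work; must not depend on shared state.
    return f"obj({name})"


def build(units, workers: int = 4):
    """Run units in parallel, but emit results in sorted input order."""
    ordered = sorted(units)  # fix the order before any thread starts
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, not completion order.
        return list(pool.map(compile_unit, ordered))
```

Collecting results with `as_completed` instead would expose thread scheduling in the output, which is the failure mode described above.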