
764 points bertman | 3 comments
imcritic ◴[] No.43484638[source]
I don't get how someone achieves reproducibility of builds: what about file metadata like creation/modification timestamps? Do they forge them? Or is this data treated as not important enough (i.e. should two files with different metadata but identical contents have the same checksum when hashed)?
replies(10): >>43484658 #>>43484661 #>>43484682 #>>43484689 #>>43484705 #>>43484760 #>>43485346 #>>43485379 #>>43486079 #>>43488794 #
o11c ◴[] No.43484682[source]
Timestamps are the easiest part - you just set everything according to the chosen epoch.
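
A minimal sketch of what "set everything according to the chosen epoch" could look like, assuming the build follows the SOURCE_DATE_EPOCH convention from reproducible-builds.org. The helper name clamp_mtimes and the fallback of 0 are made up for illustration, and symlinks are not handled specially:

```python
import os

def clamp_mtimes(root, epoch=None):
    """Set atime/mtime of everything under root to one fixed timestamp."""
    if epoch is None:
        # SOURCE_DATE_EPOCH is the reproducible-builds.org convention;
        # falling back to 0 is an assumption made for this sketch.
        epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames + dirnames:
            os.utime(os.path.join(dirpath, name), (epoch, epoch))
    os.utime(root, (epoch, epoch))
    return epoch
```

Run it over the staging directory right before packaging, so that whatever archiver comes next sees identical timestamps on every build.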

The hard parts are things like unstable hash orderings, unsorted filesystem listings, parallel execution, address-space randomization, ...
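
For the unsorted-listing case, a rough sketch of a directory digest that walks the tree in sorted order and looks only at relative paths and file contents, so metadata such as mtimes never enters the hash. The function name tree_digest and the framing (path, NUL, 8-byte length, contents) are choices made for this example, not any tool's actual format:

```python
import hashlib
import os

def tree_digest(root):
    """Hash relative paths and file contents in sorted order, ignoring metadata."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place fixes the order in which os.walk descends.
        dirnames.sort()
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            with open(path, "rb") as f:
                data = f.read()
            # Length-prefix the contents so adjacent entries cannot blur together.
            h.update(rel.encode("utf-8") + b"\0")
            h.update(len(data).to_bytes(8, "big"))
            h.update(data)
    return h.hexdigest()
```

Two trees with the same files created in a different order, or with different timestamps, should give the same digest here.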

replies(1): >>43485157 #
koolba ◴[] No.43485157[source]
ASLR shouldn’t be an issue unless you intend to capture the entire memory state of the application. It’s an intermediate representation in memory, not an output of any given step of a build.

Annoying edge cases come up in things like internal object serialization, where you have to sort things like JSON keys in config files.
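
A small sketch of the JSON case, assuming the config is emitted with Python's standard json module. The wrapper name canonical_json is just for illustration:

```python
import json

def canonical_json(obj):
    """Serialize with sorted keys and fixed separators so equal data gives equal bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)

# Same data, different insertion order, identical output.
first = canonical_json({"b": 1, "a": {"d": 2, "c": 3}})
second = canonical_json({"a": {"c": 3, "d": 2}, "b": 1})
```

sort_keys applies to nested objects as well, so only list order is left to the caller.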

replies(3): >>43485872 #>>43488447 #>>43489756 #
kazinator ◴[] No.43489756[source]
ASLR means that the pointers from malloc (which may come from mmap) are not predictable.

Sometimes programs have hash tables which use object identity as key (i.e. pointer).

ASLR can cause corresponding objects in different runs of the program to have different pointers, and be ordered differently in an identity hash table.

A program producing some output which depends on this is not necessarily a bug, but becomes a reproducibility issue.

E.g. a compiler might output some object in which a symbol table is ordered by a pointer hash. The difference in order doesn't change the meaning/validity of the object file, but it is seen as the build not having reproduced exactly.
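
To illustrate in Python rather than C: in CPython the default hash of an object is derived from its address, so a set of such objects behaves like an identity hash table. A hedged sketch of the usual workaround, which is to impose an order derived from content before emitting anything (Symbol and emit_symbol_table are hypothetical names):

```python
class Symbol:
    """No __hash__/__eq__ defined, so instances hash by identity."""
    def __init__(self, name):
        self.name = name

def emit_symbol_table(symbols):
    # Iteration order of an identity-hashed set can vary from run to run,
    # so sort by a stable, content-derived key before writing output.
    return [s.name for s in sorted(symbols, key=lambda s: s.name)]

table = emit_symbol_table({Symbol(n) for n in ("main", "init", "exit")})
```

The set is still fine for lookups; only the code that turns it into output needs the stable order.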

replies(1): >>43492749 #
1. account42 ◴[] No.43492749[source]
That's just one example of nondeterminism in compilers though - in the end it's the responsibility of the compiler to provide options not to do that.
replies(1): >>43495112 #
2. kazinator ◴[] No.43495112[source]
Not for external causes like ASLR and memory allocators; those things should have their respective options for that.
replies(1): >>43503110 #
3. account42 ◴[] No.43503110[source]
There is no guarantee that memory allocation is deterministic even without ASLR. If your program is supposed to be deterministic but its output depends on the memory addresses returned by the allocator then your program is buggy.