
1311 points msoad | 1 comments
w1nk ◴[] No.35394065[source]
Does anyone know how/why this change decreases memory consumption (and isn't a bug in the inference code)?

From my understanding of the issue, mmap'ing the file is showing that inference is only accessing a fraction of the weight data.

Doesn't the forward pass necessitate accessing all the weights and not a fraction of them?

replies(4): >>35394751 #>>35396440 #>>35396507 #>>35398499 #
1. jhatemyjob ◴[] No.35396440[source]
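If you read a file into a malloc'd buffer with read(), the kernel copies the data from its page cache into userspace. With mmap there is no copying: the page-cache pages are mapped directly into your address space, and a page is only loaded when you actually touch it.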
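
A rough sketch of the two access patterns, in Python rather than the C the inference code is written in. The function names and the demo file are made up for illustration and have nothing to do with the actual loader; this only shows the difference between copying a whole file into a buffer and mapping it.

```python
import mmap
import os
import tempfile


def make_demo_file(size=1 << 20):
    """Write a throwaway file of `size` bytes (hypothetical stand-in for a weights file)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        # Repeating 0..255 pattern, so the byte at offset i is i % 256.
        f.write(bytes(range(256)) * (size // 256))
    return path


def read_with_copy(path):
    """read(): the kernel copies the file contents into a userspace buffer."""
    with open(path, "rb") as f:
        return f.read()


def byte_via_mmap(path, offset):
    """mmap: the file is mapped into the address space; only touched pages are faulted in."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Indexing touches just the page that contains `offset`.
            return m[offset]


if __name__ == "__main__":
    path = make_demo_file()
    try:
        data = read_with_copy(path)       # whole file now sits in a private buffer
        value = byte_via_mmap(path, 513)  # a single page of the mapping was touched
        print(len(data), value)
    finally:
        os.remove(path)
```

Both paths see the same bytes. The difference is that the mapped pages belong to the kernel's page cache, so tools that report memory usage may count them differently from a privately allocated buffer.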