480 points jedeusus | 7 comments
jensneuse ◴[] No.43540964[source]
You can often fool yourself with sync.Pool: pprof looks great because there are no allocations in benchmarks, but memory usage goes through the roof. It's important to measure real-world benefits, if any, and not just synthetic benchmarks.
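
A minimal sketch of the kind of microbenchmark that can mislead here (the pool and the 1 MiB buffer size are illustrative, not from the comment):

    package pool_test

    import (
        "sync"
        "testing"
    )

    var bufPool = sync.Pool{
        New: func() any { return make([]byte, 1<<20) }, // 1 MiB, illustrative
    }

    // Reports ~0 allocs/op, yet says nothing about how much memory
    // the pool retains under a real, mixed-size workload.
    func BenchmarkPooled(b *testing.B) {
        b.ReportAllocs()
        for i := 0; i < b.N; i++ {
            buf := bufPool.Get().([]byte)
            _ = buf // real code would fill and use the buffer here
            bufPool.Put(buf)
        }
    }

Run with `go test -bench=. -benchmem`: the zero allocs/op it reports is exactly the number that looks great in pprof while steady-state memory quietly grows.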
replies(2): >>43541261 #>>43543697 #
1. makeworld ◴[] No.43541261[source]
Why would Pool increase memory usage?
replies(2): >>43541282 #>>43543225 #
2. xyproto ◴[] No.43541282[source]
I guess that if you allocate more than you need upfront, it could increase memory usage.
replies(1): >>43542705 #
3. throwaway127482 ◴[] No.43542705[source]
I don't get it. The pool uses weak pointers under the hood, right? If you allocate too much up front, the stuff you don't need will get garbage collected. It's no worse than doing the same without a pool, right?
replies(1): >>43543649 #
4. jensneuse ◴[] No.43543225[source]
Let's say you constantly handle 1k requests per second, and each request needs one 1 MiB buffer. That means you have 1 GiB in the pool. Without a pool, there's a high likelihood that you'd be using less. Why? Because in reality most requests need a 1 MiB buffer, but SOME require a 5 MiB buffer. As such, your pool grows over time, because you have no control over the size distribution of the pooled items.

So, if you have predictable object sizes, the pool stays flat. If the workloads are random, you have a new problem: as in this scenario, your pool grows up to 5x.

You can solve this problem. E.g. you can return only items that are small enough to the pool (see the sketch below). Alternatively, you could keep a small pool and a big pool, but now you're playing cat and mouse.

In such a scenario, it could also work to simply allocate and let the GC clean up. Then you don't have to worry about memory and object lifetimes, which makes your code much simpler to read and reason about.
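
A minimal sketch of the "return only small items" variant, assuming a bytes.Buffer pool and a 1 MiB cutoff (both illustrative, not from the comment):

    package pool

    import (
        "bytes"
        "sync"
    )

    // maxPooled is an illustrative cutoff, not a recommendation.
    const maxPooled = 1 << 20 // 1 MiB

    var bufPool = sync.Pool{
        New: func() any { return new(bytes.Buffer) },
    }

    func getBuf() *bytes.Buffer {
        return bufPool.Get().(*bytes.Buffer)
    }

    func putBuf(b *bytes.Buffer) {
        // Drop buffers that grew while serving a 5 MiB request,
        // so they don't pin that memory in the pool indefinitely.
        if b.Cap() > maxPooled {
            return // let the GC reclaim it
        }
        b.Reset()
        bufPool.Put(b)
    }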

replies(2): >>43546636 #>>43550822 #
5. cplli ◴[] No.43543649{3}[source]
What the top commenter probably failed to mention, and jensneuse tried to explain, is that sync.Pool assumes the size cost of pooled items is similar. Pooling buffers (e.g. []byte) or any other type whose backing memory can grow beyond its initial capacity during use can lead to a scenario where backing arrays that have grown to MB capacities are returned by the pool for jobs that need only a few KB, while KB-sized buffers are handed to high-memory jobs, which in turn grow the backing arrays to MB and return them to the pool.

If that's the case, it's usually better to use non-global pools, size-classed pools, dropping items above a certain capacity, etc. (see the sketch after the links):

https://github.com/golang/go/issues/23199
https://github.com/golang/go/blob/7e394a2/src/net/http/h2_bu...
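
A minimal sketch of the size-class approach, in the spirit of the h2_bundle.go pools linked above (the class sizes and names here are illustrative):

    package pool

    import "sync"

    // Illustrative size classes; each class only ever holds arrays
    // of exactly its own capacity.
    var classSizes = [...]int{1 << 10, 4 << 10, 16 << 10}

    var classPools [len(classSizes)]sync.Pool

    func getChunk(n int) []byte {
        for i, size := range classSizes {
            if n <= size {
                if b, ok := classPools[i].Get().([]byte); ok {
                    return b[:n]
                }
                return make([]byte, n, size)
            }
        }
        return make([]byte, n) // larger than any class: not pooled
    }

    func putChunk(b []byte) {
        for i, size := range classSizes {
            if cap(b) == size {
                classPools[i].Put(b[:0]) // reset length, keep capacity
                return
            }
        }
        // odd capacity: drop it and let the GC collect it
    }

Because capacities are matched exactly on Put, a 16 KiB job can never hand a grown array back to the 1 KiB class, which avoids the swap problem described above.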

6. jerf ◴[] No.43546636[source]
Long before sync.Pool was a thing, I wrote a pool for []byte (https://github.com/thejerf/gomempool). I haven't taken it down, because it isn't obsoleted by sync.Pool: my pool is aware of the size of the []byte slices. It may be somewhat obsoleted by the fact that the GC has gotten a lot better since I wrote it, somewhere in the 1.3 time frame. But it solved exactly the problem I had: messages that were relatively infrequent from the computer's point of view (e.g., a system getting a message every 50ms or so), but that had to be pulled into buffers completely to process, and that had highly irregular sizes. The GC was doing a ton of work when I allocated them all the time, but in my situation it was easy to reuse buffers.
7. theThree ◴[] No.43550822[source]
>That means you have 1 GiB in the pool.

This only happens if every request lasts one second: 1k requests/s × 1 s of request lifetime = 1k buffers in flight, i.e. 1 GiB. Shorter-lived requests keep fewer buffers checked out at once.