Don’t blindly prefer emplace_back to push_back

(quuxplusone.github.io)

178 points todsacerdoti | 1 comments | 04 Mar 21 08:02 UTC | HN request time: 0.479s | source

Show context

jokoon ◴[04 Mar 21 10:40 UTC] No.26340841[source]▶

Thanks for the article, there should be a huge list of those type of classic C++ misconceptions/FAQ style.

One of my favorite is "you don't need a linked list".

Although I'm still genuinely interested when a linked list is best suited. I'm also curious why they were invented and why they're taught, maybe just for teaching purposes...

replies(12): >>26340881 #>>26340918 #>>26340936 #>>26341086 #>>26341090 #>>26341157 #>>26341167 #>>26341203 #>>26341240 #>>26341290 #>>26341787 #>>26347246 #

lelanthran ◴[04 Mar 21 11:36 UTC] No.26341167[source]▶

>>26340841 #

> Although I'm still genuinely interested when a linked list is best suited.

I can think of a few use-cases where I've opted for linked-lists over resizable arrays:

1. When you have enough memory for the set of all elements you want to store but not enough contiguous memory for the set of all elements you want to store (i.e. You have the available memory, but it's all fragmented).

2. When you have multiple threads adding to/removing from the container, a linked list lets each thread (in theory) insert/remove nodes at the same time. For example, one thread can be adding to the head of a queue while another thread is removing from the tail of the same queue; they don't need to both wait for the same lock to resize the container.

3. When you have large insertions/removals. For example, concatenating two lists into one is just reassigning two/four pointers in linked lists. With arrays or similar you have to have enough free memory to make of a copy of the smaller list, at least).

   Another example is "zipping" two lists together - you simply go through the lists and reassign pointers. With arrays you have to allocate enough memory for the final list before copying elements over.

   Same goes for splitting lists into two: for example, applying a predicate function to remove and return all elements that satisfies the predicate takes no extra memory using lists, but with arrays will need the destination to be allocated first).

4. Finally, you may want to store cycles in your container (perhaps the traversal maintains state so you can use the cycle to make traversal of some nodes repeat, or maybe you're simply insane and like to confuse readers of your code :-).

Okay, fair enough, that last one is me stretching things a bit :-)

replies(2): >>26341243 #>>26341268 #

MauranKilom ◴[04 Mar 21 11:45 UTC] No.26341243[source]▶

>>26341167 #

Re 1. (memory fragmentation): I'm sort of on board with this (although you probably make it worse for the next person if you use linked lists). But it'll be a very special case where the performance hit from working with a huge list will be acceptable.

Re 2. (threading): What happens when the list becomes empty? How do you keep track of the size to determine whether the list is empty? How do you know the size is still up-to-date when you go on to make a modification? I don't think this is as easy (even "in theory") as you make it sound.

Re 3. (merging overhead): You say "takes no extra memory", but you use significant extra memory already: You incur an additional two pointers of (and possibly per-allocation) overhead. For the entire lifetime of the list, not just when you do some operations.

If your elements are small, then your list might have double memory consumption the entire time instead of only when merging. And if they are large, you could store unique_ptrs to those elements in a vector. In which case the total memory overhead of merging two vectors (2 x total number of elements x pointer size) is exactly the same as a linked list has the entire time. And all the operations then turn into shuffling around pointers too, but probably fewer: You just have to move the unique_ptrs around, which is one pointer each instead of constantly unlinking and relinking lists.

Sure, sometimes iterator invalidation is a thing, and maybe keeping pointers to (separately allocated) elements doesn't cut it. But even then you wouldn't want to use a list if performance is even remotely a concern.

replies(2): >>26341300 #>>26341441 #

gpderetta ◴[04 Mar 21 11:55 UTC] No.26341300[source]▶

>>26341243 #

To fight fragmentation, a deque-like container is probably better anyway.

Very good point on 3.

replies(1): >>26341479 #

MauranKilom ◴[04 Mar 21 12:25 UTC] No.26341479[source]▶

>>26341300 #

Indeed. I recently looked at using std::deque because I was exactly trying to avoid allocating too much contiguous memory, fearing fragmentation.

Sadly, std::deque is pretty much useless. It uses something like page-sized blocks in one implementation, a few cachelines in another one (forgot which is gcc/clang), and on MSVC it allocates every single item separately (!!!) for element size > 1 word. And this will not be changed due to ABI compatibility. It might as well not exist in the standard.

replies(1): >>26341669 #

1. gpderetta ◴[04 Mar 21 12:49 UTC] No.26341669[source]▶

>>26341479 #

yes, I never use std::deque for the same reason. I believe that the deque in boost these days has a compile time configurable block size.

↑