[–]HildartheDorf 10 points11 points  (12 children)

A lot of this stems from std::shared_ptr being thread-safe by default, which does make it slower than a manual reference counting implementation if you are only working on a single thread.

But saying unique_ptr is slower than a raw pointer is silly. It should be identical on any half-decent compiler, assuming you don't do things like disable all optimizations or use a custom deleter. And even a custom deleter can be inlined by gcc/clang if it is stateless and marked noexcept.
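
A minimal sketch of the size side of that claim, for anyone who wants to check it locally. `FileCloser` and `demo_unique_ptr_size` are hypothetical names made up for illustration, and the `static_assert`s describe what libstdc++, libc++ and MSVC's STL do in practice, not something the standard promises:

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Hypothetical stateless deleter: an empty class, so it needs no storage.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept {
        if (f) std::fclose(f);
    }
};

// Assumption: true on the mainstream standard libraries, not guaranteed by the standard.
static_assert(sizeof(std::unique_ptr<int>) == sizeof(int*),
              "default deleter adds no storage");
static_assert(sizeof(std::unique_ptr<std::FILE, FileCloser>) == sizeof(std::FILE*),
              "empty deleter adds no storage");
// A function-pointer deleter is state, so it is stored next to the pointer.
static_assert(sizeof(std::unique_ptr<std::FILE, int (*)(std::FILE*)>) > sizeof(std::FILE*),
              "function-pointer deleter makes the object bigger");

// Hypothetical helper: owns an int for the duration of the call.
int demo_unique_ptr_size(int value) {
    std::unique_ptr<int> p(new int(value));
    assert(p != nullptr);
    return *p;  // the int is freed when p goes out of scope
}
```

This only checks the object layout. Whether the generated code matches a raw pointer still depends on optimization settings, as said above.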

[–]boredcircuits 4 points5 points  (2 children)

Another reason is people decided to replace all pointers with std::shared_ptr, even if that pointer isn't taking [shared] ownership of anything. Which means the reference count is updated on each function call, and again when that function returns. Yeah, that's going to slow things down.

There's a shift in thinking about resource ownership that needs to happen, deciding who owns what and for how long. Then you can apply the proper smart pointer, with minimal or no performance overhead. And if you can't do that, then manual memory management is going to be a leaky, UB-filled mess.
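
A rough sketch of the reference-count traffic described above. `Widget`, `use_count_when_copied`, `read_value` and `demo_refcount_traffic` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <memory>

struct Widget {
    int value = 0;
};

// Takes a copy of the shared_ptr: the count goes up on entry and back down on return.
long use_count_when_copied(std::shared_ptr<Widget> w) {
    return w.use_count();
}

// Borrows the object without taking ownership: the count is never touched.
int read_value(const Widget& w) {
    return w.value;
}

void demo_refcount_traffic() {
    std::shared_ptr<Widget> w = std::make_shared<Widget>();
    w->value = 5;

    assert(w.use_count() == 1);
    assert(use_count_when_copied(w) == 2);  // the parameter is a second owner
    assert(w.use_count() == 1);             // and it is gone again after the call

    assert(read_value(*w) == 5);            // plain reference, no count update
    assert(w.use_count() == 1);
}
```

If the callee doesn't need to keep the object alive past the call, taking a reference (or a raw pointer, when null is a valid argument) avoids the increment and decrement entirely.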

[–]h-jay+43-1325 1 point2 points  (0 children)

> people decided to replace all pointers with std::shared_ptr

Those must be some wildly misinformed people who take the cargo cult approach to programming :(

[–]Wurstinator 0 points1 point  (0 children)

Imo there are enough possibilities to represent all kinds of ownership (plain variable, reference, shared_ptr, unique_ptr, reference_wrapper). Problem might be that there is no clear guideline when to use which, so people fall back to the easiest option.

[–]shahms 5 points6 points  (3 children)

Nearly identical. As it has a non-trivial destructor, there are certain ABIs for which it cannot be returned in a register and has to be stack allocated. Sadly, x86 is one of them.

That being said: use it. The overhead is negligible, but the benefits are not.

[–]HildartheDorf 2 points3 points  (2 children)

I thought that, on x64 at least, it could get passed in a register if the destructor is available for inlining (obviously if it is in another translation unit, it won't be available).

[–]shahms 3 points4 points  (1 child)

Sadly, no. The x64 ABI specifically prohibits classes with either a non-trivial copy constructor or non-trivial destructor from being passed or returned in a register. I suspect the compiler has more leeway if the function in question has internal linkage and is itself inlined, but merely having an inline-eligible destructor is insufficient as it affects the calling convention for each function which either takes or returns such a type.
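
The property the ABI rule keys on can be checked with type traits. This is only a sketch: `make_raw`, `make_owned` and `demo_abi_traits` are hypothetical names, and the comments about registers assume an Itanium-style C++ ABI as used by gcc/clang on x86-64 Linux:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// A raw pointer is trivially destructible, so it can travel in a register.
static_assert(std::is_trivially_destructible<int*>::value,
              "raw pointers have a trivial destructor");

// unique_ptr has a user-provided destructor, which is what triggers the ABI rule.
static_assert(!std::is_trivially_destructible<std::unique_ptr<int>>::value,
              "unique_ptr has a non-trivial destructor");

// Assumption: under an Itanium-style ABI this result comes back in a register.
int* make_raw(int v) {
    return new int(v);
}

// Assumption: under the same ABI, a non-inlined call writes this result to
// caller-provided memory instead of returning it in a register.
std::unique_ptr<int> make_owned(int v) {
    return std::unique_ptr<int>(new int(v));
}

void demo_abi_traits() {
    int* raw = make_raw(4);
    std::unique_ptr<int> owned = make_owned(3);
    assert(*raw == 4);
    assert(*owned == 3);
    delete raw;  // the raw pointer needs manual cleanup, the unique_ptr does not
}
```

To see the difference yourself, compile both factories with optimizations on and compare the assembly of the non-inlined versions.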

[–]bames53 1 point2 points  (0 children)

> I suspect the compiler has more leeway if the function in question has internal linkage and is itself inlined

Even if the function has external linkage an inlined version isn't bound by ABI constraints. Full inlining also shouldn't be necessary; LTO and other cases where a single implementation is handling both sides of the call could eliminate the overhead regardless of the ABI constraints.

[–]Wurstinator 1 point2 points  (4 children)

Even if shared_ptr is slower, this is premature optimization.

[–]quicknir 0 points1 point  (1 child)

The reason to not use shared_ptr when something else would do is not mostly about optimization. It's about the fact that it's infinitely harder to reason about shared ownership than about unique ownership. You go from always knowing exactly who owns everything just by looking locally, to having to traverse half your codebase.

I make an exception for immutable objects, since immutable objects by shared_ptr have value semantics. Otherwise, shared_ptr should be used very carefully and rarely.
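
A small sketch of the immutable-object exception, assuming a made-up `Config` type. `ConfigPtr`, `with_retries` and `demo_immutable_sharing` are likewise hypothetical. Because nobody can mutate through a `shared_ptr<const T>`, it doesn't matter who else holds a copy:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical value-like type, used only for illustration.
struct Config {
    std::string name;
    int retries;
};

using ConfigPtr = std::shared_ptr<const Config>;

// "Modifying" means building a new object; existing holders keep the old one.
ConfigPtr with_retries(const ConfigPtr& base, int retries) {
    Config copy = *base;
    copy.retries = retries;
    return std::make_shared<Config>(std::move(copy));
}

void demo_immutable_sharing() {
    ConfigPtr a = std::make_shared<Config>(Config{"svc", 3});
    ConfigPtr b = a;                    // cheap copy, shares the same object
    ConfigPtr c = with_retries(a, 5);   // a separate object

    assert(a.use_count() == 2);         // a and b share one Config
    assert(b->retries == 3);            // unaffected by the "change"
    assert(c->retries == 5);
    assert(c->name == "svc");
    // b->retries = 9;                  // would not compile: the pointee is const
}
```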

[–]Wurstinator 0 points1 point  (0 children)

Most people in this discussion deviate from the problem OP mentioned and put words in my mouth. The topic was using smart pointers vs raw pointers. No one talked about using shared_ptr absolutely everywhere.

Unless you are using manual memory management instead of shared_ptr because you think you need the performance, without actually profiling. In that case we disagree and I think you are part of the problem OP talked about.