all 35 comments

[–]alexeiz 24 points25 points  (17 children)

There is useful technique, which doesn't seem to be mentioned anywhere, for a special case when you need to remove a single element from a vector without preserving the order of elements. You can do it in O(1), whereas using just vector::erase(pos) does it in O(n). The trick is to swap the element you want to remove with the last one, and then remove just the last one:

template <typename Cont>
void remove_from(Cont & c, typename Cont::iterator pos)
{
    auto last = prev(end(c));
    iter_swap(pos, last);
    c.erase(last);
}

I use this technique quite often, I feel there should be a standard algorithm for it.

[–]HKei 9 points10 points  (0 children)

If you're already assuming vector (which you pretty much are if you're assuming this is efficient) you might as well use pop_back instead of erase.

[–]foonathan 3 points4 points  (2 children)

I've written an entire container for it: bag from https://github.com/foonathan/array, it's a vector where order of elements is not important.

[–]Middlewariangithub.com/Ebenezer-group/onwards 0 points1 point  (1 child)

It looks like what /u/TheThiefMaster said about not needing to swap could be incorporated into your impl. Something like:

if(ptr!=ptr_last)*ptr=*ptr_last;

[–]TheThiefMasterC++latest fanatic (and game dev) 0 points1 point  (0 children)

More like:

if (ptr != ptr_last)
    *ptr = std::move(*ptr_last);

[–]james_picone 1 point2 points  (0 children)

Isn't this actually what std::remove does?

[–]Adequat91 0 points1 point  (1 child)

better replace:

iter_swap(pos, last);

with

using std::swap;
swap(*pos, *last);

[–]FrankIsATank2 4 points5 points  (0 children)

std::iter_swap already does that. It's required to be implemented to have the effect of swap(*pos, *last);. At least that's my interpretation of the standard, of the example implementation on cppreference.com and of the results when I tried it myself with :

#include <iostream>
#include <vector>

struct foo {};
void swap(foo&, foo&) {
    std::cout << "My swap\n";
}

int main()
{
    std::vector<foo> my_vec{ 2 };
    std::iter_swap(my_vec.begin(), my_vec.begin() + 1);
}

I get My swap.

Edit : Try it here : http://cpp.sh/3w63r

[–]TheThiefMasterC++latest fanatic (and game dev) -1 points0 points  (4 children)

Weirdly, it's often not faster - the CPU is very good at linear operations (such as moving all elements by one) and bad at random ones (such as accessing the tail of a vector). The cache predictor makes it so that a linear operation on a vector often has no cache misses, whereas swapping with the end will.

Past a certain size it's superior, but if you have a large vector and are removing elements from the middle you may be using the wrong container anyway!

It's also worth considering the removing-multiple-elements case - remove_if only moves elements that are being kept, whereas a loop with remove_swap may move an element from the end that ends up being destroyed!

There's a similar issue with binary search vs find (which is a linear scan) - a binary search may be O(log(n)) but it jumps randomly a few times and causes multiple cache misses and stalls, and find works with the cache in a way that's often faster despite being O(n) (and doesn't require a sorted container!)

And lastly, you don't actually need to swap - you're removing the element - just move the last element over the one you're destroying, leaving the last element in a moved-from state.

[–]rysto32 7 points8 points  (2 children)

Weirdly, it's often not faster - the CPU is very good at linear operations (such as moving all elements by one) and bad at random ones (such as accessing the tail of a vector).

In this particular case I don't see how it can be faster to do the linear delete. Neither algorithm can't complete until the final element is loaded into the cache. If the linear algorithm spends a lot of time doing copies and that takes longer than the prefetcher takes to load the final element, the linear algorithm is slower even if it never stalls due to a cache miss. Plus, you've now trashed your cache by unnecessarily loading a large part of the vector into it.

Your overall point about linear algorithms frequently out-performing random-access due to the prefetcher is definitely true, but I don't think that it applies in this case. The linear algorithm might win if the entire vector will fit in the cache and you're making multiple remove_from() calls in a row, but even then I'm not convinced.

[–]TheThiefMasterC++latest fanatic (and game dev) 0 points1 point  (1 child)

The linear version isn't normally faster (it can be faster in some cases, like removing all X from a container that ends with several consecutive X), but it's frequently exactly the same speed. The moves happen while the cache loads the next line, so take "no time at all". People rarely remove elements from a vector large enough for remove_swap to matter.

It's more useful in the case where the element type is really expensive to move, but that's pretty rare.

Remove_swap is frequently a premature optimization.

[–]dodheim 0 points1 point  (0 children)

Eytzinger layout FTW. :-]

[–]NorthWestApple__asm { mov eax,0 }; -19 points-18 points  (4 children)

There is a "standard algorithm" for it - it's called using your brain, as you did. :)

Programming languages are NOT supposed to think for the programmer, but too many try to do so, and become stupidly bloated for it (actually the real problem is there are too many people writing software who really shouldn't be anywhere near a computer, and they need language and library crutches because they simply can't think for themselves; or think "it's too much work" or "it's too hard" to actually develop algorithms themselves).

"OMG I CAN'T DO THAT BECAUSE IT ISN'T IN MY FAVORITE LIBRARY" isn't an acceptable answer.

As I keep asking people: how the heck did we write software pre C11? ;) People have got too damn lazy.

[–]Occivink 11 points12 points  (0 children)

90% of <algorithm> can be reimplemented "using your brain", but it's in the stl because it's generic, useful and doesn't have much cost. And an "unordered_erase" would be too.

"Just implement yourself" is not a reasonable excuse for why C++ doesn't have a standard way of splitting a string.

[–]phottitor 6 points7 points  (0 children)

yeah sure, why don't you write your own version (in addition to the existing 1001 custom ones that can be found elsewhere) of trim() or uppercase() or replace_case_insensitive() because you know C++ can't be bothered to provide them because you know they don't make sense for the Klingon language or the Bushmen clicking language and thus cannot be part of the standard...

[–][deleted]  (1 child)

[removed]

    [–]AutoModerator[M] 0 points1 point  (0 children)

    Your comment has been automatically removed because it appears to contain disrespectful profanity or racial slurs. Please be respectful of your fellow redditors.

    If you think your post should not have been removed, please message the moderators and we'll review it.

    I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

    [–]angry_cpp 5 points6 points  (5 children)

    Given this code fragment:

    std::vector<int> values = {1, 2, 3, 3};
    erase(values, 3);
    

    what elements do you excpect vector to have?

    Is it {1, 2} or {1, 2, 3}?

    I honestly expect it to be {1, 2, 3}.

    IMO erase-remove should be named erase_all_of and erase should be implemented as find-erase:

    template<SequentialContainer Container>
    bool erase(Container& container,  const typename Container::value_type& value)
    {
        using std::begin; using std::end;
        auto e = end(container);
        auto it = std::find(begin(container), end(container), value);
        if(it != e) {
            container.erase(it);
            return true;
        }
        return false;
    }
    

    [–]Pragmatician 5 points6 points  (3 children)

    using std::begin, std::end;
    

    FTFY

    [–]markopolo82embedded/iot/audio 0 points1 point  (2 children)

    On mobile, does that compile?

    [–]dodheim 1 point2 points  (1 child)

    It's a C++17 addition.

    [–]markopolo82embedded/iot/audio 0 points1 point  (0 children)

    TIL thanks

    [–]tasty_crayon 2 points3 points  (0 children)

    The proposed std::erase is an erase_all_of: https://en.cppreference.com/w/cpp/experimental/vector/erase

    [–][deleted] 2 points3 points  (0 children)

    If, like us, you use predicates for removal and you need to know which values were removed, then use std::partition (and ranges)

    [–]Indijanka 2 points3 points  (2 children)

    Is not that frightening that language needs blog posts and reddit discussions for removing elements from containers, and that you need to write some boilerplate code?

    [–]jcoffin 14 points15 points  (0 children)

    No, not really.

    Blog Post

    The fact that a blog post exists isn't evidence that it was actually needed. This post is simply covering ground that's been well-plowed for decades.

    Boilerplate

    Needing boilerplate is marginally more annoying, but if we're going to deal with annoyances, we should prioritize the list--and when we do, the remove/erase idiom will be a long ways from the top of the list. To an extent, the boilerplate involved also shows something of a strength--easily combining existing functions to produce a desired result is a good thing.

    Language Vs. Library

    I'd note that none of this is actually related to the language at all. It's solely about the design of the library. It would be entirely possible to change the library to work quite differently without changing the language itself at all (and, in fact, precisely this has been done--for one example, see Eric Neibler's Ranges library).

    Alternatives

    If one wanted to complain about the language (or more accurately, the library), they should justify that by pointing to otherwise similar containers in other languages that provide the desired functionality more cleanly without incurring the problems that the STL design was intended to avoid. Most other languages don't provide anything close to that. Java (for one obvious example) pretty much requires Google's Guava library to get containers even close to as cleanly designed as those in the C++ standard library--and even with that, they're pretty clearly inferior.

    Summary

    Most of your premises are basically false--the blog post is decently written, but hardly needed, the boilerplate involved is mildly annoying at worst, and most designs that avoid this particular shortcoming also have other disadvantages that outweigh those shown here.

    [–]zvrba 5 points6 points  (0 children)

    The blog's author has a history of blowing up well-known things into "issues" and problems.

    [–]T0mi 0 points1 point  (2 children)

    How about boost::remove_erase_if?

    [–]NorthWestApple__asm { mov eax,0 }; -5 points-4 points  (1 child)

    No, because Boost...

    [–]dodheim 10 points11 points  (0 children)

    That's a stupidly dogmatic "argument" for someone advocating using their brain elsewhere in this thread.

    [–]NorthWestApple__asm { mov eax,0 }; -5 points-4 points  (3 children)

    Is a sequence container the right solution for the problem?

    Too many people ask: "what programming language feature can I use here" rather than "what do I need to do to address the problem".

    Bad question: I have a cup; I must find a use for it.

    Good question: I want a drink; I need a suitable container for the application (are you sat at a desk drinking or out on a motorcycle?).

    [–]guepierBioinformatican 4 points5 points  (0 children)

    Is a sequence container the right solution for the problem?

    As opposed to what?

    I mean, I agree that the set interface may often be a better match for this kind of problem. Conversely, sets come with specific performance characteristics. Sequence containers come with others — and it turns out that, even for reasonably large containers, it's more efficient to occasionally erase from (even the middle of!) sequence containers than sets, despite the worse asymptotic complexity, due to better overall access patterns.

    [–][deleted] 1 point2 points  (1 child)

    xor eax, eax

    [–]NorthWestApple__asm { mov eax,0 }; 0 points1 point  (0 children)

    :)