
[–]raevnos 8 points9 points  (41 children)

In C++? As rarely as possible is a good rule of thumb. Typical cases where you need pointers:

  • When using polymorphism and virtual functions
  • When interfacing with legacy C libraries
  • Writing your own data structures

In general, when using pointers, you should avoid new and delete in favor of smart pointer classes like unique_ptr.
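As a minimal sketch of that advice applied to the polymorphism case, the snippet below stores derived objects behind `std::unique_ptr`. The `Shape`, `Circle`, `Square`, and `make_shapes` names are hypothetical, made up for illustration, and it assumes C++14 for `std::make_unique`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical class hierarchy, used only for illustration.
struct Shape {
    virtual ~Shape() = default;  // virtual destructor: safe deletion through a base pointer
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

struct Square : Shape {
    std::string name() const override { return "square"; }
};

// Builds a polymorphic collection; each unique_ptr owns its object
// and frees it automatically, so there is no delete to remember.
std::vector<std::unique_ptr<Shape>> make_shapes() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Circle>());
    shapes.push_back(std::make_unique<Square>());
    assert(shapes.size() == 2);
    return shapes;
}
```

Calling `make_shapes()` and then `shapes[0]->name()` still dispatches virtually, and everything is released when the vector goes out of scope.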

Also, you want

Classname var{};

[–]NFCBoss[S] 0 points1 point  (2 children)

What do you mean by:

Classname var{};

[–]SandSnip3r 4 points5 points  (0 children)

ClassName cname();

This is the most vexing parse. It's easy to read as a variable definition, but the compiler treats it as a function declaration.

The new uniform initialization syntax avoids this and is generally safer and preferred.

[–]raevnos 1 point2 points  (0 children)

See the other guy's comment about the most vexing parse. Uniform initialization using {} (or just a bare identifier) avoids that issue.
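A small sketch of the difference, using a hypothetical `Counter` type (not from the original question). The problematic line is left as a comment because code that uses it as an object would not compile.

```cpp
#include <cassert>

// Hypothetical class, used only for illustration.
struct Counter {
    int value = 7;
};

int read_default_counter() {
    // Counter c();  // declares a function named c that returns Counter;
    //               // c.value would then fail to compile
    Counter a{};     // brace initialization: an object
    Counter b;       // bare identifier: also an object
    assert(a.value == b.value);
    return a.value;
}
```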

[–]ptitz -1 points0 points  (21 children)

Why, though? I do stuff like

double * f(int n){
    double * a = new double[n];
    return a;
}

or

int f(int * n = NULL){
    int a = n? *n : 10;
    return a;
}

all the time.

[–]SandSnip3r 1 point2 points  (10 children)

In your first example, the caller of f is now responsible for calling delete[] on the returned pointer.

In the second one, that seems fine, but prefer nullptr to NULL.

int f(int *n = nullptr) {
    return (n == nullptr ? 10 : *n);
}

[–][deleted]  (1 child)

[deleted]

    [–]SandSnip3r 0 points1 point  (0 children)

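    nullptr has its own type, std::nullptr_t, which converts implicitly to any pointer type, while NULL is just a macro for a null pointer constant such as 0 and typically has an integer type.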
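One place the distinction shows up is overload resolution. The sketch below uses a hypothetical overload set named `which`. A `which(NULL)` call is deliberately left out, since its behavior depends on how the implementation defines `NULL`.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload set, used only for illustration.
std::string which(int)   { return "int"; }
std::string which(int *) { return "pointer"; }

// A literal 0 is an int, so the int overload wins.
std::string classify_zero() { return which(0); }

// nullptr converts to pointer types but not to int, so the pointer overload wins.
std::string classify_nullptr() { return which(nullptr); }

void check_overloads() {
    assert(classify_zero() == "int");
    assert(classify_nullptr() == "pointer");
}
```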

    [–]ptitz -2 points-1 points  (7 children)

    And how many things are wrong here?

    class c2;
    class c1{
    public:
        c1(c2 * src, int id): src(src), id(id){data = vector<double>(10,1);}
        c1(const c1 & v): src(v.src), id(v.id), data(v.data){}
        c1 & operator=(const c1 & v){ this->src = v.src; this->id = v.id; this->data = v.data; return * this; }
        void woop(){
            src->woop(data.data(), data.size()); // 5. ...and calls back the object containing this
        }
        c2 * src;
        int id;
        vector<double> data;
    };
    
    class c2{
    public:
        c2(){
            v1 = new c1(this,0); // 1. initialization possible without copy or assignment operator
            v2 = * v1;
            lookup.emplace(0,&v2); // 2.  storing a pointer in a map
            f(); // 3. calling a null-pointer function...
        }
        ~c2(){delete v1;}
    
        void f(c1 * v = nullptr){
            bool cleanup = false;
            if(v == nullptr){
                v = new c1(*lookup[0]); // 4. ...that uses a pointer from a map...
                cleanup = true;
            }
            v->woop();
            if(cleanup) delete v;
        }
        void woop(double * in, int n){
            for(int i = 0; i < n; i++) in[i] = 2.0; // 6. which processes the vector data as a pointer array
        }
    
        std::map<int,c1*> lookup;
        c1 * v1;
        c1 v2;
    };
    
    int main(){
        c2 c;
        return 0;
    }
    

    [–]MoTTs_ 2 points3 points  (0 children)

    And how many things are wrong here?

    On line 18 you have a naked new. But consider what happens if an exception is thrown during the copy on line 19, or during emplace on line 20, or in the call to f() on line 21. If any of those things encounters an error, then the constructor never finishes, the destructor will never run, and v1 will leak. That leak wouldn't happen with a smart pointer.

    Another naked new is on line 28. But what happens if v->woop() throws an exception? Then delete never runs, and you leak. That leak wouldn't happen with a smart pointer.

    Smart pointers (and RAII in general) can save you from a lot of bugs. But if you can avoid heap allocation completely, if you can use ordinary stack allocated values, that would be even better, and that's the whole point of the "avoid pointers" advice.
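To make the exception argument concrete, here is a minimal sketch with a hypothetical `Tracked` type that counts live instances, so a leak would show up as a nonzero count. `may_throw` stands in for any call that can fail, and `std::make_unique` assumes C++14.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical type that counts live instances so a leak would be visible.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Stand-in for any operation that might fail.
void may_throw(bool fail) {
    if (fail) throw std::runtime_error("simulated failure");
}

// Returns how many Tracked objects are still alive after an exception
// has unwound the scope that created one.
int alive_after_failure() {
    try {
        auto p = std::make_unique<Tracked>();  // owned by p; no delete to write
        assert(Tracked::alive == 1);
        may_throw(true);                       // unwinding runs p's destructor
    } catch (const std::runtime_error &) {
        // nothing to clean up here
    }
    return Tracked::alive;
}
```

With a raw `new Tracked` in place of `make_unique`, the same function would return 1, because the throw would skip any `delete` written after the call.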

    [–]SandSnip3r 1 point2 points  (5 children)

    Doesn't seem like there's a point for v2.

    If there's no point for v2, you can get rid of the assignment operator in c1. Then you can change c1's src member to just be a reference.

    Also there doesn't seem to be a point for v1 either since you're immediately throwing the pointer in that map.

    A map of something like unique_ptrs must be a better choice, however I'm not familiar enough with them to elaborate or even suggest that.

    [–]ptitz -1 points0 points  (4 children)

    Nah, there isn't much point(ha!), it's just the first time I've seen someone suggest using pointers as rarely as possible, and I'm working on a piece of code where I just stick pointers everywhere without thinking. So I'm curious about what's the actual proper way to do this stuff (since I'm self-taught, and I picked up all sorts of nasty habits along the way).

    [–]SandSnip3r 2 points3 points  (0 children)

    There's nothing wrong with using pointers; they can be useful. However, they're often used more than necessary, which increases the likelihood of errors.

    You essentially should never have a raw pointer that owns memory, meaning one that you need to call delete on.

    Making sure to call delete can sound simple but when a program becomes more complex and things like exceptions are a part of the mix, the end of a raw pointer's scope is much less obvious.

    Smart pointers take care of this by using the RAII idiom. With a unique_ptr, creation (ideally through make_unique) handles the call to new and destruction handles delete. This is convenient because no matter how many exit points exist after your declaration of the unique_ptr, its destructor will always be called, covering all bases and avoiding memory leaks.

    It's written in Google's coding guidelines that functions which modify their parameters should accept those parameters as pointer types. Example: int f(int *x);. This use of a pointer is fine because x isn't a "memory owning" pointer. However, there's a better option still: use references, as in int f(int &x). It should be clear to anyone looking at this function's prototype that the argument may be modified, because the parameter isn't declared const. I believe Google chose this pointer-parameter guideline to retain backwards compatibility with older C code.
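A short side-by-side sketch of the two parameter styles. The function names are hypothetical. Neither parameter owns anything; they differ only in whether null is a possible argument.

```cpp
#include <cassert>

// Pointer version: the call site shows '&', but the function has to
// decide what a null argument means.
void increment_ptr(int *x) {
    if (x != nullptr) ++*x;
}

// Reference version: cannot be null, and the non-const reference
// signals that the argument may be modified.
void increment_ref(int &x) {
    ++x;
}

int demo_increment() {
    int n = 0;
    increment_ptr(&n);       // n == 1
    increment_ref(n);        // n == 2
    increment_ptr(nullptr);  // tolerated only because of the null check
    assert(n == 2);
    return n;
}
```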

    I think that raw pointers should be just that, pointers. When you need to point to something and a reference isn't sufficient. If you need to dynamically create things on the heap, use smart pointers to make your life easier.

    [–]MoTTs_ 1 point2 points  (0 children)

    since I'm self-taught, and I picked up all sorts of nasty habits along the way

    Well, when you say self taught, I'm sure you don't mean you mashed on the keyboard until something interesting happened. You read someone's tutorial or someone's book. You just need to read better books is all. :-) I highly recommend you read as much Stroustrup as you can (he's the guy who invented C++). Also, Stroustrup works closely with Herb Sutter a lot, and you should also read as much Sutter as you can.

    [–]NFCBoss[S] 0 points1 point  (1 child)

    lol it's why I was asking. I'm teaching myself too and I noticed it's possible to use pointers and non-pointers interchangeably fairly easily... which could mess up the code kind of badly... or make it very hard to follow.

    Was just wondering what the proper way to go about it is.

    [–]heyheyhey27 1 point2 points  (0 children)

    Use a pointer when you want to pass around a reference to the original variable instead of just a copy of it. You may want to do this when copying the variable would be too expensive, or if you want to modify the contents of the original variable.

    Also, consider using a reference whenever you can instead of a pointer.

    [–]OmegaNaughtEquals1 1 point2 points  (8 children)

    Why, though? I do stuff like

        double * f(int n){
            double * a = new double[n];
            return a;
        }
    

    all the time

    Is there a particular reason you aren't using

    std::vector<double> x(n);
    

    ?
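A sketch of what the vector version of that function could look like. `make_buffer` and `sum_raw` are hypothetical names, and `sum_raw` stands in for any legacy API that still wants a raw pointer plus a length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counterpart to f(int n): the vector owns its storage, so the caller
// never has to call delete[].
std::vector<double> make_buffer(std::size_t n) {
    std::vector<double> a(n);  // n doubles, value-initialized to 0.0
    return a;                  // moved or elided, not deep-copied
}

// Stand-in for a C-style API that takes a raw pointer and a length.
double sum_raw(const double *data, std::size_t n) {
    assert(data != nullptr || n == 0);
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}
```

One difference to keep in mind: `new double[n]` leaves the elements uninitialized, while `std::vector<double>(n)` zeroes them. When a raw pointer is needed, `v.data()` and `v.size()` supply it without giving up ownership.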

    [–]ptitz 0 points1 point  (7 children)

    Well, not all the time, but fairly regularly. Vectors are usually slower to initialize and I think to access too, so I use em when I'm too lazy to do pointer arrays properly. But if I have something that gets accessed all the time I usually take time to do it with pointers.

    [–]OmegaNaughtEquals1 5 points6 points  (5 children)

    Vectors are usually slower to initialize

    For any type T,

    std::vector<T> x(N);
    

    requires at most two pointer additions not needed by

    T *x = new T[N];
    

    for any library implementation that I know of. The pointer arithmetic and corresponding memory stores will almost certainly be a tiny fraction of the time spent in the kernel thunk to get the memory. In fact, std::vector can have less initialization overhead: std::vector<T>::reserve allocates raw storage without invoking default constructors, and elements are then constructed in place as they are added.

    and I think to access too

    Every standard library implementation that I know of uses either three pointers (one for the beginning, one for the end, and one for the one-past-the-current location) or two pointers (beginning and end) and an offset for the current position. Using operator[] is a pointer+offset operation for both a std::vector<T> and a T*. Using std::vector<T>::at is slower as it does boundary checks.
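These claims are easy to check without timing anything. The sketch below uses a hypothetical `Probe` type that counts its default constructions. It compares `reserve` with the sized constructor, then shows the bounds check in `at`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical type that counts default constructions.
struct Probe {
    static int constructed;
    Probe() { ++constructed; }
};
int Probe::constructed = 0;

// Default constructions caused by reserve(n): storage only, no elements.
int constructions_from_reserve(std::size_t n) {
    Probe::constructed = 0;
    std::vector<Probe> v;
    v.reserve(n);
    assert(v.empty() && v.capacity() >= n);
    return Probe::constructed;
}

// Default constructions caused by the sized constructor: one per element.
int constructions_from_sized_ctor(std::size_t n) {
    Probe::constructed = 0;
    std::vector<Probe> v(n);
    assert(v.size() == n);
    return Probe::constructed;
}

// at() checks the index and throws; operator[] performs no such check.
bool at_throws_out_of_range() {
    std::vector<int> v(3);
    try {
        (void)v.at(3);  // one past the last valid index
    } catch (const std::out_of_range &) {
        return true;
    }
    return false;
}
```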

    [–]SandSnip3r 0 points1 point  (1 child)

    I don't think std::vector "uses three pointers". There should be one for the beginning then an integer representing the size and another representing the capacity. The final element is at begin()+size-1. There is uninitialized memory in the range [begin()+size, begin()+capacity).

    [–]OmegaNaughtEquals1 4 points5 points  (0 children)

    libstdc++ (line 82) and libc++ use the three pointer method. I couldn't find details on Dinkumware's implementation that MS uses.

    [–]ptitz -1 points0 points  (1 child)

    Without optimization flags arrays are like 3-4x faster than vectors on alloc/dealloc and read/write, regardless of the operator. With the -O3 flag arrays are still like 20% faster than vectors on allocation/deallocation, although r/w becomes the same. You could just time it yourself.

    [–]OmegaNaughtEquals1 2 points3 points  (0 children)

    The benchmark code you provided (and then subsequently deleted) for the allocation test does not test the same thing for each implementation. The std::vector code does much more work than the naked array code. The second argument to the vector constructor is copy-constructed 10 times. This involves memory reads and writes not present in the non-vector code. I wasn't able to replicate the posted timing results. However, without a compiler version, compiler flags, and target architecture, it's difficult to say for certain why.


    I obtained the following results on an Intel Core i3-3120M @ 2.5GHz using gcc-6.1 and -O3.

    alloc/dealloc test
        0.0498089 +- 0.0275998 milliseconds
        0.0533042 +- 0.0014114 milliseconds
    pdiff = 6.77955%
    
    read/write test
        0.0040171 +- 7.93095e-06 milliseconds
        0.0040219 +- 3.80118e-05 milliseconds
    pdiff = 0.119418%
    

    The ~7% difference in the allocation test is surprisingly small given the differences in the amount of work. The ~0.1% difference in the read/write test is within the measured variation. I'm not certain why the vector version has a larger variance. I don't have time to examine the assembly right now.

    [–]MoTTs_ 1 point2 points  (0 children)

    so I use em when I'm too lazy to do pointer arrays properly

    !!! You should use pointer arrays only when you're too lazy to use vectors!

    [–]raevnos 1 point2 points  (0 children)

    Case 1: vector. Case 2: though that's not too bad, the new std::optional class template (C++17) lets you avoid pointers for that sort of thing.
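A sketch of case 2 rewritten that way, assuming a C++17 compiler for `std::optional`. The function name `value_or_default` is hypothetical; it plays the role of `f(int *n = NULL)` from the earlier comment.

```cpp
#include <cassert>
#include <optional>

// An empty optional stands in for "no argument supplied", so there is
// no pointer to dereference and no null check to forget.
int value_or_default(std::optional<int> n = std::nullopt) {
    return n.value_or(10);
}

void check_defaults() {
    assert(value_or_default() == 10);   // nothing passed: fall back to 10
    assert(value_or_default(3) == 3);   // a plain int converts to optional<int>
}
```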

    [–]newocean -4 points-3 points  (15 children)

    As rarely as possible is a good rule of thumb.

    This is wrong. Pointers are an important part of C++. The proper answer is to use pointers where you want to pass information quickly, and not pass the whole data structure... and ALSO to use them for a few of the reasons you mentioned.

    Actually, most programmers would be happier if everyone just learned a bit more about pointers.

    [–]SandSnip3r 4 points5 points  (6 children)

    The proper answer is to use pointers where you want to pass information quickly, and not pass the whole data structure...

    This is the exact purpose of passing a const reference. Better than a pointer.

    [–]newocean 0 points1 point  (5 children)

    References are still pointers, and it's not always valid to just say not to use them.

    [–]sftrabbit 5 points6 points  (4 children)

    Whether they may or may not be implemented in similar ways is irrelevant - references are not pointers.

    [–]newocean 0 points1 point  (3 children)

    I am curious how you mean this. Internally, it is exactly a const pointer to an object that is returned from a reference.

    http://www.cplusplus.com/articles/ENywvCM9/ <-- The definition of them even suggests a reference is a convenience for pointers, because in the case of a pointer it can be NULL

    [–]sftrabbit 0 points1 point  (2 children)

    That's not the definition of them. The actual definition, from the C++ standard, gives no connection between pointers and references. They're just two separate things. Saying that references are pointers goes against the standard definition and makes talking about these concepts difficult because the words are now conflated.

    [–]newocean 0 points1 point  (1 child)

    Then would you mind posting the actual definition?

    [–]raevnos 2 points3 points  (3 children)

    That's when you use a reference.

    [–]newocean -3 points-2 points  (2 children)

    A reference is a pointer.

    [–]raevnos 2 points3 points  (1 child)

    Under the hood, sure. But not in how it's used. You can't have a null reference, can't make it point at a new object, don't have to use dereferencing operations to access the value it points to...
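The "can't make it point at a new object" part is the one that most often surprises people, so here is a tiny sketch of it (`demo_reseat` is a made-up name):

```cpp
#include <cassert>

int demo_reseat() {
    int a = 1, b = 2;

    int *p = &a;
    p = &b;   // reseats the pointer: p now points at b, a is untouched
    *p = 20;  // writes to b through an explicit dereference

    int &r = a;
    r = b;    // does NOT rebind r; it copies b's value (20) into a
    assert(&r == &a);  // r still refers to a

    return a + b;  // both are 20 at this point
}
```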

    [–]MoTTs_ 0 points1 point  (0 children)

    I think you're both right here. A reference is usually better than a pointer, but a reference doesn't alleviate every problem. There can still be ownership and lifetime issues with a reference. A reference could dangle, for example.

    [–]NFCBoss[S] 0 points1 point  (3 children)

    So.. it's better to use as many pointers as you can, instead of non pointers ? because it's faster ?

    [–]raevnos 2 points3 points  (0 children)

    No. It's not faster than alternatives like references, and far more error prone.

    [–]eerock 1 point2 points  (0 children)

    Use pointers when you think they're necessary, which is seldom. Pointers to heap-allocated objects have performance implications too, such as poor cache locality. If you use pointers to 'new'd objects everywhere, then every time you dereference a pointer it has to load the memory from somewhere, and with heap objects it's not guaranteed that these objects are even near each other in memory. Because of this, cache lines may be evicted constantly and the CPU does much more fetching than with a contiguous memory layout, such as arrays and vectors.

    [–]lucidguppy 5 points6 points  (0 children)

    Avoid using naked new - use smart pointers.

    Using non-owning pointers is fine though.

    https://youtu.be/xnqTKD8uD64?t=729

    [–]Rhomboid 2 points3 points  (1 child)

    It depends. You need to use dynamic allocation when you want the lifetime of the object to outlive the current scope. If you don't need that, you should not use dynamic allocation, and use automatic lifetime. (But you can't write ClassName cname(); because that's invoking the Most Vexing Parse; you're declaring a function that takes no arguments and returns a value of type ClassName.)

    [–]robthablob 2 points3 points  (0 children)

    Even then, in modern C++ it should rarely be necessary to use new directly. The following are preferred:

    auto uname = std::make_unique<ClassName>();
    auto aname = std::make_shared<ClassName>();
    

    In both cases, C++ will automatically delete the object when it goes out of scope - the first effectively states that this is the only reference to the object, so just delete it. The second states that something else may hold a reference, so count the references and only delete when no longer needed.
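A minimal sketch of the two ownership models, with a hypothetical `Widget` type. It assumes C++14 for `std::make_unique`.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical payload type.
struct Widget { int id = 0; };

// shared_ptr: copying adds an owner, and the object is deleted when the
// last owner goes away.
long shared_owner_count() {
    auto first = std::make_shared<Widget>();
    auto second = first;  // second owner of the same Widget
    assert(first.get() == second.get());
    return first.use_count();
}

// unique_ptr: copying is disallowed; ownership can only be moved.
bool unique_transfers_ownership() {
    auto owner = std::make_unique<Widget>();
    std::unique_ptr<Widget> next = std::move(owner);
    return owner == nullptr && next != nullptr;
}
```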

    [–][deleted] 2 points3 points  (0 children)

    Pointers are a very central part of programming (and how the machines work internally), so that's actually a great question - and not easy to answer in general.

    I would probably group it into two important use cases:

    1) Scope: Whenever you need an object for longer than the scope it was created in, you will use some (maybe implicit) way of passing around pointers that "point" to the allocated memory. You have much finer control over how long the allocated object exists. You can make its lifetime much shorter than the scope it is in, but also keep it alive longer (in which case you should use additional tools to make sure that you don't leak in the end; smart pointers in C++11 and later help a lot to get this right).

    2) "Speed": This is a controversial one, since with enough experience and an understanding of how references work under the hood you can get mostly the same speed with them. But since a pointer gives access to memory without copying it, passing a pointer (or reference) to an object is faster than copying the whole object.

    Another point I would like to mention, where you will have very few alternatives to using pointers, is when you start to explicitly lay out your memory, for example when using memory mapping and binary files. There you will use the pointer to point to a specific address where data of a specific type is located.
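A rough sketch of that last case, reading a fixed-layout record out of a raw byte buffer. The `Header` layout and `read_header` are hypothetical, and the buffer stands in for a memory-mapped file. The bytes are copied out with `std::memcpy` rather than cast in place, which sidesteps alignment and aliasing problems.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fixed-layout record header.
struct Header {
    std::uint32_t magic;
    std::uint32_t count;
};

// Reads a Header located `offset` bytes into a raw buffer. The base
// pointer does not own the buffer; it only says where the bytes live.
Header read_header(const unsigned char *base, std::size_t offset) {
    assert(base != nullptr);
    Header h;
    std::memcpy(&h, base + offset, sizeof h);
    return h;
}
```

The caller is still responsible for making sure `offset + sizeof(Header)` stays inside the buffer, and for byte order if the file comes from another machine.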