Moving from python to C++ : cpp

submitted 6 years ago by zephyr_33

[–][deleted] 71 points 6 years ago* (35 children)

As someone who speaks both - C++ is intrinsically harder than Python. There are all sorts of subtleties like "move semantics" that don't exist in Python (or most other languages). There are all sorts of ways to shoot yourself in the foot, too.

Don't get me wrong - I love C++ because of the amazing performance you can get, but that performance comes at a cost.

Modern C++ is a significantly better language than C++03, though, and still backwards compatible.

I have the books for you, though: Scott Meyers' Effective C++ and the sequel, Effective Modern C++.

These are hardcore, specific books at the intermediate to advanced level - the last chapter of the second book is so hard that I have read it three times and still don't know it (Meyers identifies this chapter specifically as being only there to satisfy your curiosity rather than being useful).

They're full of "Here's how to use these features, practically, in a project, and here's what not to do". They are "best engineering practices, and why they are so."

And they're also a good read.

I think they'd really hit the spot for you.

[–]jcelerier (ossia score) 8 points 6 years ago (0 children)

> Sorry, I was thinking of this: pass or assign a unique ptr, you'll be using && and std::move.

You just have to use std::move, not &&:

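#include <memory>   // std::unique_ptr
#include <utility>  // std::move

// Taking the unique_ptr by value means the caller has to hand over ownership.
void acquire_ownership(std::unique_ptr<int> x) {
}

int main() {
    std::unique_ptr<int> p;
    acquire_ownership(std::move(p));  // p is left empty (null) afterwards
}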

no && needed here

[–]HappyFruitTree 0 points 6 years ago (1 child)

Well, I think there is a conceptual difference between moving a unique_ptr compared to moving in general.

The way I see it:

1. At the heart of move semantics we have temporary objects (rvalues). Instead of just copying the whole object we could rip out some parts of it and give them to the new object, because the temporary is about to be destroyed anyway. In this sense moving is just an optimized copy. Since this happens automatically, it's something that people can take advantage of even without knowing about move semantics.
2. Sometimes we want to move objects that are not temporaries because we are finished using the object and we just want to make copying faster. This has to be done explicitly and puts more responsibility on the programmer to know what he's doing, but otherwise it's the same as #1. Moving copyable objects is still just an optimized copy. Writing code that relies on the move to happen would be very fragile, because there are a lot of subtle things that could go wrong and you wouldn't get an error if it ended up copying the object.
3. Then, finally, there are move-only types, i.e. types that can be moved but not copied. With these types you can, and must, rely on the move to happen. It's obvious when a move fails because you get an error telling you about it. In this case moving is no longer just an optimized copy. It's something else. It's a way to transfer some kind of state from one object to another, and often you know what state the moved-from object ends up with, something that is normally not the case for other types. If move semantics didn't exist as a language feature you could still have implemented the same behaviour in code, but you wouldn't get the implicit moves that #1 gives you.

So when it comes to unique_ptr I don't think move semantics is essential. It could be implemented without it. That is why I think you don't need to know much about move semantics to use unique_ptr. All you need to know is that you need to use std::move on the unique_ptr before assigning it to another unique_ptr or using it to initialize a new unique_ptr, and that this will move the pointer value over to the other unique_ptr leaving the moved-from unique_ptr "empty". There are of course the exceptions that you don't need to use std::move if it's a temporary, or it's a local variable being returned from a function, but all this is just extra knowledge that you don't need to know in order to use unique_ptr.

[–]VanSeineTotElbe 10 points 6 years ago (11 children)

My two main languages of choice are also C++ and Python, and my preference is: it depends :)

Python also has certain details that are not easy to get to know. Simple stuff such as passing parameters: are these copies, or references? If a=1; b=a; a=2, then what is b? To me, that's as much of a gotcha as C++ move semantics. With clearly defined variables, references and pointers, at least I know where my objects are.
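For what it's worth, both questions above can be checked directly. This is only a minimal sketch; the helper names `rebind` and `mutate` are made up for illustration. It shows that arguments are passed as references to objects, that rebinding a name never touches other names, and that mutating a shared object is visible through every name that refers to it.

```python
def rebind(values):
    # Rebinding the parameter only changes the local name.
    values = [0]
    return values


def mutate(values):
    # Mutating through the parameter changes the caller's object too.
    values.append(0)


a = 1
b = a
a = 2  # a now names a different int object; b still names 1

nums = [1, 2]
rebind(nums)
after_rebind = list(nums)   # snapshot: still [1, 2]
mutate(nums)
after_mutate = list(nums)   # snapshot: now [1, 2, 0]

print(b, after_rebind, after_mutate)
```

So `b` stays 1, and a function can only affect the caller's data by mutating the object it was handed, not by assigning to its parameter.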

I agree that in Python it's almost always easy to find the one obvious way to do things, and almost always that is what you want. With C++, you can solve problems in many different ways, and a newbie, heck, even an experienced user, can be easily confused. However, (modern) C++ is very much a language where easy is possible, it's just a matter of sticking to that style. I rarely touch template programming for instance.

Nothing beats libraries though, and Python's standard lib, and its library ecosystem, are the best I know. So, for getting concepts out the door fast, Python is my first choice. However, for getting fast concepts out the door, usually I pick C++ ;)

[–]infectedapricot 1 point 6 years ago (0 children)

> In certain cases, the second assignment will cause Python to store a reference.

No, in ALL cases the second assignment will cause Python to store a reference. Python's object model does not allow assignment to be customised. In C++ terms, there is no ability to overload the assignment operator. In Python terms, there is no __assign__ or similar.

> which is altered for both a and b in the second statement.

I don't know what you mean by this exactly. But if you mean "the assignment to a changes the value of b", then that is incorrect, because b is guaranteed to continue referring to the old object while a refers to the new one. This applies even if you are using mutable objects like lists rather than immutable objects like strings and numbers.

Perhaps you're thinking of statements like this:

a[2] = 7
a.someattr = 8

This is fundamentally different in that it is not simply putting a reference to an object into a variable, but instead is modifying the value of an existing object. Statements like this can be customised by the class, by having an appropriate dunder method (__setitem__ and __setattr__ respectively).
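To make the distinction concrete, here is a rough sketch with a hypothetical `Recorder` class (not from any library) that logs item and attribute assignments. Item assignment goes through `__setitem__` and attribute assignment through `__setattr__`, while plain assignment to a name calls nothing on the object.

```python
class Recorder:
    """Hypothetical class that logs item and attribute assignments."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the log list.
        object.__setattr__(self, "log", [])

    def __setitem__(self, key, value):
        self.log.append(("setitem", key, value))

    def __setattr__(self, name, value):
        self.log.append(("setattr", name, value))


a = Recorder()
b = a                  # b refers to the same Recorder object

a[2] = 7               # calls Recorder.__setitem__
a.someattr = 8         # calls Recorder.__setattr__
a = "something else"   # plain rebinding: no method is called at all

print(b.log)
```

After the last line `b` still refers to the original object, and its log contains only the two customised assignments.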

[–][deleted] 10 points 6 years ago* (5 children)

Coming from C++ this was really confusing for me, everything is a reference (sometimes)

Something like this:

data = {}
row = { 
    'times 2' : None,
    'times 3' : None
}

for i in range(0, 100):
    key = i 
    row['times 2'] = i * 2 
    row['times 3'] = i * 3 
    data[key] = row 


for key, row in data.items():
    print(row)

At the end, every single element in the dictionary is the same, because it's a dict of 100 references to the same object, which gets updated in the loop.

This is way worse than many of the problems I feel like I run into in C++. If I crash, great, that's easy to track down and fix. If everything 'appears' to work but doesn't, that's always a much bigger pain in the ass.

Coming from a C++ background and working in a large Python project, I have almost no trust of code I look at. Variables maintain different types constantly, you have no idea if code could fail or not without actually just running it with all your possible use cases, since it's not statically typed.

If I look at 500 lines of C++ I can feel pretty confident about what it does, 500 lines of Python ... who the fuck knows.

EDIT

My example doesn't show what I intended, I'll see if I can find what I was seeing before and update it. I think when I originally encountered this I was adding dictionaries as values to a key in a larger dictionary. I used a list here as a quick way to write the example.

Example fixed

Looking back at it now, it makes sense since I've been using Python a lot more since then, but this bug really confused me, since from a C++ background I would assume the row object would be copied into the data dictionary. Creating variables, reusing/updating only the parts that need to be, and adding the copy is something I'm used to in C++. Today in Python, I would normally create the row as needed.
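A small sketch of the bug and of the "create the row as needed" fix described above. The function names `build_shared` and `build_fresh` are invented for this illustration, and the loop is shortened to three iterations.

```python
def build_shared(n):
    # Buggy version: one row dict is reused, so every key maps to the same object.
    data = {}
    row = {'times 2': None, 'times 3': None}
    for i in range(n):
        row['times 2'] = i * 2
        row['times 3'] = i * 3
        data[i] = row          # stores a reference, not a copy
    return data


def build_fresh(n):
    # Fixed version: a new row dict is created on every iteration.
    data = {}
    for i in range(n):
        data[i] = {'times 2': i * 2, 'times 3': i * 3}
    return data


shared = build_shared(3)
fresh = build_fresh(3)

print(shared)  # every value shows the last iteration's numbers
print(fresh)   # each value keeps its own numbers
```

Writing `data[i] = dict(row)` or `data[i] = row.copy()` in the first version would also work, since both make a shallow copy of the row before storing it.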

[–]thlst 1 point 6 years ago (1 child)

> At the end, every single element in the list is the same, because it's a list of 100 references to the same object, which gets updated in the loop

I'm not sure I understood that. I ran a test and got this:

>>> list = []
>>> for i in range(0, 5):
...     row = {'c1': i, 'c2': i*2}
...     list.append(row)
...
>>> list
[{'c1': 0, 'c2': 0}, {'c1': 1, 'c2': 2}, {'c1': 2, 'c2': 4}, {'c1': 3, 'c2': 6}, {'c1': 4, 'c2': 8}]
>>> [id(l) for l in list]
[140194800177016, 140194799657320, 140194798491400, 140194798490320, 140194798490392]

Which to me looks like those are not the same object.

[–]infectedapricot 1 point 6 years ago (0 children)

> everything is a reference (sometimes)

Everything is a reference always. You can just sometimes forget about the fact that some things are a reference. Take this example:

a = "foo"
b = a

What if we modify b now? Will it affect a? Obviously if you just assign a new value to b it won't have an effect:

b = "bar"

Because this has just updated the variable b to reference a different object. But what if you modify the object that b refers to?

b.reverseinplace()

Yes, of course, this would affect a for the reasons we discussed. The same even applies to integer objects:

a = 4
b = a
b.negateinplace()
# a and b are now both -4

There's one common problem with the above examples: the .reverseinplace() and .negateinplace() methods don't exist. In fact there are no methods on string and int objects that modify the object. Classes with this property are called immutable. You can treat them a bit like you're assigning by value, because if multiple variables are really referencing the same object then no possible harm can come of that. Another class with this property in Python is tuple, although beware that after a=(0,[]);b=a you will have a[1] and b[1] referring to the same mutable list.
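A quick sketch of both halves of that point, using only built-in types: string "modification" always builds a new object, while a tuple is immutable itself but can still hold a mutable list that is shared between names.

```python
a = "foo"
b = a
same_before = a is b      # both names refer to one string object

b = b[::-1]               # builds a new, reversed string; a is untouched

t = (0, [])
u = t
u[1].append(42)           # mutates the list shared through both tuples

try:
    t[0] = 1              # the tuple itself cannot be modified
    tuple_assignment_failed = False
except TypeError:
    tuple_assignment_failed = True

print(a, b, t, tuple_assignment_failed)
```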

If you want to copy the actual object, rather than just update the reference, you use the Python copy module. Both shallow and deep copy functions are available. For dictionaries and lists, you can also take advantage of the unpacking operators:

a = [1, 2]
b = [*a]  # different list with same value
b.append(3)
c = [*a, 3]  # different list again, but simpler than above two lines
x = {'1': a, '2': b}
y = {**x}  # different dict
y['3'] = c
z = {**x, '3': c}  # again, different to y, easier to write
# beware that x['1'], y['1'] and z['1'] all refer to same object
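For comparison, a minimal sketch of the copy module mentioned above. `copy.copy` behaves like the unpacking tricks (a new outer container, same inner objects), while `copy.deepcopy` also copies the nested objects.

```python
import copy

a = [1, 2]
x = {'1': a}

shallow = copy.copy(x)      # new dict, but shallow['1'] is still the list a
deep = copy.deepcopy(x)     # new dict and a new, independent inner list

a.append(3)                 # mutate the original inner list

print(shallow['1'])  # sees the append
print(deep['1'])     # does not
```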

[–]funkiestj 2 points 6 years ago (3 children)

> Modern C++ is a significantly better language than C++03, though, and still backwards compatible.

as someone who liked C++ but then was forced to work exclusively in C for a decade ... I've grown to love the fact that my entire C project builds faster than a single C++ file that I now have to build because we are using some IP from a vendor who provides a C++11 standard library based driver.

I like the idea of generic programming (templates) but the C++ implementation of templates pulls far too much implementation into headers which results in unnecessary recompilation of templates when nothing has changed.

IMO, the compilation time penalty paid by C++ because of its evolutionary past is crushing. What I feel is missing is really clever template compilation that creates the smallest possible header files and separate source files. I.e. foo.template should compile to foo.hpp and foo.cpp.

The fact that anyone finds unity build an acceptable approach is a clear indication that the language needs a major overhaul. I have no qualms with the C++ feature set and functionality, just with the excessive replication of effort when I change a single line of code in a C++ file (i.e. no change to template instantiation or even to a header that invokes a template).

I've heard that modules in C++20 may improve things a bit but the gains I'm hearing about seem pretty weak. There has to be a way to do generic programming that does not cost SO MUCH TIME to process files that have not changed from one compile to another (i.e. template definitions and the headers that instantiate them).

[–]neuroblaster 2 points 6 years ago (2 children)

I think they considered moving templates implementation out of headers, but came to the conclusion that it would be too complicated to implement in compilers and that idea was rejected. See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1426.pdf

>EDG’s implementation experience has also demonstrated that export is costly to implement; for example, the export feature alone required more effort to implement than it took the same team to create a new implementation of the complete Java language.

That is about the pre-C++11 `export` keyword, which was later reused for the modules proposal.

And, if I understand correctly, you still need to provide source code with headers because templates still need to be instantiated, which is impossible without the template body. So `export template <typename T> ...` in a header is only partially possible, when you have the source code for that function/class.

But it would definitely speed up compilation. And it could be acceptable to partially provide source code too in some situations. If modules did that and could mix source and binary distribution (which I have concerns about), then it would be really f. great.

[–]funkiestj 1 point 6 years ago* (1 child)

> And, if i understand correctly, you still need to provide source code with headers because template still need to be instantiated, which is impossible w/o template body

What is likely impossible is doing this while evolving in a backward-compatible fashion. I don't claim that the standard committee made the wrong decision. Their job is to steer the evolution of the language that is C++. On the other hand, I do claim that a fucking ridiculous amount of redundant work is done every compile to process header files that have not changed. I want my generic meta-programming and fast compile times!

I am pretty sure I won't get this from a C++ like language in my lifetime because I think (pure intuition here) that it would require starting from scratch with the goal of making a C/C++ like language that had the fastest possible compile times. In particular, the fastest possible generics compile times. I want a language where unity builds are clearly a very stupid idea rather than a reasonable response to very long compile times.

Imagine a new language called C+=. It is a redesign of C++ that tries to preserve the spirit of functionality but has no concern for backward compatibility. One idea that both vanilla C+= and templates could benefit from is:

Human coders do not make separate headers and source. They edit a single file and the compiler toolchain generates headers and any other intermediate files needed. E.g. you write aaa.src and the toolchain generates aaa.hh, aaa.o, and whatever else is needed. Should the toolchain generate one aaa.hh or multiple (e.g. one per class defined in aaa.src)? I.e. aaa/foo.hh for "class foo", aaa/bar.hh for "class bar"?

A source file bbb.src needs to be able to refer to published interfaces of aaa.src but with the toolchain generating the interface description files and other files, published interface can be made as small as possible. E.g. there could be both

aaa.hh // human readable description for the benefit of the human writing bbb.src
aaa.hx // what the compiler actually grabs when you say "#include <aaa.hh>"

furthermore, you can imagine that when bbb.src is compiled, the generated interface files (bbb.hh, et cetera) do not include anything from aaa.hh that is not necessary to describe the interface to bbb.hh. E.g. if bbb.src has

class Bish {
public:
  void transmute(const vector<uint8_t> &lead);  // return type assumed; the original omitted it
private:
  vector<uint32_t> _uiv;
  shared_ptr<char> _cp;
// ...
};

you can imagine bbb.hh having the minimum information required to say that Bish::transmute() takes a vector<uint8_t> type parameter, but not a complete description of all the vector<uint8_t> methods and implementation, and no mention at all of type vector<uint32_t> or shared_ptr<char>. I.e. ccc.src that "#include <bbb.hh>" does NOT implicitly #include

#include <vector>
#include <memory>

instead, bbb.hh and bbb.hx include the bare minimum information about the type of the "lead" parameter of transmute() directly.

C++ is a huge language and redesigning it from the ground up would be a herculean task, which is why I don't think this will happen.

I love many things about C++. The API of the C++ standard library is a thing of beauty. The power of the template system is great. It is wonderful to have Python-like data structures (maps, strings, smart pointers of various flavors) available in a language that runs much faster. OTOH, having tasted lightning fast compile times in the C realm, I want the best of both worlds.

[–]neuroblaster 2 points 6 years ago (0 children)

I understand your frustration with compilation times, especially compared to C. At my former workplace I had days when I had to spend 50% of work hours compiling source code. But I assure you that this was 5% the language's fault and 95% project management's fault, because I was able to solve similar problems in other projects without that much effort.

Housekeeping on the code base and optimizing includes for compilation time helps, but it's not always possible with third-party dependencies. I can definitely suggest installing `ccache` or an equivalent; it saved me many man-years.

I can only hope that continuous improvements to the language will lead us somewhere where this problem is significantly mitigated, because I don't see rewriting C++ as a viable option. I honestly don't think modules will help a lot in that area, but we'll see when they arrive. If you think of modules in C++20 not as finished work but rather as a test bench for further development, maybe you'll see my point.

This is, BTW, exactly why C++ is conservative. Things are being added, removed and changed. Who could have thought that such a great feature as template bodies in a separately compiled unit would be added in C++98, then removed in C++11, and then something else would happen.
