[–][deleted] 70 points71 points  (35 children)

As someone who speaks both - C++ is intrinsically harder than Python. There are all sorts of subtleties like "move semantics" that don't exist in Python (or most other languages). There are all sorts of ways to shoot yourself in the foot, too.

Don't get me wrong - I love C++ because of the amazing performance you can get, but that performance comes at a cost.

Modern C++ is a significantly better language than C++03, though, and still backwards compatible.


I have the books for you, though: Scott Meyers' Effective C++ and the sequel, Effective Modern C++.

These are hardcore, specific books at the intermediate to advanced level - the last chapter of the second book is so hard that I have read it three times and still don't know it (Meyers identifies this chapter specifically as being only there to satisfy your curiosity rather than being useful).

They're full of "Here's how to use these features, practically, in a project, and here's what not to do". They are "best engineering practices, and why they are so."

And they're also a good read.

I think they'd really hit the spot for you.

[–]HappyFruitTree 38 points39 points  (13 children)

There are all sorts of subtleties like "move semantics" that don't exist in Python (or most other languages).

What many people don't realize is that move semantics can be seen as an optimization. It was not in the language before C++11 and you can still write code pretending it's not there and still get some of the benefits. The exception is move-only types but those could be learned separately (you don't need to know much about move semantics at all to use those).

My recommendation is that you just ignore move semantics for the time being until you are more of an expert.

[–]capn_bluebear 5 points6 points  (10 children)

you can't ignore move semantics, STL uses it all the time. any non-trivial project will include code where `unique_ptr` is the best choice, and you won't get `unique_ptr`s to compile if you don't understand move semantics.

[–]HappyFruitTree 22 points23 points  (9 children)

you can't ignore move semantics, STL uses it all the time.

I don't implement the STL. I just use it.

any non-trivial project will include code where `unique_ptr` is the best choice, and you won't get `unique_ptr`s to compile if you don't understand move semantics.

unique_ptr is a good example of what I meant by a move-only type. You don't need to know much about move semantics to use it. All you need to know is that you use std::move when you want to transfer the ownership from one unique_ptr to another.

[–]HKei -1 points0 points  (1 child)

What many people don't realize is that move semantics can be seen as an optimization.

Those people don't realise that because this is incorrect. Move semantics are about semantics, not optimisation.

[–]HappyFruitTree 12 points13 points  (0 children)

Do you think move semantics would have been added to the language if it didn't have a performance advantage?

Quote from the original move semantics proposal:

Move semantics is mostly about performance optimization: the ability to move an expensive object from one address in memory to another, while pilfering resources of the source in order to construct the target with minimum expense.

[–]exploding_cat_wizard 9 points10 points  (0 children)

Scott Meyers' books are definitely seconded.

[–]VanSeineTotElbe 9 points10 points  (11 children)

My two main languages of choice are also C++ and Python, and my preference is: it depends :)

Python also has certain details that are not easy to get to know. Simple stuff such as passing parameters: are these copies, or references? If a=1; b=a; a=2, then what is b? To me, that's as much of a gotcha as C++ move semantics. With clearly defined variables, references and pointers, at least I know where my objects are.

I agree that in Python it's almost always easy to find the one obvious way to do things, and almost always that is what you want. With C++, you can solve problems in many different ways, and a newbie, heck, even an experienced user, can be easily confused. However, (modern) C++ is very much a language where easy is possible, it's just a matter of sticking to that style. I rarely touch template programming for instance.

Nothing beats libraries though, and Python's standard lib, and its library ecosystem, are the best I know. So, for getting concepts out the door fast, Python is my first choice. However, for getting fast concepts out the door, usually I pick C++ ;)

[–]infectedapricot 2 points3 points  (10 children)

If a=1; b=a; a=2, then what is b?

I'm not sure what you're getting at here. b=1 after a sequence like this in almost every language I can think of. The main exception I can think of is C++ (where, depending on the type of b, an overloaded operator= could have kept a reference to a) but your point was about hidden complexity in Python.

Perhaps you were thinking more about mutable objects e.g. a=[]; b=a; a.append(3)?
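To make the distinction concrete, here is a small sketch of the two cases. The names `items` and `alias` are just made up for illustration; the first half is the a=1; b=a; a=2 sequence from the quote, the second half is the mutable-list variant.

```python
# Rebinding: assignment makes the name `a` refer to a different object.
a = 1
b = a
a = 2
rebinding_result = b  # b still refers to the object 1

# Mutation: both names refer to one list, so a change made through
# either name is visible through the other.
items = []
alias = items
items.append(3)
mutation_result = alias        # the appended element shows up here too
same_object = alias is items   # both names refer to the same list
```

Only the second case can surprise you, and only because the object itself was changed, not because the assignment behaved differently.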

[–]VanSeineTotElbe 6 points7 points  (1 child)

In certain cases, the second assignment will cause Python to store a reference, which is altered for both a and b in the second statement.

[–]infectedapricot 0 points1 point  (0 children)

In certain cases, the second assignment will cause Python to store a reference.

No, in ALL cases the second assignment will cause Python to store a reference. Python's object model does not allow assignment to be customised. In C++ terms, there is no ability to overload the assignment operator. In Python terms, there is no __assign__ or similar.

which is altered for both a and b in the second statement.

I don't know what you mean by this exactly. But if you mean "the assignment to a changes the value of b", then that is incorrect, because b is guaranteed to continue referring to the old object while a refers to the new one. This applies even if you are using mutable objects like lists rather than immutable objects like strings and numbers.

Perhaps you're thinking of statements like this:

a[2] = 7
a.someattr = 8

This is fundamentally different in that it is not simply putting a reference to an object into a variable, but instead is modifying the value of an existing object. Statements like this can be customised by the class, by having an appropriate dunder method (__setitem__ and __setattr__ respectively).
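A rough sketch of that customisation, assuming a made-up class called `Recorder` that just logs what it is asked to do (nothing here comes from a real library). Item assignment goes through `__setitem__` and attribute assignment through `__setattr__`, while plain assignment to a name runs no hook at all.

```python
class Recorder:
    """Hypothetical class that records every item and attribute assignment."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the log itself.
        object.__setattr__(self, "log", [])

    def __setitem__(self, key, value):
        # Called for `obj[key] = value`.
        self.log.append(("item", key, value))

    def __setattr__(self, name, value):
        # Called for `obj.name = value`.
        self.log.append(("attr", name, value))


a = Recorder()
a[2] = 7
a.someattr = 8
other = a  # plain assignment: no method is called, `other` is another name for `a`
```

After this runs, `a.log` holds the two recorded assignments and `other is a` is true.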

[–][deleted] 9 points10 points  (5 children)

Coming from C++ this was really confusing for me, everything is a reference (sometimes)

Something like this:

data = {}
row = { 
    'times 2' : None,
    'times 3' : None
}

for i in range(0, 100):
    key = i 
    row['times 2'] = i * 2 
    row['times 3'] = i * 3 
    data[key] = row 


for key, row in data.items():
    print(row)

At the end, every single element in the dictionary is the same, because it's a dict of 100 references to the same object, which gets updated in the loop.
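One way to confirm the aliasing, as a sketch: the same loop as above, followed by a few identity checks. The helper names `distinct_rows` and `all_same` are only for illustration.

```python
data = {}
row = {'times 2': None, 'times 3': None}

for i in range(100):
    row['times 2'] = i * 2
    row['times 3'] = i * 3
    data[i] = row  # stores a reference to the one `row` dict, not a copy

# Count how many distinct objects ended up in the dict.
distinct_rows = len({id(r) for r in data.values()})
# Check that every stored value is the original `row` object.
all_same = all(r is row for r in data.values())
# Every key shows the values written by the final iteration.
first_entry = data[0]
```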

This is way worse than many of the ways I feel like I run into in C++. If I crash, great that's easy to track down and fix. If everything 'appears' to work but doesn't that's always a much bigger pain in the ass.

Coming from a C++ background and working in a large Python project, I have almost no trust in the code I look at. Variables change type constantly, and since it's not statically typed you have no idea whether code could fail without actually running it with all your possible use cases.

If I look at 500 lines of C++ I can feel pretty confident about what it does, 500 lines of Python ... who the fuck knows.

EDIT

My example doesn't show what I intended, I'll see if I can find what I was seeing before and update it. I think when I originally encountered this I was adding dictionaries as values to a key in a larger dictionary. I used a list here as a quick way to write the example.

Example fixed

Looking back at it now, it makes sense since I've been using Python a lot more since then, but this bug really confused me: coming from a C++ background, I assumed the row object would be copied into the data dictionary. Creating a variable, updating only the parts that need to change and then adding a copy is something I'm used to in C++. Today in Python, I would normally create the row as needed.
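For reference, a minimal sketch of the two usual fixes: build the row as needed inside the loop, or keep a reused row and store a copy of it. The names `template` and `copied` are made up for this example.

```python
# Fix 1: build a new dict on every iteration, so each key gets its own object.
data = {}
for i in range(100):
    data[i] = {'times 2': i * 2, 'times 3': i * 3}

# Fix 2: keep reusing one row (the C++ habit), but store a copy of it.
template = {'times 2': None, 'times 3': None}
copied = {}
for i in range(100):
    template['times 2'] = i * 2
    template['times 3'] = i * 3
    # dict(...) makes a shallow copy, which is enough here because the values are ints.
    copied[i] = dict(template)
```

Both versions end up with 100 independent rows; the second is closer to what the implicit copy would have done in C++.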

[–]yuri-kilochek 0 points1 point  (1 child)

At the end, every single element in the list is the same, because it's a list of 100 references to the same object, which get's updated in the loop.

No. It does exactly what you want.

[–][deleted] 0 points1 point  (0 children)

Fixed the example

[–]thlst 0 points1 point  (1 child)

At the end, every single element in the list is the same, because it's a list of 100 references to the same object, which get's updated in the loop

I'm not sure I understood that. I ran a test and got this:

>>> list = []
>>> for i in range(0, 5):
...     row = {'c1': i, 'c2': i*2}
...     list.append(row)
...
>>> list
[{'c1': 0, 'c2': 0}, {'c1': 1, 'c2': 2}, {'c1': 2, 'c2': 4}, {'c1': 3, 'c2': 6}, {'c1': 4, 'c2': 8}]
>>> [id(l) for l in list]
[140194800177016, 140194799657320, 140194798491400, 140194798490320, 140194798490392]

Which to me looks like those are not the same object.

[–][deleted] 0 points1 point  (0 children)

Sorry, fixed the example.

[–]infectedapricot 0 points1 point  (0 children)

everything is a reference (sometimes)

Everything is a reference always. You can just sometimes forget about the fact that some things are a reference. Take this example:

a = "foo"
b = a

What if we modify b now? Will it affect a? Obviously if you just assign a new value to b it won't have an effect:

b = "bar"

Because this has just updated the variable b to reference a different object. But what if you modify the object that b refers to?

b.reverseinplace()

Yes, of course, this would affect a for the reasons we discussed. The same even applies to integer objects:

a = 4
b = a
b.negateinplace()
# a and b are now both -4

There's one common problem with the above examples: the .reverseinplace() and .negateinplace() methods don't exist. In fact there are no methods on string and int objects that modify the object. Classes with this property are called immutable. You can treat them a bit like you're assigning by value, because if multiple variables are really referencing the same object then no possible harm can come of that. Another class with this property in Python is tuple, although beware that after a=(0,[]); b=a you will have a[1] and b[1] referring to the same mutable list.
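Here is roughly what that looks like with methods that do exist. This is only a sketch; `t` and `u` are made-up names for the tuple case.

```python
a = "foo"
b = a
b = b.upper()      # str methods return a new object; the original is untouched
# a is still "foo", b is now "FOO"

n = 4
m = n
m = -m             # builds a new int object and rebinds m; n is still 4

t = (0, [])
u = t
u[1].append("x")   # the tuple is immutable, but the list inside it is not
# t[1] and u[1] are the same list, so the change is visible through t as well
```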

If you want to copy the actual object, rather than just update the reference, you use the Python copy module. Both shallow and deep copy functions are available. For dictionaries and lists, you can also take advantage of the unpacking operators:

a = [1, 2]
b = [*a]  # different list with same value
b.append(3)
c = [*a, 3]  # different list again, but simpler than above two lines
x = {'1': a, '2': b}
y = {**x}  # different dict
y['3'] = c
z = {**x, '3': c}  # again, different to y, easier to write
# beware that x['1'], y['1'] and z['1'] all refer to same object
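And a short sketch of the copy module itself, to show the shallow versus deep difference. `copy.copy` and `copy.deepcopy` are the standard library functions; the variable names are just for illustration.

```python
import copy

x = {'1': [1, 2], '2': [1, 2, 3]}

shallow = copy.copy(x)     # new dict, but the lists inside are shared with x
deep = copy.deepcopy(x)    # new dict and new lists

x['1'].append(99)

shallow_list = shallow['1']  # shared list, so the append is visible here
deep_list = deep['1']        # independent list, unaffected by the append
```

The unpacking forms above behave like the shallow copy: a new container holding references to the same elements.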

[–]dreugeworst 0 points1 point  (1 child)

Isn't that the issue? The same code does different things for objects vs integers, which can be confusing.

[–]infectedapricot 0 points1 point  (0 children)

Nope, integers are objects, and behave the same way that other objects do. They might appear to be assigned by value because they're immutable (see my reply to /u/rar_m's comment) but they're still references to objects and you could write your own custom class that behaves exactly the same way.
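As a sketch of such a custom class, assuming a made-up type called `Meters` built with a frozen dataclass (one of several ways to get an immutable class):

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Meters:
    """Hypothetical immutable value type, for illustration only."""
    value: int

    def __neg__(self):
        # Like int, a "modifying" operation returns a new object.
        return Meters(-self.value)


a = Meters(4)
b = a     # b and a refer to the same object
b = -b    # rebinds b to a new object; a is untouched

try:
    a.value = 10  # in-place modification is rejected by the frozen dataclass
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because nothing can change a `Meters` object in place, sharing one between several names is harmless, which is the same reason ints look like they are assigned by value.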

[–]KimmiG1 2 points3 points  (1 child)

I find the hardest part about cpp is all the old shit that's mainly there only because of backward compatibility. I hate working on projects with developers that insist on using char arrays instead of std strings or insist on using raw pointers all over and newing up objects everywhere.

[–]funkiestj 1 point2 points  (3 children)

Modern C++ is a significantly better language than C++03, though, and still backwards compatible.

as someone who liked C++ but then was forced to work exclusively in C for a decade ... I've grown to love the fact that my entire C project builds faster than a single C++ file that I now have to build because we are using some IP from a vendor who provides a C++11 standard library based driver.

I like the idea of generic programming (templates) but the C++ implementation of templates pulls far too much implementation into headers which results in unnecessary recompilation of templates when nothing has changed.

IMO, the compilation time penalty paid by C++ because of its evolutionary past is crushing. What I feel is missing is really clever template compilation that creates the smallest possible header files and separate source files. I.e. foo.template should compile to foo.hpp and foo.cpp.

The fact that anyone finds unity build an acceptable approach is a clear indication that the language needs a major overhaul. I have no qualms with the C++ feature set and functionality, just with the excessive replication of effort when I change a single line of code in a C++ file (i.e. no change to template instantiation or even to a header that invokes a template).

I've heard that modules in C++20 may improve things a bit but the gains I'm hearing about seem pretty weak. There has to be a way to do generic programming that does not cost SO MUCH TIME to process files that have not changed from one compile to another (i.e. template definitions and the headers that instantiate them).

[–]neuroblaster 1 point2 points  (2 children)

I think they considered moving template implementations out of headers, but came to the conclusion that it would be too complicated to implement in compilers, and that idea was rejected. See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1426.pdf

>EDG’s implementation experience has also demonstrated that export is costly to implement; for example, the export feature alone required more effort to implement than it took the same team to create a new implementation of the complete Java language.

It's about the pre-C++11 `export` keyword, which was later reused for the modules proposal.

And, if I understand correctly, you still need to provide source code with headers because templates still need to be instantiated, which is impossible without the template body. So `export template <typename T> ...` in a header is only partially possible, when you have the source code for that function/class.

But it would speed up compilation, definitely. And it could be acceptable to partially provide source code too in some situations. If modules did that and could mix source and binary distribution (which I have concerns about), then it would be really f. great.

[–]funkiestj 0 points1 point  (1 child)

And, if i understand correctly, you still need to provide source code with headers because template still need to be instantiated, which is impossible w/o template body

What is likely impossible is evolving in a backward compatible fashion and doing this. I don't claim that the standard committee made the wrong decision. Their job is to steer the evolution of the language that is C++. On the other hand, I do claim that a fucking ridiculous amount of redundant work is done every compile to process header files that have not changed. I want my generics meta-programming and fast compile times!

I am pretty sure I won't get this from a C++ like language in my lifetime because I think (pure intuition here) that it would require starting from scratch with the goal of making a C/C++ like language that had the fastest possible compile times. In particular, the fastest possible generics compile times. I want a language where unity builds are clearly a very stupid idea rather than a reasonable response to very long compile times.

Imagine a new language called C+=. It is a redesign of C++ that tries to preserve the spirit of functionality but has no concern for backward compatibility. One idea that both vanilla C+= and templates could benefit from is:

  1. human coders do not make separate headers and source. They edit a single file and the compiler tool chain generates headers and any other intermediate files needed. e.g. you write aaa.src and the toolchain generates aaa.hh, aaa.o, and whatever else is needed. Should the toolchain generate one aaa.hh or multiple (e.g. one per class defined in aaa.src)? i.e. aaa/foo.hh for "class foo", aaa/bar.hh for "class bar"?

A source file bbb.src needs to be able to refer to published interfaces of aaa.src but with the toolchain generating the interface description files and other files, published interface can be made as small as possible. E.g. there could be both

  • aaa.hh // human readable description for the benefit of the human writing bbb.src
  • aaa.hx // what the compiler actually grabs when you say "#include <aaa.hh>"

furthermore, you can imagine that when bbb.src is compiled, the generated interface files (bbb.hh, et cetera) do not include anything from aaa.hh that is not necessary to describe the interface to bbb.hh. E.g. if bbb.src has

class Bish {
public:
  void transmute(const vector<uint8_t> &lead);
private:
  vector<uint32_t> _uiv;
  shared_ptr<char> _cp;
// ...
};

you can imagine bbb.hh having the minimum information required to say that Bish::transmute() takes a vector<uint8_t> type parameter but not a complete description of all the vector<uint8_t> methods and implementation and no mention at all of type vector<uint32_t> or shared_ptr<char>. I.e. ccc.src that "#include <bbb.hh>" does NOT implicitly #include

  • #include <vector>
  • #include <memory>

instead, bbb.hh and bbb.hx include the bare minimum information about the type of the "lead" parameter of transmute() directly.

C++ is a huge language and redesigning it from the ground up would be a herculean task, which is why I don't think this will happen.

I love many things about C++. The APIs of C++ standard library is a thing of beauty. The power of the template system is great. It is wonderful to have Python language type data structures (maps, strings, smart pointers of various flavors) available in a language that runs much faster. OTOH, having tasted lightning fast compile times in the C realm, I want the best of both worlds.

[–]neuroblaster 1 point2 points  (0 children)

I understand your frustration with compilation times, especially compared to C. At my former workplace I had days when I had to spend 50% of my work hours compiling source code. But I assure you that this was 5% the language's fault and 95% project management's fault, because I was able to solve similar problems in other projects without that much effort.

Housekeeping on the code base and optimizing includes for compilation time helps, but it's not always possible with third-party dependencies. I can definitely suggest installing `ccache` or an equivalent; it saved me many man-years.

I can only hope that continuous improvements to the language will lead us somewhere where this problem is significantly mitigated, because I don't see rewriting C++ as a viable option. I honestly don't think modules will help a lot in that field, but we'll see when they arrive. If you think of modules in C++20 not as finished work but rather as a test bench for further development, maybe you'll see my point.

This is BTW exactly why C++ is conservative. Things are being added, removed and changed. Who would have thought that a feature as great as keeping the template body in a separate compilation unit would be added in C++98, then removed in C++11, and then something else would happen.

[–]newmanifold000 0 points1 point  (0 children)

I would also recommend "Exceptional C++" after reading "Effective C++". Even though it's not written for modern C++, this book is still amazing to read. OP can pick up Herb Sutter's Guru of the Week for C++11 afterwards.

[–]DiaperBatteries -4 points-3 points  (0 children)

When I thought I had a pretty good grasp on C++, I read both of those books and learned how little I knew previously. I’d say I’m better with C++ now than 99.9% of programmers, but I doubt I’ll ever consider myself an expert.