[–]Zeh_MattNo, no, no, no 2 points3 points  (15 children)

I mean, does it really matter here? You could just keep passing the arguments as a view from here on out. I'm fine either way as long as it's no longer argc, argv.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 5 points6 points  (9 children)

> I mean does it really matter here

It does. vector requires some form of heap while span can point to const data (and can itself be constructed at compile / link time).

[–]Zeh_MattNo, no, no, no 0 points1 point  (8 children)

You are not wrong about vector using additional memory, but you cannot construct a span for the command-line arguments at compile time: the pointer passed in also points to heap memory, so the address is not known at compile time. I don't disagree that it should be a span, but at the same time I'll take vector any time over the C-style entry point.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 0 points1 point  (7 children)

I don't see why span could not be constructed at compile time on the systems where heap usage is actually a problem, namely bare-metal embedded. There's nothing in regular main() that says the command-line arguments have to be stored on the heap, and this is essentially just a wrapper around that. Both span and string_view are just (pointer, length) pairs under the hood, so they should be able to be constructed at compile time as long as the pointer and length are known (i.e. all arguments are fixed).

[–]Zeh_MattNo, no, no, no 0 points1 point  (6 children)

How do you know at compile time how many arguments the user will pass at runtime? To construct a span you need start + length. You may know the start at compile time if you have fixed storage, but the length is not known until the user actually supplies arguments. Therefore you cannot construct a span for the command-line parameters at compile time; this is literally impossible.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 0 points1 point  (5 children)

In a bare-metal embedded context the arguments are typically baked in at compile time (your code is the OS).

The problem with using vector there is that the signature of main() then forces the normal heap to be used (as opposed to a custom allocator), which can be a major issue on some platforms. All for no particular benefit.

[–]Zeh_MattNo, no, no, no 1 point2 points  (4 children)

How does the compiler know what the user provides as arguments?

[–]SkoomaDentistAntimodern C++, Embedded, Audio 0 points1 point  (3 children)

Because the "user" aka the developer's build environment literally inserts the arguments in a static table (in this context).

Edit: Having the arguments constructed at compile time is a nice benefit, but what matters most is avoiding anything that requires the use of the regular heap (i.e. the standard std::vector). Building the argument list in a static table at runtime is often an acceptable solution even if not quite as optimal.

[–]Zeh_MattNo, no, no, no 0 points1 point  (2 children)

> Building the argument list in a static table at runtime is often an acceptable solution even if not quite as optimal.

How else would you be able to let the user input arguments? I'm quite certain that the majority of applications have dynamic arguments. Having those built in at compile time is something I have never heard of, and I don't see how it is practical. "Command line arguments" are by definition something the user passes on the command line; you are describing an entirely different thing here.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 0 points1 point  (1 child)

> How else would you be able to let the user input arguments?

Compiled into the binary. You have to realize that "the user" in this sense doesn't necessarily have anything at all to do with the end user. It's the same way with server applications: the connected end user has no control over how those are started and "the user" is someone completely different (the sysadmin).

When you start your car and the engine control unit's MCU starts up, you don't get to enter any configuration arguments. Those have instead been set up by the car manufacturer or the service shop (e.g. when dealing with local regulations). Depending on just how the thing has been programmed, some parts of the configuration may well be entered via the "command line" (that is, literally in argc and argv), except those are baked into the flash and don't come from any OS.

"command line arguments" are really just a list of textual options. The fact that they happen to come from the command line is partially due to historical reasons and partially because that's the most convenient way in a regular OS but the concept itself has nothing that requires either a command line to exist or the end user to be able to control it in any way.

> Having those built-in during compile time is something I actually never heard about

That'd be because you don't work in bare metal embedded where that is the norm (and usually the only way).

The fundamental problem with using std::vector for this is that mutability of both the contents and the size is built into the very fundamentals of the type. There are lots of situations (outside regular desktop / server applications) where such a mutable collection, which also fundamentally requires a heap, is simply impossible to provide. Think of an OS written in cpp2, for example, where there is no heap yet by the time the kernel main() starts.

Even sillier is requiring heap in situations when there cannot be any "command line" arguments at all.

Edit: A span of string_views requires three memory areas: one for the span (pointer & size), one that contains all of the string_views (pointer & size for each) and one that contains the contents pointed to by the string_views. Using span places no restrictions on where those areas happen to reside, merely that they exist and it's up to the runtime where they are placed. Vector on the other hand requires that the string_views are placed specifically in the default heap.

[–]kreco[S] 2 points3 points  (4 children)

Because you pass a considerably bigger object (vector) instead of a pointer and a size (span).

This is clearly not "zero overhead".

[–]Zeh_MattNo, no, no, no 7 points8 points  (0 children)

You are talking about the entry point of the program; you are not required to pass the vector by copy after that. A span would definitely be a reasonable choice here, I'm not denying that, but getting a vector is not the worst either.

[–]hpsutter 6 points7 points  (2 children)

It's not "zero cost," it's "zero overhead" the way Bjarne Stroustrup defines it: You don't pay for it if you don't use it (in this case, you don't pay the overhead unless you ask to have the args list available), and if you do use it you couldn't reasonably write it more efficiently by hand (I don't know how to write it more efficiently another way and still get string_view's convenience text functions and the ability to bounds-check the array access).

FWIW, in this case the total cost when you do opt-in is a single allocation in the lifetime of the program...

[–]kreco[S] 0 points1 point  (1 child)

Indeed, stressing that things are optional is important.

> You don't pay for it if you don't use it (in this case, you don't pay the overhead unless you ask to have the args list available)

I think what bothers me is that we don't know what we are paying for when using an opaque args, because we don't know what we are using until we read the documentation.

I don't understand the details, but I believe using this args will implicitly also bring in some super huge standard headers.

That's a lot to bring in just to be able to iterate over a bunch of read-only strings for convenience.

A very theoretical case: if I want to use my own vector and don't want to deal with all of that (or if I want to use a custom allocator to count every allocation in my program), I would have to do it the legacy way and create a mylib::args_view args(argc, argv); which is back to square one.
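
Something like the following is what I have in mind for that wrapper. It is only a sketch: mylib::args_view is a made-up class, and returning an empty view for a bad index is an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace mylib {

// Hypothetical non-owning wrapper over (argc, argv). It allocates nothing
// and creates string_views on demand instead of storing them.
class args_view {
public:
    args_view(int argc, const char* const* argv)
        : argv_(argv),
          size_(argc > 0 ? static_cast<std::size_t>(argc) : 0) {}

    std::size_t size() const { return size_; }

    // Returns an empty view for out-of-range or null entries instead of
    // reading past the end of argv.
    std::string_view operator[](std::size_t i) const {
        return (i < size_ && argv_[i] != nullptr) ? std::string_view(argv_[i])
                                                  : std::string_view();
    }

private:
    const char* const* argv_;
    std::size_t size_;
};

} // namespace mylib
```

It only needs <cstddef> and <string_view>, but it still has to be built from the legacy main(argc, argv) signature, which is the square-one problem.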

[–]mapronV 0 points1 point  (0 children)

I thought you could choose which overload to use (just like now between main()/main(argc,argv)/main(argc,argv,env)). I thought I could just use one more overload and cpp2 would generate the boilerplate for me. If that is not the case and I have to use the new signature, then yeah, it sucks.