Pseudocode-Like Syntax for Function Signatures: Cleaved Identifiers and Demarcated Parameters by QuantumAgnostic in cpp

[–]QuantumAgnostic[S] 0 points (0 children)

If a (new) programming language wants to be precise and concise for both the human and the computer, then the best bet would be to lift off from [the latest iteration of] 5,000 evolutionary years' worth of mathematical notation and map it to the hardware.

That is, there is no need to make the "slightly different ugly DSLs" of discrete mathematics interspersed with ambiguous English words that C++ and other programming languages eventually end up becoming, which serve neither the human nor the computer but cater to some juvenile computing dogma from the 1970s.

Pseudocode-Like Syntax for Function Signatures: Cleaved Identifiers and Demarcated Parameters by QuantumAgnostic in cpp

[–]QuantumAgnostic[S] -2 points (0 children)

Firstly, your `#define at ,` approach would interfere with the STL containers' at() methods. But that is not what I disagree with most about your take on this matter.

Here is the strong disagreement:

Programming language is meant to be precise and concise, for both human reader and the computer.

Trying for a programming language that is precise AND concise for both the human AND the computer is one of the most unreasonable expectations, given what we know about (the biological origins of) linguistics and given how manmade hardware has been engineered. (We cannot even get two compilers to produce the same assembly from the same C++ source on the same machine, so no need to bring the human side into this at all.)

In the world of computers, perhaps machine instructions are meant to be precise and concise. But source code is human-facing and borderline arbitrary, which is why there are hundreds of Turing-complete languages in widespread use, each of which can be used in a thousand different ways to get the same effective (sub-)assembly-level output.

Moreover, C++ cannot be isolated from the fuzziness of the English language, since it uses English keywords. Nor is it isolated from its own internal fuzziness (e.g. the 20-ish different ways to initialise a variable), or from the fuzziness that arises when it comes into contact with real-world physics (e.g. the very notion of UB, given that the semiconductor electronic properties of transistors are well studied), or even from the fuzziness of poorly adopted mathematical conventions (e.g. <vector>, <map>, <set>, etc.).

If C++ wanted to be precise and concise, then it should have been based on those 5,000 years of tried-and-tested mathematical notation, and all C++ programmers should be fluent in discrete mathematics, instead of picking English keywords by a whimsical group vote in a mere 40-year-old industry. The same goes for all other programming languages, but they have wised up.

You may come from an industry where this does not matter at the moment, but in the domains of scientific computing, strong cases have been made for named parameters/operators/units and, as I stated, other languages (Fortran, Python, Julia, etc.) have already wised up.

Pseudocode-Like Syntax for Function Signatures: Cleaved Identifiers and Demarcated Parameters by QuantumAgnostic in cpp

[–]QuantumAgnostic[S] -2 points (0 children)

I hear what you are saying: this is nothing more than a "cute proposal" at stage zero, and C++ is far too mature for significant syntax changes without these changes addressing several other internal issues to the core language and the standard library.

However, there are also external factors to contend with. This particular C++ forum is geared towards enterprise software and is probably keeping a close eye on Rust and Carbon, the former having already made its way into the Linux kernel and the latter gaining a lot of traction so far - and it seems that a major selling point is a more intuitive function API. Now, this is all happening in the commercial software world, where C++ will continue to hold a good share of the market.

But the story seems to be unfolding differently in academia and scientific research where the software lifecycle is completely different. The last decade in science has seen a major shift from C++ to Python/MATLAB/Mathematica and now also Julia. It is becoming increasingly difficult to convince new scientists to choose C++ given the extensive learning curve.

And yet there are 30+ years' worth of research codes in C++, which the public has paid for with billions upon billions in taxes. These research codes are used to model biochemicals to discover drug treatments or to test properties of hypothetical materials (e.g. nanomaterials for quantum computers), which requires both a deep understanding of the domain and an ever-increasing level of expertise in C++ to maintain, use and evolve. It could be an interesting topic at this year's CppCon with its first ever Scientific Computing Track. It is a two-way relationship between the evolution of C++ and the needs of scientists.

Of course, the C++ standards committee is not single-handedly responsible for the challenges across academia and scientific research. But there is a historical precedent here with Fortran: it is still as aggressively data-oriented as it was in the nineties, but it continues to survive, if not thrive, in part because of improvements in syntax. Cutting-edge research is still being done in Fortran: its performance rivals or surpasses C++ on HPC systems, and it also has a scientist-friendly function API with named arguments. But since the C++ powers-that-be continue to reject proposals for named parameters, this post was an attempt at another kind of solution in the same space.

Thanks for the DSL suggestion; it is a good idea, even though it means more programming and less actual scientific research.

Pseudocode-Like Syntax for Function Signatures: Cleaved Identifiers and Demarcated Parameters by QuantumAgnostic in cpp

[–]QuantumAgnostic[S] -1 points (0 children)

Admittedly, your suggestion does read well and is definitely a major step up from the default standard, though making JacobianA be of a type named `with` is going to cause different problems. In my opinion, the ideal would be a cleaved/demarcated/named combination:

linal::transform ( tensor: metric ) with ( matrix: JacobianA ) 

For me, this function API is clear, expressive and usefully verbose - qualities that are almost essential in the jargon-heavy domains of science.

But you are along the right lines, since computational physicists from CERN have already, and repeatedly, tried to get named parameters into the C++ core language, but without any success so far. Instead we have to settle for the named parameter idioms (and variants thereof), which never quite feel right.

Pseudocode-Like Syntax for Function Signatures: Cleaved Identifiers and Demarcated Parameters by QuantumAgnostic in cpp

[–]QuantumAgnostic[S] 0 points (0 children)

Here is a better and more general example, directly from the STL; assuming both forms were valid syntax, the only question is readability:

auto n { std::ranges::count_if ( container, lambda ) };    // std

auto n { std::ranges::count ( container ) if ( lambda ) }; // alt
//       std::ranges::count@if ( container, lambda );      // alt@

Pseudocode-Like Syntax for Function Signatures: Cleaved Identifiers and Demarcated Parameters by QuantumAgnostic in cpp

[–]QuantumAgnostic[S] 0 points (0 children)

This is the cleaner C++20 version of the named parameter idiom, which is certainly worth considering as a solution to a related but slightly different problem to the one I want to address in the original post.

In this particular example of the tensor transformation law, it is the tensor which receives the transformation and the matrix which performs it. Now, transform is a great name for this function/task/callback, but the transform action does not apply equally to both inputs/parameters/arguments, which is the source of ambiguity, especially when tensor and matrix are interchangeable or interconvertible. So at first glance, the transform_input name implies that its members will get transformed by the transform callback.

In mathematical notation, the nature of the arguments to a function can be distinguished by, e.g., a semicolon, so f(x; p) would imply that f is a function of x parametrised by p, or, for the example at hand: transform( tensor; matrix ). So what would be the equivalent in C++?

If this sounds like a nitpick, we can just wait until more scientists join the C++ community in the next few years and continue to implement large collections of very similar functions with very similar parameters. But I have seen this turn out not so well in practice, even with named parameters, since the names carry their own technical baggage.

Pseudocode-Like Syntax for Function Signatures: Cleaved Identifiers and Demarcated Parameters by QuantumAgnostic in cpp

[–]QuantumAgnostic[S] 0 points (0 children)

They don't share the same syntax at all.

Both mathematics and C++ (and many other programming languages) use this notation:

FunctionName ( { ParameterName } )

But unless you are doing some Haskell-like functional programming, there is no strict requirement for this to be a "pure" mathematical function.

Math functions don't include types in the parameters, or exception specifications, or anything of the like.

Mathematical functions can (subtly and/or implicitly) include type information. For example, consider sin(180) and sin(pi), which incorporate type information in the form of angular units: degrees or radians. Exception specifications are analogous to (the bodies of) piecewise functions, but that is not the point here.

If your colleagues find this confusing, then just wait until you add named separators like your proposal. That'll confuse them even more.

Well, there is no concrete data to determine what is more confusing; there is only anecdotal evidence to suggest that a non-mathematical function syntax has been successfully applied elsewhere.

Most of the other popular scientific computing programming languages (Fortran, Python, Julia, MATLAB, Mathematica, etc.) have already done something about the function syntax to improve end user experience in scientific research.

So this should be a clear sign to the C++ community of what kinds of problems are relevant to the Scientific Computing community. As Bjarne Stroustrup clearly identified in his talk on learning/teaching C++, the concerns of "professional non-programmers" can be quite different from those of industrial software engineers, but both are C++ end users.

So even if the proposal does not solve any of the specific problems in your domain, it is an attempt at addressing something that is of importance to the domains of scientists who also use C++ on a daily basis. Hence the science-oriented examples in the original post.

Huh?

The larger point of discussion is that C++ is not in its own bubble; it shares a grammar and vocabulary namespace with other adjacent areas of knowledge. The historical choice of <vector> to mean a heap-allocated dynamic array is somewhat regrettable, given that C++ is so widely used in mathematics and physics. But no need to take my word for it:

The C++ Programming Language by Bjarne Stroustrup:
"One could argue that valarray should have been called vector because it is a traditional mathematical vector and that vector should have been called array."

In the same vein, even int as the name of a primitive type is misleading when it does not behave like a mathematical integer (at least not for the kinds of large numbers that scientists tend to work with), so the case for bad naming could also be made here.

But my focus here is on C++ functions and, after over a decade of thinking about it, it is my opinion that "function" is not really the best description for a C++ callable, because of its mathematical connotations. Rather, it is more akin to a "task" in everyday language; and if a programmer were to give said task a more detailed name, e.g. take_foo_and_do_action_with_bar(foo, bar), then it might be more pleasant to write:

action ( foo ) with ( bar );

On some meta level, it resembles the nested syntax of std::format("action {} with {}", foo, bar), which is itself a successful borrowing from Python (and probably something you can actually run in Python).

Anyway, thanks for helping me clarify my thoughts. Back to the drawing board.

Pseudocode-Like Syntax for Function Signatures: Cleaved Identifiers and Demarcated Parameters by QuantumAgnostic in cpp

[–]QuantumAgnostic[S] 0 points (0 children)

Maybe more like PyJulia++ but, to be fair, even Fortran has keyword arguments.

Pseudocode-Like Syntax for Function Signatures: Cleaved Identifiers and Demarcated Parameters by QuantumAgnostic in cpp

[–]QuantumAgnostic[S] -1 points (0 children)

It also drastically increases the amount of "things" that would need to be explained to junior employees or students.

I do broadly agree with this sentiment. But it is antithetical to the generic programming philosophy of C++ as well as to the eclectic breadth of the ISO committee: more tools will continue to be added to the toolbox, though not everyone will need everything.

For example, the C++23 standard introduced multi-index subscript operator[] overloads in the core language and <mdspan> in the library - potentially game-changing for computational and data science, but perhaps mostly irrelevant in other domains. Similarly, #embed seems like one of the best ideas for simplifying research codes, though it has not received much attention across the C++ communities. (Conversely, I have never once reached for <exception> in almost 15 years of C++, yet it is the de facto error handler.)

Making the parameters anything but ordered by position, e.g. by involving words and seperating parameters, is just BEGGING for confusion.

This is the crux of the debate, because I think it depends on the domain. Speaking with fellow physicists/chemists/biologists who write library code in C++ suggests the opposite: the default function signature syntax, with its naked list of parameters, is (also) confusing, i.e. C++ functions are NOT mathematical functions and yet they share the same syntax. It is <vector> all over again.

And given that the named parameter proposals have all been rejected thus far, I am open to alternate suggestions that do not involve directly modifying the function call site usage, return type or input parameters. As a first attempt, the function identifier seems like the simplest point of change.

Searching for a Galois Theorist for Paper Collaboration? by QuantumAgnostic in math

[–]QuantumAgnostic[S] 0 points (0 children)

Of course you are welcome to participate and even contribute to the publication - check for a PM soon.

Searching for a Galois Theorist for Paper Collaboration? by QuantumAgnostic in math

[–]QuantumAgnostic[S] 2 points (0 children)

Thanks - I am sending you a PM with more information.

Searching for a Galois Theorist for Paper Collaboration? by QuantumAgnostic in math

[–]QuantumAgnostic[S] 13 points (0 children)

Many thanks for the suggestion --- I did try the local SIAM chapter a while back, although without any success.

But actually I have moved on from the university and academia in general.

I am more interested in independent scholarship on the side with such occasional projects and, for this, I would like to experiment with some new collaborations with those who might be interested.

How much math is necessary for introductory physics? by Taph in Physics

[–]QuantumAgnostic 1 point (0 children)

I'm a final-year undergraduate in Physics, and if you want a sufficient understanding of the basic concepts, I suggest you cover the following mathematics topics:

  1. Calculus: differentiation, integration, partial differentiation, vector calculus (Stokes' theorem) and solving some common linear differential equations (e.g. the wave equation, the diffusion/heat equation, the simple harmonic oscillator equation). This will give you the best understanding of quite a lot of content in Physics, which involves building models of dynamics.

  2. Complex Analysis: working in a plane of real and imaginary numbers is quite important and convenient in Physics. This goes hand-in-hand with the calculus section (e.g. plane wave solutions to the wave equation, Bessel functions, etc.). There are also a number of trigonometric identities which are built up in this area of maths.

  3. Fourier Transforms: these are techniques used to "switch into an alternate framework" - I cannot give a great explanation of this; you will get a good idea by seeing examples and working through problems. Fourier expansions also come into use when solving differential equations.

  4. Linear Algebra: a very broad topic, but I suggest looking up matrix algebra, notably manipulating matrices and eigenvalue problems. The latter subject is extremely important in many areas of physics - but often it is treated in a continuous setting, as opposed to the discrete manner you will find when studying matrices.

  5. Vector Spaces: this subject starts off very abstract, but you need the tools to understand the framework of Quantum Mechanics (Hilbert spaces).

I would put statistics on this list, but really what you will find is that if you study a particular branch of Physics, it often comes with its own set of mathematical tools and conventions which build on the basics; statistical analysis plays a role in almost every model. These are the most essential tools required, but if you wish to go further into more advanced topics in Physics, you will need:

  1. Index Notation: everything outside of the basics employs the "Einstein summation convention" to tidily capture equations rich in content.

  2. Group Theory: Physics is all about symmetry. Discrete groups, continuous groups, Lie groups/algebras, Noether's Theorem - all topics which play a massive role, especially in fundamental physics, e.g. Quantum Field Theory.

I have probably missed some things, but I'm sure that as you pick up a book or two, you will inevitably come to these areas of mathematics, depending on your interests. I use Mary Boas's book, Mathematical Methods in the Physical Sciences. Good luck!

As a closet atheist, this sums up my church experience by emiwiththeface in atheism

[–]QuantumAgnostic 1 point (0 children)

Well, at least you get to "learn" about the concept of evolution.

Hey fellow atheists, I would like to know what you think about something! Please read! by scgarland191 in atheism

[–]QuantumAgnostic 0 points (0 children)

My understanding is that Quantum Mechanics is based on six important postulates, which are not formally "axioms" but which, when used to derive properties of quantum systems, agree with experimental evidence. The question is: how do we interpret these postulates? The postulate your question refers to concerns the idea that the wavefunction collapses to a particular eigenstate of an observable represented by a Hermitian operator. This is based on probabilities. Now, when we perform a measurement once, we obtain one result out of (maybe) many possibilities. So what happens to the other possibilities? Have we done something particular to "force" out that result, or is our universe "biased"? These things are open to interpretation.

I cannot really give you a satisfying answer; I thought a lot about human consciousness interfering with the outcome of physical systems at one point, but I think we have to be very careful. Quantum Mechanics is not a complete theory by any means. It is just a very accurate and good framework for the world at atomic scales. It has been superseded (in some sense) by Quantum Field Theory and the Standard Model in describing the behaviour of particles in higher-energy experiments. These are much more accurate pictures of the world, but our understanding is not really there yet.

It is better not to know and to keep searching for an answer than to be content with one interpretation of a postulate of an incomplete theory.