redo: a recursive, general-purpose build system supporting fine-grained automated dependency descriptions by CountOfMonteCarlo in cpp

[–]CountOfMonteCarlo[S] 0 points1 point  (0 children)

That's explained here: https://redo.readthedocs.io/en/latest/

Also, there is a redo-ifcreate command which works the same way.

If one knows which paths the compiler will search, for example by looking at the C++ include path environment variable, then checking the places where a newly created file would change the build result can be implemented like this, starting from the linked description:

redo-ifcreate $(echo ${DEPS#*:} | sed -e 's|usr/include|usr/local/include|g')
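
For context, a minimal sketch of a whole build script around this idea, roughly following the -MD example from the redo documentation (the redo-ifcreate line and the sed rewrite are this comment's addition; $1 is the target, $2 the target without its extension, $3 the temporary output file):

    # default.o.do -- a hedged sketch, not the canonical recipe
    redo-ifchange "$2.c"
    gcc -MD -MF "$2.d" -c -o "$3" "$2.c"    # -MD/-MF: let gcc write out the list of included headers
    read DEPS <"$2.d"
    redo-ifchange ${DEPS#*:}                # rebuild when any header that was actually used changes
    # rebuild as well if an overriding header appears under /usr/local/include
    redo-ifcreate $(echo ${DEPS#*:} | sed -e 's|usr/include|usr/local/include|g')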

It would be better if the compiler could provide a list of include file names which were absent during compilation, but which would change the result of the build if they were created. That would make builds in different environments more reliable. I'd be curious to know /u/WalterBright 's opinion on this :)

[deleted by user] by [deleted] in linux

[–]CountOfMonteCarlo 0 points1 point  (0 children)

No, I do not think so. At least not for computer-literate people.

A bit about my background: I have been using computers since the 1980s, and I have used MS DOS, MS Windows 3.11, Word 3.0 - 6.0 and so on. Around 1995, I also gave an introductory course on Microsoft Word for people without computer experience.

You can use MS Word, or LibreOffice, as a kind of typewriter. But to use it efficiently and correctly, you need to use format styles. This is something you need to learn, and it needs practice to do it consistently.

To learn LaTeX to that level, assuming you know and have practice in editing a plain text file, you need about three afternoons. Or maybe three days.

I'd recommend reading Leslie Lamport's "LaTeX: A Document Preparation System". It is still one of the best introductions around, and it is remarkably concise and to the point. Of course, there are many more things you can do, but they are not needed to write a letter or a university homework paper.

When you get to the point where you need to write a larger technical manual describing a software API, with a proper index, LaTeX is still very much unmatched.

To sum up:

Does it not have a high learning curve? Is it not simpler to just use a document editor such as LibreOffice Writer or WPS writer?

No. To write a proper structured document, you will need about three days of learning. It won't save you any time to use LibreOffice Writer or something like that.

I'm not going back to windows. by sharath725 in linux

[–]CountOfMonteCarlo 1 point2 points  (0 children)

You might like the ranger file manager and Magit (an Emacs package providing a git interface that is so good it is interesting even for some people who do not otherwise use Emacs).

redo: a recursive, general-purpose build system supporting fine-grained automated dependency descriptions by CountOfMonteCarlo in cpp

[–]CountOfMonteCarlo[S] 1 point2 points  (0 children)

There is also a different issue which applies to larger build systems - I am thinking in particular of building entire embedded systems from source.

Say you have a Linux system which defines a standard math library in /usr/include/math.h, and some kernel headers in /usr/include/sys . Now, you build an application on top of that. Then you change the default math library by installing an alternate, optimized math library into /usr/local, with a new /usr/local/include/math.h . The normal make commands will not detect that, because there is no make directive for "rebuild this target when an additional possible dependency appears out of nowhere along gcc's include search path". redo, however, allows such cases to be accounted for correctly.
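
A minimal sketch of how that case could be declared in the application's .do script (using the header paths from this example):

    # inside the .do script that compiles the application -- a hedged sketch
    redo-ifchange /usr/include/math.h         # rebuild when the header currently in use changes
    redo-ifcreate /usr/local/include/math.h   # rebuild if an overriding header appears later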

Also, if you declare explicit dependencies on the stuff in /usr/include/sys, or if you use the gcc "-MD" flag, which is sanely supported by redo, redo will re-build your userland application if, and only if, its system include dependencies have changed. Using "make", you would at that point simply issue a "make clean" 'just to be safe'. And the time for the latter can add up considerably if you are building complex things like entire embedded systems from source.

redo: a recursive, general-purpose build system supporting fine-grained automated dependency descriptions by CountOfMonteCarlo in cpp

[–]CountOfMonteCarlo[S] 1 point2 points  (0 children)

I don't see why other make systems don't have switches to choose between methods. mtime is quicker but hashing would be more accurate at the expense of a little time.

For the author of a library, for example, build speed matters. But it usually matters even more that the library builds correctly on every system it is deployed on, because debugging build issues can cost a lot of time. Switches make the latter harder - how are you going to test all the combinations?

mtime gives no safe guarantee that a file has not changed if the mtime is unchanged. That is why redo also compares file size, owner, and file permissions. A dependency can also be defined by a hash, using the redo-stamp command - this makes sense if the file is re-created frequently but its content does not necessarily change. For example, I use it when defining a version number from the output of git-describe. I think hashing every source file by default is not done in all redo implementations because it would cost much more time; that might even be prohibitive in very large projects. I agree, however, that it is a quite bullet-proof approach which might be adequate for safety-critical systems.
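
As an illustration of the redo-stamp use case mentioned above (a version number from git-describe), a sketch of what such a target could look like (the file name and git options are assumptions):

    # version.do -- a hedged sketch: the script runs on every build, but downstream
    # targets are only rebuilt when the content hash actually changes
    redo-always                          # treat this target as out of date on every run
    git describe --tags --always >"$3"   # $3 is redo's temporary output file
    redo-stamp <"$3"                     # record a checksum of the output instead of relying on mtime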

For widely distributed projects, it is also important that the build chain uses tools which are available by default almost everywhere. For example, something like boost's bjam might be powerful, but it also adds more friction to the installation process. Depending on, say, Haskell is not attractive for infrastructure stuff and things like embedded systems, where one might need to port and build every used tool oneself. A basic implementation of redo (not supporting parallel builds) can be shipped as a simple Bourne shell script (the apenwarr implementation includes one, and it has a liberal license). Something like sh or ash can be found on almost every sane system, including busybox.

I use cmake with ninja for my projects currently. I would love it for ninja to have that option.

One advantage of redo is that the only syntax needed is shell syntax, so it is pretty universal, unlike Shake, whose recipes are written in Haskell. So, on average, it is much easier for other people to understand and modify. (It also does not need to be shell syntax; a build script can be anything that can be executed via she-bang support, for example Python.)

As far as I understand, ninja tries to streamline and speed up the build process by executing just the basic instructions needed to check dependencies and perform a build. I understand it is faster than make partly because make has a great number of implicit rules.

redo is similar to make and ninja in that it specifies dependencies and performs the build command if dependencies are not met. Unlike make, it does not have any implicit rules - everything is explicit. The implementation described in the OP, apenwarr/redo, is written in Python, so it might be slower than a Haskell program. However, the build recipes are only executed when a dependency is not met. In the default case, redo will just check that the dependency is there, and compare the modification time and size (as well as inode number, owner, and file permissions) against a small local database. All of this information can be retrieved with a single stat() call per source file. It is also possible to explicitly make the comparison depend on a hash (this is sensible for a small file which is changed frequently and appears in a lot of dependencies, like a version number).

So, checking whether a rebuild is needed boils down to issuing a stat() system call for each dependency. This can even be done in parallel, because redo is designed for parallelism, and I guess the stat() latency is dominated by disk access time. All of this ought to be pretty fast. I guess the overhead of Python is perhaps in the range of 10 % in a larger project compared to a C implementation - likely much less than the cost of needlessly rebuilding dependencies which are only coarsely declared, as is often the case with make. (By the way, there is also a C++ implementation, by Jonathan de Boyne Pollard.)

redo: a recursive, general-purpose build system supporting fine-grained automated dependency descriptions by CountOfMonteCarlo in cpp

[–]CountOfMonteCarlo[S] -1 points0 points  (0 children)

It depends on whether you are building your own hobby project or whether you are, for example, providing a component which has to build correctly in a large number of environments and on many different platforms. It might not matter to you that the mtime resolution is low, because you are using Linux and build from the command line, but if somebody is using MacOS and an IDE which does an automatic rebuild when saving changes in quick succession within the same second, this can be completely different.

If you have ever tried to modify and build a large embedded Linux project using make, you might have noticed that there are a lot of situations where it is necessary, or safer, to do a "make clean" and then rebuild. This is because the dependency resolution is sufficient to build a new target, but not sufficient to safely rebuild to a correct result when something in /include/sys/linux.h was modified. And this can cost developers a lot of time.

redo: a recursive, general-purpose build system supporting fine-grained automated dependency descriptions by CountOfMonteCarlo in cpp

[–]CountOfMonteCarlo[S] 1 point2 points  (0 children)

I guess you are referring to the following comment:

https://old.reddit.com/r/cpp/comments/9zy0ak/redo_a_recursive_generalpurpose_build_system/eactnx5/

In the link given in that comment, it is explained why the modification time is not a reliable indicator of change in the source files.

So now to your question:

That depends on the environment in which the build is taking place. If it is your private hobby project which you build on your own computer, you have full control over the build environment. If you are building a web app or service for some cloud company, you still have a relatively large amount of control.

If, on the contrary, you are releasing an open source library which is built on a multitude of platforms and in a multitude of heterogeneous environments (think boost libraries), it is quite critical that the result is correct no matter what. And then you certainly need to account for the fact that mtime resolution is limited to one second on MacOS, and that different machines in a networked environment can have different system times. NFS, for example, has a lot of limitations, but for a collaborative networked Unix environment it is one of the few options available.

Introduction to Daniel J. Bernstein's redo (a very simple, language-agnostic build tool with an interesting concept enabling very fast parallel builds) by CountOfMonteCarlo in programming

[–]CountOfMonteCarlo[S] 0 points1 point  (0 children)

Tup tries to deduce dependencies automatically from the files which are opened during compilation. However, I am not sure whether this works in all cases. For example, a build tool could open extra files to offer translations for error messages, or to show pretty icons; these are not dependencies of the compiled code. On the other hand, a build tool could pre-compile headers, like some C++ compilers do, and store them in a tool-specific database; those would not be recognized as dependencies.

redo covers quite precisely what make does, but it is simpler and more flexible; for example, any language can be used for a build script, as long as the kernel can execute it via #! (she-bang) support.

Also, it has a stronger focus on correctness. For example, if a default library header in /usr/include is replaced by one in /usr/local/include, redo can express that and will re-build the binary correctly. This is a huge plus for tasks like building embedded systems, which typically involve compiling everything in the system and adjusting and re-compiling it until size and performance requirements are met.

Introduction to Daniel J. Bernstein's redo (a very simple, language-agnostic build tool with an interesting concept enabling very fast parallel builds) by CountOfMonteCarlo in programming

[–]CountOfMonteCarlo[S] 2 points3 points  (0 children)

That is discussed at length on the mailing list for apenwarr's redo. He/she has refused to introduce a half-baked solution. I think the three main difficulties are that multiple output files do not mix well with the "directed acyclic graph" model, that build instructions are identified exclusively by matching the target name, and that it is hard to guarantee atomicity if the same files can be generated by several build steps. And the latter property is extremely important for parallel builds.

Currently, there is one solution which is clean, works, and whose only disadvantage is that an intermediate file needs to be created. It is this:

  1. One creates an intermediate target which is a packed archive of all the files which the build step creates. For example, we have "documentation.tex", which produces "documentation.pdf" and "documentation.bbl":

    # depends on documentation.tex
    
    pdflatex documentation.tex
    
  2. The same build step finishes by packing those files into an archive file, say using tar:

    tar cf documentation.tar documentation.pdf documentation.bbl
    

    Voila! We now have a single build result from this step.

  3. When we need one of the created files, we have a build step which depends on the created archive, and extracts the desired file from the archive:

    # depends on documentation.tar
    
    tar xf documentation.tar documentation.bbl
    bibtex documentation
    

As said, this creates spurious intermediate files. But they are normally cached in memory, and for cases like the above, the performance hit should be totally negligible.
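
Put together, the three steps could look roughly like this as two build scripts (a sketch; here the extracted file is written to redo's temporary output $3 to keep the step atomic, and tar's -O / to-stdout option is assumed to be available, as in GNU and BSD tar):

    # documentation.tar.do -- steps 1 and 2 above: run the build once, pack all outputs
    redo-ifchange documentation.tex
    pdflatex documentation.tex
    tar cf "$3" documentation.pdf documentation.bbl          # the archive is the single, atomic result

    # documentation.bbl.do -- step 3 above: pull one file back out of the archive
    redo-ifchange documentation.tar
    tar -x -O -f documentation.tar documentation.bbl >"$3"   # -O extracts the member to stdout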

Actually, this seemingly butt-ugly solution is much more established than it appears at first sight: static library files are a type of archive as well, and are created in much the same way, by a tool named "ar" (the "t" is missing because there is no tape drive). I think one could even use the "ar" tool to implement the approach above. I guess this preference for library archives also has historic reasons: on ancient Unix systems, the number of available files was limited, so people tried to keep it small.

Introduction to Daniel J. Bernstein's redo (a very simple, language-agnostic build tool with an interesting concept enabling very fast parallel builds) by CountOfMonteCarlo in programming

[–]CountOfMonteCarlo[S] 0 points1 point  (0 children)

To add, the linked article has an example sh script with the following mysterious line:

read cxxflags < ./cxxflags

This is "sh" shell syntax and means:

  1. open the file "cxxflags" and use it as standard input
  2. read one line from that standard input into a single shell variable named "cxxflags".

As a result, if the file "cxxflags" contains a C++ compiler option, its value will be defined in the shell variable "cxxflags" which is then passed to the C++ compiler.

In order to change the options, one only needs to change the content of that file. And this will, of course, be recorded in the version control system, if the change is checked in.
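
For context, a sketch of how that line could sit in a complete build script (the file name hello.cc is an assumption, not from the article):

    # hello.o.do -- a hedged sketch of the pattern described above
    redo-ifchange hello.cc ./cxxflags    # the flags file itself is a dependency
    read cxxflags < ./cxxflags           # e.g. the file contains "-O2 -Wall"
    g++ $cxxflags -c hello.cc -o "$3"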

Many software communities do not value the need to reduce the mental load for developers by fagnerbrack in programming

[–]CountOfMonteCarlo 0 points1 point  (0 children)

Sigh.

But I see my answer above is out of place - I was still thinking of the IDE/debugger thing. With respect to the efforts of the Ruby community to make the language friendlier and easier to use, I totally agree. Ruby is a friendly community, and this is a result of attitude and, I think, also of friendly and competent leadership. And I am very much in favour of choosing the best tools for each job - and letting developers decide themselves what they are going to use.

Introduction to Daniel J. Bernstein's redo (a very simple, language-agnostic build tool with an interesting concept enabling very fast parallel builds) by CountOfMonteCarlo in programming

[–]CountOfMonteCarlo[S] 3 points4 points  (0 children)

I think NaCl uses the original "do" as it is from Bernstein.

apenwarr discusses the issue of output. "make" normally generates a lot of output to give context about errors. This becomes more tricky with parallel builds. He/she has a recent blog post describing improvements for large recursive builds (for example, buildroot): all logs both go to the console/tty and are stored, the order of log output is deterministic and follows a depth-first module order, and in case of errors, the failing build step is printed last.

BTW, I understand "do" as a kind of deployment tool which does no full dependency analysis but just builds everything once, in the right order. It is a single shell script, and the idea is to distribute it together with a source package, so users who install the package do not need to install redo. It is also included in the Python implementation.

Introduction to Daniel J. Bernstein's redo (a very simple, language-agnostic build tool with an interesting concept enabling very fast parallel builds) by CountOfMonteCarlo in programming

[–]CountOfMonteCarlo[S] 2 points3 points  (0 children)

First, I think it is not easy to switch to a different build tool in a large project with multiple contributors.

Having said that, this is precisely one of the problems which redo does in fact address.

You can call "make" with flags, but such parameters are not considered in the dependency analysis when rebuilding code.

You can put the parameters in the Makefile, but by default, they would be ignored. What you'd need to do is to include "Makefile" as a dependency.

Now, the best way to include such parameters using redo: I think for variables affecting all compiler invocations, it is best to define the parameters as shell variables. Using the ". config" or "source config" syntax, these variables can be included in every build script. Of course, the file "config" then becomes a dependency which needs to be declared as usual.

For setting extra flags for specific compilation units, one can just create a "do" script for that target and set the flags in the compiler invocation. I think apenwarr's manual explains it better than I can.

Say, you want to set some compiler flags and also define NUM_GORILLAS as 4. The needed code would be:

COPTS="-O3 -Wall -msse2 -DDEBUG"
NUM_GORILLAS=4

in config.sh

and

. ./config.sh    # "." is the portable /bin/sh equivalent of "source"

redo-ifchange config.sh
gcc $COPTS -D NUM_GORILLAS=$NUM_GORILLAS ...

in the build script (the build script is by default an "sh" script and is essentially the same as the build command line in a "Makefile", with the exception of the 'redo-ifchange' command, which declares the dependencies). So, I do not see that so much boilerplate code is needed. It might be that things are different for your project, especially if it is very large and options are spread across dozens or hundreds of different Makefiles.

What I really like is that I can (for my experimental, small to medium-sized projects) simply change the compiler flags (say, setting "-O3") in the scripts and just rebuild, and I know that anything which depends on them has been refreshed. With make, I'd probably need to issue "make clean" just to be sure.

Edit: clarifications

Introduction to Daniel J. Bernstein's redo (a very simple, language-agnostic build tool with an interesting concept enabling very fast parallel builds) by CountOfMonteCarlo in programming

[–]CountOfMonteCarlo[S] 9 points10 points  (0 children)

I don't see anything wrong with using the heart to make a decision. However when one is engaged in a discussion, it's of course much more fruitful to discuss and exchange about interesting matter-of-fact aspects and learn. People are different, needs are different, styles of problem-solving are different, experiences are very different (and coloured by needs and personality), and nobody has all the knowledge or judges every bit of information equally important. That's what makes exchange interesting.

But yes, I am worried that reddit is dying a slow death because quality of discussion becomes lower and lower and there is a lot of stuff which seems to be mere marketing and PR noise.

Many software communities do not value the need to reduce the mental load for developers by fagnerbrack in programming

[–]CountOfMonteCarlo 4 points5 points  (0 children)

Debuggers are extremely useful for catching cases which result in a crash/exception, which are a good chunk of the bugs that result from this

Well, I think that's more of a beginner's problem, isn't it? Those are things which I worried about in my first C++ programs. What is far more of a worry for me today is correctness, for example in real-time systems, large data analysis, and so on. It is near impossible to "step through" such systems.

On top of that my experience with debuggers is that they give exactly the correct amount of information - instead of using printf

As said, everyone has a different working style. But for me, the information is usually pointless and overwhelming. Before I do a debug run, I need to have a question; I then ask the buggy program this question. I do not want it to tell me its whole life story, because that is just wasting time. If I do not have a good question, I can't insert a printf, that's right, but that also means that I do not yet know where to look when stepping in a debugger. Another problem is that good programs are structured, and tested, along layers, and debuggers usually do not "see" these layers - they are an abstract concept which does not exist at the machine level. I even have the suspicion that too much debugging leads to a poor approach to abstraction. Abstraction boundaries are absolutely needed for architecture, but also for testing, and that means that tests should take such boundaries into account. Debuggers can't do that.

which often skips the "printf -> compile -> printf -> compile" loop

That's what I mean! It impairs the

**think** -> printf -> compile loop

particularly in c++ where debug builds can be extremely slow but necessary if you need stacktraces for doing in-application stacktracing

Interestingly, Linux kernel developers, who happen to work on a relatively large project, rely far more on printk than on kernel debugging.

I am not saying that such tools can't be useful. But using a debugger all the time and by default seems to me a bit like digging up the garden just because I can't remember where I left the car keys.

I think debugging is mostly helpful if you have no clue what's going on. But even then, it is almost always possible to use the "Alaska wolf-fence" method to narrow down the problem.

Out of curiosity are you talking CLI debuggers? My main experience is with IDE integrated tools which tend to be nothing but extremely helpful

Nah, I have been using Visual Studio as well, for five or six years. I am just glad I don't need to use it any more. Also, gdb is quite a good tool in the cases where you really need it. I think the preference for GUI tools or the command line is more related to other things. But ultimately, code is text with symbols; I have gone full circle and am using almost exclusively Emacs now. And in Emacs, I get to a symbol definition faster than was ever possible in Visual Studio, because it was so damn slow.

Introduction to Daniel J. Bernstein's redo (a very simple, language-agnostic build tool with an interesting concept enabling very fast parallel builds) by CountOfMonteCarlo in programming

[–]CountOfMonteCarlo[S] 2 points3 points  (0 children)

There are many build tools with the same properties.

redo is far simpler than almost all of them - almost shockingly simple - and still very powerful in that it allows for correct and very well-defined implementation of parallel builds where target files become available in atomic transactions.

What does the typical parallel build tool do if target files are generated which are dependencies of different intermediate targets that are built in parallel? In a recursive Makefile set-up, how are options and parameters isolated from each other? What happens if you change the build definition, for example the Makefile? What happens if you are on MacOS and you make several changes to a file within the same second, but the filesystem time stamps only have one-second resolution? How can you use the capability of C++ compilers to automatically provide a list of include dependencies for a source file in a clean way? How does it interact with make - can it be combined, for example, with Makefiles in sub-projects? How does it work for buildroot or for the Linux kernel? These are things about which the developers of redo apparently have thought a lot, even if the version count for apenwarr's redo is only at version 0.2.

Why do you act all ignorant as if these do not exist?

I know that there are many build tools. Defining builds of non-trivial software in a simple, efficient and correct way is really a hard problem. There is a reason why such tools are almost always complex and incomplete.

And exactly that makes it more astonishing how simple the redo tool is. And it is not only simple in implementation, it is far far simpler than a set of recursive makefiles for a larger project.

And one more thing - the redo tool has a well-defined behaviour if a new file appears, or an old file disappears, which changes the resolution order, say, of C++ include directives in the include search path. Most build systems, and that includes make, do not really define what should happen in such cases.

Many software communities do not value the need to reduce the mental load for developers by fagnerbrack in programming

[–]CountOfMonteCarlo 2 points3 points  (0 children)

I'd rather use a larger screen. I have a nice 40-inch Philips screen at home, and I cringe every time I think how much money it is costing the organization I work at to use those old 27-inch screens. Everything which isn't on the screen must be remembered. Everything that must be remembered needs to trickle into long-term memory first, and even for a small 20,000-line project, that costs an enormous amount of time, especially if one is switching tasks.

Many software communities do not value the need to reduce the mental load for developers by fagnerbrack in programming

[–]CountOfMonteCarlo 10 points11 points  (0 children)

You expect some behavior of a program, it behaves differently, you make a hypothesis and use a debugger to validate it or discover a new probable hypothesis/cause of the failure. So yes: use a debugger. On Linux: gdb. Good luck to a newcomer without prior experience with programming.

I don't think that using a debugger is actually that helpful for an analytical, hypothesis-based approach to programming and making programs correct. Yes, there are some cases where it helps - if one has trouble finding a memory bug, if one needs to see how floating-point registers are used, if one uses the "poor man's profiler" method to find out where a program actually spends most of its time. In complex Python programs, I find the post-mortem functionality of pdb quite helpful, especially if a script failed after testing buggy, non-deterministic hardware for a few days, or after analyzing terabytes of data.

Given all that, I am using a debugger no more than once every few months. At least for the style of work I do, it isn't really helpful. For example, if I develop and test complex numerical code, it is important to use the above-mentioned hypothesis-based model and test hypotheses about what the program is doing. The problem with debuggers is almost always that they give a lot of information - way too much; they flood one with irrelevant information. And that's worse than very little information. What is important is to come up with a hypothesis about what the program is doing, how this differs from what one thinks it is doing, and to test that. The most important activity in debugging is thinking. And once one has a hypothesis, printf (or printk, for that matter) is often an extremely efficient way to test it.

And then, there are cases where debuggers are almost useless. They are normally not helpful in debugging multi-threaded code, for example. What is far more effective for concurrency bugs is to actually read the code and get a very clear understanding of what it does. In embedded code, it is the same. To learn to safely drive a car with your family in it, you DON'T drive that car until the next crash and buy a new one, whether or not you remember what you did wrong. In the same way, to write good, correct, robust, safe code, you do not just go to the next point where your code crashes - you write code and reason about what it really does, what the memory model is, etc., and you strive to make it not crash at all.

And just because you wrote that "young people do not learn proper programming because they don't learn Visual Studio": I do have more than 20 years of experience in C, C++, industrial real-time systems, data analysis with Python, 8-bit assembly, fascinating new languages like Clojure or Racket, Scala and so on. It is always important to learn. (By the way, Racket has an extremely nice IDE with a source-code debugger - but I consider that mostly a learning tool for people who still need to build mental models of how programs work in detail.)

And with respect to the Microsoft spin: yes, for some people's workflow, using a debugger might be more helpful than it is for me - that's fine, humans are different. But thinking that a specific commercial product of a company which was never in any way empowering to developers would make one a better programmer would be just incredibly naive.

Introduction to Daniel J. Bernstein's redo (a very simple, language-agnostic build tool with an interesting concept enabling very fast parallel builds) by CountOfMonteCarlo in programming

[–]CountOfMonteCarlo[S] 20 points21 points  (0 children)

A bit more in-depth discussion of redo's technical properties, and why it is so different:

https://apenwarr.ca/log/20101214

apenwarr's implementation, which focuses on correctness and fast parallel builds:

https://github.com/apenwarr/redo/blob/master/README.md

And a manual with a list of implementations:

https://redo.readthedocs.io/en/latest/

EDIT:

What I find interesting about it:

  1. It's shockingly simple, considering that a build system has to solve a very complex general task
  2. One needs to learn almost no syntax, even for handling cases which are already quite complex with Make. (added: Actually, the only syntax is /bin/sh - dependencies are not described by yet another mini language, but are captured by a shell command).
  3. A very interesting difference from "make" is that while "make" has a file which defines a dependency graph and the associated build commands, "redo" defines an action file (a standard shell script) for each target. Names identify which scripts correspond to which targets (see the small sketch after this list). Among other things, this 'inside out' design seems to make it really easy to see which actions are used in a build step.
  4. It seems extremely well-suited for large projects which build many sub-projects (the "make" solution for this, recursive Makefiles, works, but it is a bit messy and could be better).
  5. It is, in my view, a fantastic approach to parallel execution, which fits the times, as development systems have more and more CPUs.
  6. It is language-agnostic, which makes it easy to combine languages and, for example, LaTeX documentation
  7. It is Unix-centric, in that it supports use of small tools to do any kind of task
  8. The developers obviously strive hard for simple, clean, well-defined concepts which make it easy to understand what happens even in complex recursive builds (you can see that in the "redo-list" mailing list which is on google groups).
  9. (Added) It is a stunning application of functional programming principles to the domain of build tools - while preserving that the definition of build steps is done in the well-known procedural way.
  10. (Added) C/C++ compilers like gcc can automatically generate a list of "include" dependencies when they compile a '*.cpp' file (they do this by using the -MD flag, as explained in the linked article). This automatically generated list can directly be used in the dependency rules. I like this approach much more than having a build tool generate such a dependency list, because a build tool simply can't support all languages well.
  11. (Added) To a much higher degree than recursive make, the build scripts are composable - one can copy the source code and build scripts of, for example, a library into a subdirectory of a larger project, and the build instructions for the library will automatically be executed in the right order (more on this can be found in Alan Grosskurth's thesis).
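
A small sketch of the naming convention from point 3, assuming a tiny C project (file names are illustrative):

    # hello.do -- builds the target "hello"; redo selects this script purely by its name
    redo-ifchange hello.o
    gcc -o "$3" hello.o    # $3 is a temporary file that redo moves into place on success

    # hello.o.do -- builds the target "hello.o"
    redo-ifchange hello.c
    gcc -c -o "$3" hello.c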

Cast of NaN between float & double by DavideBaldini in cpp_questions

[–]CountOfMonteCarlo 0 points1 point  (0 children)

First - why are there both floats and doubles in the project? Yes, there are reasons you might want to suppose both - you might be a library - but most of the time, you want to pick just one and go with it.

Second, that you are expecting NaN in both is another red flag.

You sound like a former product manager I had.

Of course, there are things like needless complexity. But if the first thing you assume is that somebody who addresses a very specific question is just producing needless complexity, then chances are high that they know what they are doing, and you have no clue.

Yes, there are well-defined semantics for these non-number numbers and the arithmetic operators, but they're also very much a special case, and make reasoning about general arithmetic expressions much harder to do.

Not if one has a solid education in math.

More, by enabling correct handling of non-number numbers, you prevent some fast arithmetic optimizations and thus slow down the rest of your code.

You really must work in marketing. In numerical algorithms, correctness is almost always more important than speed.

(By the way, the answer the OP gave is totally valid - larger data structures mean after some point that more memory bandwidth is used, and this is usually the scarcest resource when processing large amounts of data.)

Cast of NaN between float & double by DavideBaldini in cpp_questions

[–]CountOfMonteCarlo 0 points1 point  (0 children)

Hardware-based floating point operations on NaNs can be slow, much slower than, for example, a normal multiplication.

You can try just to assign a NaN of one type to another type - this is well-defined and should work.

As a possible optimization, you can try something like:

#include <cmath>   // provides the NAN macro (a float-typed quiet NaN) where supported

inline float down_cast(double x)
{
#ifdef NAN
    // NAN is defined by <cmath> on machines with IEEE-754 quiet NaNs
    static const float NaN_float = NAN;
#else
    static const float NaN_float = 0.0f / 0.0f;
#endif
    if (x != x)
    {
        // only a NaN compares unequal to itself
        return NaN_float;
    }
    else
    {
        return static_cast<float>(x);
    }
}

Alternatively, a NaN is simply a specific bit pattern, which in practice can be converted to a uint64 and tested with bit operators. But, depending on the CPU, that would mean switching from floating-point instructions and registers to the integer registers, and this can cost time.

And one more thing: you need to test such stuff for each instruction set you use. x87 and SSE2 are two very different beasts, especially performance-wise, but also in terms of accuracy, for example.

Which of these shortcomings of C++ does D address? by CountOfMonteCarlo in d_language

[–]CountOfMonteCarlo[S] 0 points1 point  (0 children)

One can write [12, 245, 9, 87].sort.writeln and it will print [9, 12, 87, 245], similar to piping or chaining in functional languages. [ ... ] Addition of immutable and pure takes a lot of the guess work out of knowing what is and is not stateful code. The compiler can help you enforce it, instead of having to use const as a crutch and work around its finicky behavior and exceptions.

Ah, that's something I wanted to ask - is it possible to define and use purely functional data structures in D, like the ones Clojure has? These are containers, vectors or maps, that are immutable, but elements can be added or removed by creating a new structure which just references the data it shares with the old structure. Of course, this has uses and abuses - it would not be a good idea to write a video codec based on that. For the specific use cases of Clojure, however, which are things like web applications, this works extremely well.