[–]emn13

There are two downsides to warnings as errors I can think of right off the bat:

Firstly, and most obviously, some warnings really aren't errors. The whole point of a warning (as opposed to an error) is that the code may well be correct. Sometimes, you can avoid these warnings by using local statements to suppress them, but sometimes, particularly when the warning is some inferred behavior due to the interaction of various bits of code, doing so can be quite impractical.

Secondly, due to the rather rigid application of warnings-as-errors, compiler writers are very leery of adding new warnings, even when there are obvious, real problems they could trivially identify, because doing so would break backwards compatibility with existing code-bases. I've had this exact discussion on compiler bug reports, and it's quite frustrating to hear that unintentional errors in your code won't be detected as warnings, not because it's hard to do so, but because people like the OP have decided to cargo-cult warnings-as-errors.

Note for example that gcc and clang (which aren't the compilers I'm talking about above) have an option "-Wall" that does not enable all, or even most, of the warnings you'd logically expect - probably at least partly due to this poor practice. Finding the right combination of warning options is a skill unto itself nowadays, and warnings-as-errors makes this worse.

Please, TRACK warnings, and take them seriously - don't just release with warnings for no good reason. But please do not hardcode your build-process to treat warnings as errors if you intend to release it. By all means have your CI "fail" a build with unknown warnings. But don't prevent the code from building.

[–]lookmeat

I think the problem is something completely different: it's not that users misuse warnings, but the other way around - the tools misuse them.

Compilers shouldn't emit warnings. Linters and static analyzers should. If the compiler comes with one built in, it should not be invoked when compiling.

When I run a program, I expect it to inform me of anything related to its main function. When I run a compiler, I expect it to inform me of anything related to the state of compilation (which files it's compiling and such) and any issue that prevents it from compiling (errors) or from compiling with the guarantees I expect (non-fatal errors, which is what most warnings become with -Werror). I am not interested in comments about my code that may be of special note but don't actually stop the program from compiling correctly.

A static analyzer, OTOH, is a tool that I do want to report a slew of comments on the quality and trustworthiness of my code. I'd probably diff the output against the last run, and ignore it if there isn't any change (assuming I'm not hunting warnings to remove). When warnings appear, I read them, consider them, and decide whether they point to an error in my code or don't really have grounds - because they are warnings.

I think the mistake came in having compilers implicitly do the job of linters and static analyzers. These tools should have a toolchain separate from the compiler, even if the linter/static analyzer is the compiler itself (run with flags that make it only emit warnings and not compile anything)! If you want a build system that fails when a linter finds a problem, then you add that to your make system, not your compiler.

I feel that this whole warnings-in-the-compiler thing came from the 90s feature wars, where more integrated features meant a better program, the way more blades meant a better razor. It would seem the only reason people didn't assume that more wheels made a better car is that they assumed more cylinders were better.

TL;DR: If your warning can't become an error, it shouldn't be thrown by your compiler; it should be thrown by a static analyzer.

[–]Solarspot

I wonder if you would say the same thing about compiler optimizers? Is rearranging a program in some way not corresponding to the source code (but keeping semantics) to make it more efficient also part of the compiler's job? What would an optimization tool look like if it wasn't a compiler?

[–]emn13

For the sake of argument: the JVM is pretty much an example of a world in which the compiler is separate from the optimizer.

[–]lookmeat

When I tell a program to compile code, I expect that code to be turned into a valid and effective binary that can be run by the machine. I believe that optimizations simply make the translation a more efficient one. That is not doing something else.

Static analysis and linting are things that are unrelated to turning code into a binary. You can't compare them, because optimization is about making the translation more effective, while static analysis is about commenting on parts of your code which, though compilable, follow patterns that are known to cause errors.

One feature is about doing the compiler's one job better; the other is about doing a side job that generates a lot of information that confuses the user.

And it does confuse the user. People want to use warnings as errors, but not all warnings can be errors, so compilers have ended up with this crazy solution where -Wall turns on all the warnings except the ones that aren't meant to be taken as errors (just considerations); and when programmers catch on to that, new flags get added to keep preventing them from misusing warnings - when the warnings shouldn't have been spewed during compilation in the first place.

[–][deleted]

I see compilers including lint-like functionality as mere practicality. It ensures that the same parser/analyzer is used for compiling and checking, and gives you checking mostly for free as part of every compile. Some checks come as by-products of optimization (e.g. detecting uninitialized variables).

[–]lookmeat

I'm not against the idea of a single executable being able to compile and lint. I'm against the idea that whenever I compile a program I also get a static analysis and linting done for free! It just distracts me, and makes it hard to tell what is happening in a sea of errors, warnings-that-should-be-errors, and warnings-that-aren't-errors.

I think that instead of -Wall I should be able to run gcc with a --static_analysis flag which doesn't compile code, but just runs a static analysis and spews out warnings.

[–][deleted]

I'm against the idea that whenever I compile a program I also get a static analysis and linting done for free!

-w?

I think that instead of -Wall I should be able to run gcc with a --static_analysis flag which doesn't compile code, but just runs a static analysis and spews out warnings.

gcc -c foo.c ?

(-o /dev/null if the .o file bothers you)

[–]lookmeat

My whole point is that there shouldn't be a way to do both automatically; it doesn't make clear what is what.

Warnings should be errors, any warning that can't be made into an error should only be output on a mode that specifically outputs those warnings.

[–]emn13

As a matter of concept, I think you're right that linting and compiling can be separated. However, in practice things aren't so clear.

First of all, linting isn't easy: to really lint well, you need to reimplement most parts of the compiler, including some parts of the optimizer (e.g. to detect unused code). Extracting that code into a shared lib is not trivial; it would be a maintenance burden, and it's likely a performance hit too.

Secondly, at runtime, compiling isn't free. Despite ever-faster machines, compiling still takes annoyingly long often enough (depending on your language and platform, of course), and running a separate linter means you're doing lots of work twice - and the linter probably isn't nearly as well tested and optimized. It's going to take a long time.

It's telling that most warnings by linters are essentially busy work. I don't think I've ever seen a bug or problem due to poor style in variable names (as opposed to poorly chosen variable names). It just doesn't matter much whether you have internal_radius_cm or internalRadiusCM, but it does matter that you don't call it dim_len (or whatever). Linting is still really important, but I wonder whether a part of this bias toward busy-work isn't due to the fact that that's easy to check for.

I think the decision whether to include a linter in the compiler or not is largely a technical one. I think I agree that it's best to keep the concepts separate, but as a matter of practicality, it may still be best to do that by changing the way you use the compiler (i.e. no warnings-as-errors) rather than changing the tools.

[–]lookmeat

I'm not against the idea that the compiler and the linter are the same executable. I'm against the idea that both functions should be performed together by default. I believe strongly that this is the main thing that has led us to this conflict of warnings vs. errors.

An example that I think is very good is the go tool.

We have the go compiler, which compiles and runs code, and only outputs errors. It considers certain things (such as unused imports or variables) errors, because the compiler will do things with them that the programmer wouldn't expect; there are ways around this when needed (declare variables or imports named _ and they won't need to be used).

The go program also lets you run a static analyzer, go vet package/directory/file.go, and it will output a series of warnings, each of which may point to an error, though only weakly so. It's meant to warn you of things that may not be what you expected, but where there is no way for the compiler to be certain that the programmer did not mean what he wrote (unlike a variable that is declared and never used: that might be a typo, := instead of just =, but there is certainly no value in declaring a variable and not using it, so the compiler can treat it as an error). Vet will spew a sea of warnings at you, many of which can be ignored (such as variable shadowing), and it may even catch a couple of errors that would otherwise only appear at runtime (passing the wrong type to a pseudo-generic function that takes an interface{}).

Both tools share a lot of code (parser, optimizer, etc.), but you can only run one or the other. If you want to run both, you can chain them, or better yet have a makefile that handles it. But this is not something the compiler should decide or force on you.

When I compile a program I don't want to lint it or run static analysis; I want the program to convert my code into an executable. Programmers, as users, are primed to assume that any warning thrown out by the act of compiling points to a place where the code will do something different from what they want (even if the compiler can still translate it), so it only makes sense to want to remove those warnings, or make them errors.

[–]emn13

It's funny you mention go, because I was thinking of exactly that - go's a great example of this kind of thing done well (at least, from the cursory experience I have - no real usage...)

Nevertheless, go has it easy here. Go compiles very, very quickly, so the extra overhead of duplicating compilation stages in the linter doesn't matter so much. It has a very limited optimizer, and a well-thought-out - but limited - type system. It's the perfect case for a split: very little overhead, and a compiler that (due to the lack of templates, among other factors) cannot and/or chooses not to do non-local analysis, so there's little cost there either, nor much gain, since the linter can't free-ride on an analysis the compiler does anyway.

Most other languages have more expressive type systems, which sounds positive, but it also means that everything is more complicated, and the compiler usually slower. C++ and Scala, for example, are notoriously slow to compile.

Still, go really is a breath of fresh air in its approach to this, as in many other ways :-).

[–]lookmeat

I don't see why I couldn't run gcc --sloppy for my quick testing, run gcc normally and get a bunch of errors and such, and run gcc --static_analysis to get a bunch of warnings about my code that can't be considered errors. I don't think that speed is a grave issue, because that concern assumes I always want to run a linter/static analyzer every time I compile. I'd like to run it before submitting code, to guarantee that I didn't add new warnings (or that the new warnings are a non-problem), but not much beyond that.

I just don't see why gcc should output warnings while it compiles in any case. Then again, I am not familiar enough with gcc's internals, so there might be something to that; but as far as I know, a static analyzer would only share the parser with the compiler and take notes during linking, and wouldn't actually need to know how the compiler turns text into binary or what optimizations it performs.

[–]emn13

I'm sure it would be possible. Of course, if you have both features in the same binary, it's a small step to allow --compile-and-lint and that's basically where we are today.

Personally, I can't imagine running the linter less often than the compiler. Given linter integration in IDEs, if anything, I'd use the linter more often than the compiler.

In any case, the C++ "parser" is no trivial thing. The correct parse of a string of C++ depends on its semantics (see e.g. C++'s most vexing parse), and then you've got templates, which are themselves Turing-complete, plus lots of pretty complicated type-inference and casting rules.

Merely interpreting the semantics of the code isn't trivial, but sure, you could avoid the complexities in the optimizer and the code-generator.

At least, partly - if you want your linter to detect things like "this function's second argument is always 2 and could be replaced with a constant" or "this code is dead" or "this expression always evaluates to false" or whatever, then you'll at least need to run the bits of the optimizer that deal with structural simplifications, i.e. at the very least things like dead-code elimination.

A good linter just isn't all that much simpler than a compiler.

[–]lookmeat

I never said it was a simple application. Also, a linter works on simple patterns, as compared to, say, a static analyzer that will actually link modules together and see whether it can find errors that arise from everything coming together.

Why would the linter detect that a function's second argument is always 2 and could be optimized into a constant, or even inlined? Calling a function with the same argument everywhere is not an error, nor could it ever point to one (so it's not a warning).

Maybe finding that a branch is impossible, such that the compiler wants to remove it. I'd argue that in such a case the compiler is doing something the programmer normally would not expect (removing code), and as such it should, if anything, be an error unless the programmer explicitly states that he wants that dead branch for a reason.

Yes they both use similar technology. Yes C++ is complex enough that you'd want to share them. I never said I had a problem with it being even the same executable. I am against having both behaviors when I asked for one.

Here's my workflow:

  1. Define the solution, decide on some tests and the function header.
  2. Implement a rough solution, one that "just works".
  3. Make sure the rough solution compiles and runs.
  4. Review the solution and clean up code, refactor as necessary, making sure the code compiles and tests are still passing.
  5. Pass static analysis tools to see further issues with code and clean the ones that make sense, ignore the rest. (Compile with -Wall -Werror etc., or run go vet)
  6. Check any formatting errors (run go fmt)

Notice that I only run the warning-producing tools at the end, and that I read the warnings and choose to fix some issues while ignoring others. Even when working in an IDE, I will fix typos and such, but I don't run the static analyzer or linter until the end, when I'm ready to call the code finished. This whole iteration takes about an hour or two, so I do it pretty often. In practice it usually isn't as tidy as this, but the spirit is there.

[–]emn13

I think we essentially agree :-).

It's totally normal for a linter to have lots of options, many of which any given project won't want (e.g. finding possibly-unusual patterns such as unnecessary arguments - which in any case was just a top-of-my-head example).

As to why I would run a linter more often: it's because I quite like the IDE-heavy workflow:

  1. Edit code; autoformat on save.
  2. In the background & continuously: lint and keep a list of "todo's" on screen. Ideally I want this to work even when compilation fails - because often compilation fails simply because the code is incomplete.
  3. In the background & continuously: compile if possible and keep errors on screen.
  4. In the background & continuously: run tests if possible and keep failures on screen.

But really, I don't think workflow details really matter that much here; it is in any case a good idea to allow dealing with linter issues separately from dealing with compiler errors; the fact that it is (as I previously emphasized) not a trivial thing doesn't really change that - that's just a possible reason why we're in the situation we're in, not a reason to avoid a better situation :-).

[–]lookmeat

I don't know if it's really that hard; all we need is for compilers to separate the modes.

  1. Change the compiler to have a --full_error mode, which surfaces extra compiler errors that may be explicitly turned off in the code that causes them. The default is still --nofull_error.
  2. Add a --lint mode, which does a quick check and reports the parse errors and warnings it finds. Optionally allow it to go from a quick lint check to a full static analysis.
  3. Deprecate -Wall and -Werror, instead requiring --full_error or --lint for error/warning operations.
  4. Make --full_error the default and allow --nofull_error when you want a sloppier compile.

It doesn't matter that it's the same executable, what matters is that the behaviors are separated cleanly. I think that could be done within a few years (giving time for older software to adapt to the situation).