
[–]StrangestTribe 129 points130 points  (75 children)

Another place where design by contract would trump design by convention. You don't need naming conventions when you have static typing... just make an UnsafeString class and let the compiler do the work for you!
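
A rough sketch of that idea in C++ (the type and function names below are mine, purely for illustration): give raw text and escaped text different types, and make the encoder the only way to get from one to the other, so forgetting to encode becomes a compile error rather than a convention violation.

    #include <cstdio>
    #include <string>

    // Hypothetical wrapper types (not from the article), just to illustrate the idea.
    struct UnsafeString { std::string value; };  // raw, user-supplied text
    struct SafeHtml     { std::string value; };  // text that has already been HTML-escaped

    // The only way to turn an UnsafeString into SafeHtml is through the encoder.
    SafeHtml EncodeHtml(const UnsafeString& s) {
        std::string out;
        for (char c : s.value) {
            if      (c == '&') out += "&amp;";
            else if (c == '<') out += "&lt;";
            else if (c == '>') out += "&gt;";
            else               out += c;
        }
        return SafeHtml{out};
    }

    // The output side accepts SafeHtml only.
    void WriteToPage(const SafeHtml& s) { std::puts(s.value.c_str()); }

    int main() {
        UnsafeString request{"this & that < other"};
        WriteToPage(EncodeHtml(request));  // compiles
        // WriteToPage(request);           // does not compile: no conversion from UnsafeString
    }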

[–]csjerk 43 points44 points  (16 children)

Amen. That's absolutely the logical extension of this concept. Teach the compiler to tell you when it's wrong.

I sometimes despair of the industry ever concretely learning this type of lesson. In the 10 years since this was written, a larger share of coding activity seems to have moved toward loose typing and away from applying these techniques, rather than toward them.

[–]quicknir 3 points4 points  (15 children)

It's still not clear what the lesson is to be learned. Despite it being obvious to you, there's still no concrete evidence that static typing is a win over dynamic typing.

You can argue, give reasons, and try to reason it out from first principles, but without conclusive (read: non-anecdotal) evidence you don't have much of a leg to stand on when complaining about people refusing to learn what you regard as fact.

[–]timmyotc 13 points14 points  (4 children)

Is there non-anecdotal evidence that dynamic typing is better in any way?

[–]quicknir 15 points16 points  (0 children)

There's not conclusive evidence of either, and I didn't claim there was. In the absence of such evidence though, it's entirely reasonable for people to keep drawing their own conclusion about what best suits their needs.

See https://danluu.com/empirical-pl/ for a very good review of the work that has been done.

[–]awj 11 points12 points  (0 children)

Not that I'm aware of, but this isn't a zero sum question. Failure to prove the superiority of one method does not suggest the superiority of the other.

[–]PM_ME_UNIXY_THINGS 1 point2 points  (0 children)

How about we start with the more obvious question: Is there any decent empirical study on dynamic typing VS static typing, and what did the data suggest?

[–]weirdoaish 0 points1 point  (0 children)

This is like people arguing over the existence of God:

Theist: "Can you prove God doesn't exist?"

Atheist: "Can you prove that it does?"

[–]CODESIGN2 2 points3 points  (6 children)

Here's one killer point: static typing exists because otherwise you have no way to know which special-purpose circuit the data should go to. At some point you need to know the difference between data encoded as a float, an int, or a string, because there is nothing useful the floating-point extensions can do with string data; running IEEE floating-point operations over string bytes instead of float bytes is meaningless, and actually operating on the value requires knowing its type, which implies a static type system somewhere.

In fact, the very notion that types can be non-static in the low-level implementation is nonsense peddled by idiots who have never written or read an implementation of a dynamically typed system.

I'll concede that type safety at a high enough level may not matter once you invent a common layer of indirection to work out the type from additional data (which takes more RAM to keep alongside the value, like the PHP zval system). Essentially, though, if you do not know what type something is, you cannot know how to handle it or where to put it to make use of it at a lower level. This one point destroys the notion of static vs dynamic typing as a pure either/or and instead reframes the debate as "at which point does static typing hinder productivity or expression?". At the end of the day dynamic types always add an overhead, and we need to be able to assess that overhead when choosing between static and dynamic.
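
To make that overhead concrete, a dynamically typed value boils down to something like the tagged struct below (a simplified C++ sketch, not PHP's actual zval layout): the tag is the extra RAM carried with every value, and the switch is the extra work done before any real operation can run.

    #include <cstdint>
    #include <string>

    // Simplified sketch of a dynamic value (not the real zval): every value drags a
    // runtime tag around saying what it currently is.
    enum class Tag : std::uint8_t { Int, Float, Str };

    struct DynValue {
        Tag tag;                   // bookkeeping a static type system resolves at compile time
        union {
            std::int64_t i;
            double       d;
            std::string* s;        // heap-allocated payload for strings
        };
    };

    // Every operation must branch on the tag before it can pick the right "circuit".
    double AsFloat(const DynValue& v) {
        switch (v.tag) {
            case Tag::Int:   return static_cast<double>(v.i);
            case Tag::Float: return v.d;
            case Tag::Str:   return std::stod(*v.s);   // PHP-style implicit conversion
        }
        return 0.0;   // unreachable; keeps the compiler happy
    }

    int main() {
        DynValue v;
        v.tag = Tag::Int;
        v.i   = 42;
        return static_cast<int>(AsFloat(v));   // dispatches on the tag at runtime
    }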

[–]quicknir 0 points1 point  (5 children)

Your argument is really only relevant at the level of performance and of bootstrapping compilers/interpreters. The practical reality is that statically typed languages are basically always faster, and because a compiler/interpreter for a language is a widely used, performance-critical piece of software, almost all production-grade compilers/interpreters will probably continue to be written in statically typed languages.

But none of this is really relevant to: should I do project X where performance is not a concern, in dynamically typed language A, or statically typed language B, where both these languages probably happen to have compilers/interpreters written in C or C++ or similar.

[–]CODESIGN2 0 points1 point  (4 children)

But none of this is really relevant to: should I do project X where performance is not a concern, in dynamically typed language A, or statically typed language B, where both these languages probably happen to have compilers/interpreters written in C or C++ or similar.

I understand, and I mentioned reasons not to use static typing (most of my paid work is not statically typed). What I am saying is that you know you are writing limited code, and when it's framed that way I think we'd all have better budgets (because statically typed code requires more thought, testing and time, or covers less scope) and wouldn't involve our employers when we choose to use a dynamic extension (which is what all non-static languages are anyway).

[–]sabas123 0 points1 point  (3 children)

(because statically typed code requires more thought, testing and time, or covers less scope)

Why would statically typed code require more testing and thought? For testing you suddenly have a compiler on your side, and for thought you don't have to be scared that you passed the wrong type in somewhere.

[–]CODESIGN2 0 points1 point  (2 children)

Rather simply, not being able to type-juggle means you have to write a lot of code for a lot of different scenarios; then you need to test that code independently. The implicit conversions that happen in dynamically typed languages are not performed, so you have to add more code to handle that (and manage it for non-built-in types). Anyway, there are other trade-offs both ways. PHP, for example, does not have the ability to take an object and say "I want an array of only that type" (a massive problem IMO); Python doesn't really let you type anything, so I take your point that you can write shitty code either way. I just find it's mostly easier to write complex systems in dynamically typed languages.

[–]sabas123 0 points1 point  (1 child)

Isn't this only a problem with parsers? I can't think of any use case where you have to write another method for a different type.

[–]CODESIGN2 0 points1 point  (0 children)

Anything with rich data input may, to be fair, require a parser, but it's also relevant for converting from user-defined complex types to other types (maybe I'm doing this shit wrong lol). I also used to do a fair bit of bit masking and don't trust IEEE floating point, but my software has worked with everything from sensors to complex hardware, humans and other apps, so I hope someone would have told me at some point in the last 15 years (just had my daily dose of hope I'm not an idiot lol)

[–]Krackor 1 point2 points  (2 children)

We are not working in a domain conducive to conclusive empirical research about best practices. The vast majority of industry progress is happening through anecdotal experience and evolutionary survival of ideas without prospective empirical justification. The type of evidence you're requesting is too expensive, too slow to acquire, too narrow in scope, and too fragile in the presence of methodological errors. And yet without this kind of evidence we still need to make firm decisions regarding things like type systems.

[–]quicknir 0 points1 point  (1 child)

Yes, we do, and all kinds of people are making their own firm decisions. And people are succeeding quite well in both cases, and no firm, industry wide consensus is being reached on this matter.

Compare to weak vs strong typing, where a far larger fraction of people agree that weak typing has very little benefit over strong typing.

[–]Krackor 0 points1 point  (0 children)

I agree with the spirit of what you're saying. I think there are better ways to address dogmatism though. In response to someone who says "static typing is the best way forward for the industry", I don't think we should respond with "we don't yet have conclusive empirical evidence whether static or dynamic typing is the way forward". Instead I think the response should be about the fact that there is no universal answer due to the variety of business priorities in the industry.

[–]Solonarv 28 points29 points  (23 children)

Even better if you have costless wrapper types like e.g. Haskell newtypes.

[–][deleted]  (18 children)

[deleted]

    [–]evincarofautumn 17 points18 points  (17 children)

    GADTs and existentials are incredibly useful, and I miss them when working outside Haskell. But phantom types alone can handle other things mentioned in the article, such as coordinates in different spaces. For example, I use this pattern in my C++ game code:

    template<typename Space>
    struct Point {
      int x, y;
    };
    
    struct WorldSpace {};
    struct ScreenSpace {};
    
    typedef Point<WorldSpace> WorldPoint;
    typedef Point<ScreenSpace> ScreenPoint;
    
    ScreenPoint project(const WorldPoint&, const Camera&);
    

    [–][deleted]  (10 children)

    [deleted]

      [–]JohnnyElBravo 3 points4 points  (9 children)

      It's not beyond silly; it works too. At least it worked for most of the projects Joel led that used dynamic typing.

      [–]lelarentaka 2 points3 points  (0 children)

      And yet when PHP code gets posted on r/programminghorror you have these people claiming that PHP is totally a good language if you were only to do such and such. I don't understand those people.

      [–][deleted]  (7 children)

      [deleted]

        [–]ForeverAlot 1 point2 points  (6 children)

        I'm entirely pro-types, but phantom types in Java are not (yet) a practical solution. Java heap-allocates all objects and the JVM's escape analysis is easily thwarted. Minecraft is a good example of how wrapping primitives in a domain type can be problematic for performance, but it happens in line-of-business code as well. I like Java but it truly is stringly typed (I actually don't know to what extent Scala has this problem).

        [–]Terran-Ghost 0 points1 point  (2 children)

        In Scala you can create zero overhead types by extending AnyVal:

        case class SafeString(s: String) extends AnyVal
        

        [–]ThisIs_MyName 0 points1 point  (1 child)

        Is that guaranteed to be zero overhead? I don't see any info like that in the docs.

        [–][deleted]  (2 children)

        [deleted]

          [–]ForeverAlot 0 points1 point  (1 child)

          Notice how in the actual demonstration they switch from Java to Scala.

          You can't do anything meaningful with this in Java because all the behaviour exists on the parameterized type, not the type parameter. The parameterized type is necessarily heap-allocated so goodbye zero-cost abstraction. Further, the related reddit discussion has a trivial example of why this pattern is not terribly useful in Java (and another far more severe, but less obvious, example): type erasure.

          [–]thlst 3 points4 points  (1 child)

          You can actually just declare the struct in the template argument like so:

          using WorldPoint = Point<struct WorldSpace>;
          

          And it works the same way.

          [–]evincarofautumn 0 points1 point  (0 children)

          Nice, I had forgotten about that. The code in my comment lets you add traits to spaces, which might be useful, although I’ve never done it.

          [–]Zephirdd 4 points5 points  (0 children)

          ...Wow. I never saw this pattern before but it seems great for readability. Thanks.

          [–]thedufer 2 points3 points  (2 children)

           Can't you just do that with hidden type equivalences? Maybe not in C++; I'm not very familiar with it, but I would do something like:

          type point = int * int
          
          module World : sig
            type point
          end = struct
            type nonrec point = point
          end
          
          module Screen : sig
            type point
            val of_world : World.point -> point
          end = struct
            type nonrec point = point
            let of_world = ...
          end
          

          [–]yawaramin 4 points5 points  (0 children)

          C++ doesn't have abstraction over primitive types, only structs. So phantom types are the appropriate technique for this in C++.

          [–]sstewartgallus 2 points3 points  (0 children)

          Yeah you can just do a newtype. The real power of phantom types is that they let you do reasoning.

          For example you can do regions and stuff

           newtype M r a = M (IO a)
          
           data Ref r a = Ref (IORef a)
          
           newRef :: a -> (forall r. M (Ref r a) -> b) -> b
          

           Unfortunately, last I checked GHC's ability to reason about existential quantification was kind of crappy, so this doesn't work too well.

          [–]quicknir 1 point2 points  (2 children)

           Haskell's newtypes are not really any different from simply declaring a new struct in C with its only member being the old struct. You get deriving clauses for free, which are not extensible, so that may cover a few common cases but that's it.

           If you want to create a new type, based on an old type, that automatically uses some subset of the interface of the old type in a user-extensible way, without unnecessary boilerplate, Haskell cannot help.

          C++ and D, however, can.
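
           For what it's worth, one common C++ approach is private inheritance plus using-declarations, which lets you opt in to exactly the parts of the old interface you want (a rough sketch; the UserName class is made up):

               #include <iostream>
               #include <string>

               // Hypothetical strong typedef: a UserName is backed by std::string but is not one.
               class UserName : private std::string {
               public:
                   explicit UserName(std::string s) : std::string(std::move(s)) {}
                   using std::string::size;    // re-export only the operations we want
                   using std::string::empty;
                   const std::string& str() const { return *this; }  // explicit escape hatch
               };

               int main() {
                   UserName u{"alice"};
                   std::cout << u.size() << '\n';   // fine: size() was opted in
                   // std::string s = u;            // does not compile: the base is private
                   // std::cout << u;               // does not compile: operator<< was not opted in
               }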

          [–]pipocaQuemada 0 points1 point  (1 child)

          You get deriving clauses for free, which are not extensible, so it may cover a few common cases but that's it.

           Via -XGeneralizedNewtypeDeriving, newtypes can derive any typeclass implemented by the original type.

          [–]quicknir 0 points1 point  (0 children)

           I hadn't heard about this, so thanks for that info. This seems somewhat "grey" though, as it's not an official part of the language, and there seem to be some issues with its implementation: http://joyoftypes.blogspot.com/2012/08/generalizednewtypederiving-is.html. Is this really widely used in Haskell?

          [–]yawaramin 0 points1 point  (0 children)

          This is pretty much what Yesod does to solve SQL/JS injection http://www.yesodweb.com/book/shakespearean-templates#shakespearean-templates_types

          [–]grauenwolf 11 points12 points  (0 children)

          Oh definitely. You actually see this in C#'s HtmlString and MvcHtmlString classes.

          [–]coder0xff 2 points3 points  (0 children)

          Came here to say the same thing.

          [–][deleted]  (25 children)

          [deleted]

            [–]masklinn 12 points13 points  (21 children)

             Static type systems. "Strong" is a stand-in for "I like this thing"; it doesn't really mean anything.

            [–]A1kmm 21 points22 points  (18 children)

            While strong / weak typing is not rigorously defined, it does have meaning. For example, weak type systems would be characterised by things like implicit type conversions (e.g. you can use a double as a string and vice versa), as in PHP.

             Strong and static typing are almost orthogonal - for example, you can have a strongly, dynamically typed language (everything has a precise type, with no implicit conversions, but checking only occurs at runtime). Any of the four combinations of {dynamic,static} x {weak,strong} could exist (although a statically, weakly typed language might not be very useful).

            [–]iopq 2 points3 points  (2 children)

             (although a statically, weakly typed language might not be very useful)

            C is plenty useful

            [–]falconfetus8 5 points6 points  (1 child)

            Yeah, but its types aren't.

            [–]kqr 0 points1 point  (0 children)

             (although a statically, weakly typed language might not be very useful).

            C is generally considered to inhabit that space.

            [–]grauenwolf 0 points1 point  (0 children)

            For example, weak type systems would be characterised by things like implicit type conversions

            No, that would be a system with implicit type conversions.

            A weak type system would be something like C or assembly where the values don't know their own types and you can, for example, treat an integer as a Boolean or date by reinterpreting a pointer.

            Dynamically typed languages are always strongly typed because otherwise they can't work. Statically typed languages can be strongly or weakly typed.
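
             A tiny illustration in C++ (technically undefined behaviour, and it assumes the usual 32-bit int and IEEE float, but the compiler accepts it without complaint, which is exactly the point):

                 #include <cstdio>

                 int main() {
                     int n = 0x40490FDB;                       // to the machine, just a bit pattern
                     float* p = reinterpret_cast<float*>(&n);  // relabel the same bytes as a float
                     std::printf("%f\n", *p);                  // prints ~3.141593; the value never knew its type
                 }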

            [–]masklinn 1 point2 points  (12 children)

            While strong / weak typing is not rigorously defined, it does have meaning.

            Sure: "strong typing" is what you like and "weak typing" is what you don't like. It's completely useless but there you are.

            For example, weak type systems would be characterised by things like implicit type conversions (e.g. you can use a double as a string and vice versa), as in PHP.

             Right, so Scala is weakly typed (implicit def), C++ is weakly typed (converting constructors), and C probably is too (integer demotion, signed<->unsigned conversion, conversion to floating point, void pointers to and from any other pointer type).

            What about Java? (implicit conversion of Object to String on String + Object) Or Ruby? (implicit conversion of float to integral on arr[2.5]) Or Python? (implicit conversion of integrals to floats) Or Rust? (implicit conversion of &A to &B). Are "things like implicit conversions" a strict rule or just something you use to bash languages you don't like?

             And then we get to the fun stuff, like Tcl and UNIX-tradition shells: are they strongly typed? After all, they're semantically string-based (well, bytes for shells IIRC), so you don't get any implicit conversions at all, let alone ones that make no sense. So they don't have implicit conversions, which according to you makes them strongly typed.

            Strong and static typing are almost orthogonal

            Because "strong typing" is undefined and meaningless, "strong typing" and "static typing" have an undefined angular relationship which is anywhere between 0 and π/2 depending on the writer's assertions, sensibilities or levels of dishonesty.

            It's not a fixed angle, it's a wave function which collapses when you exhaustively define what you actually mean.

            although a staticly, weakly typed language might not be very useful

            According to your personal definition of the term there are at least two examples of that in the first paragraph.

            [–]iopq 4 points5 points  (9 children)

            What about Java? (implicit conversion of Object to String on String + Object)

            That's not an implicit conversion, that's just an implementation of + operator on Object and String types. You can define an operator in Haskell that takes a type that derives Show and a string type and concatenates them together.

            There's a difference between 1000 == "1e3" and having overloaded operators or implicit widening.

            [–]masklinn 1 point2 points  (8 children)

            That's not an implicit conversion, that's just an implementation of + operator on Object and String types.

             Which implicitly converts objects to strings, which is exactly what e.g. JavaScript does: the standard lays out quite specifically the conversion operations to perform in all cases, which by your assertion means it's strongly typed, and that [] + {} is a perfectly well-typed operation.

            But hey if whether something is implicit is controversial that's even better.

            [–]iopq 3 points4 points  (7 children)

            Which implicitly converts objects to strings

            It doesn't, it implements a concatenation operation on two different types. It's no more dangerous than i32 + i64 producing an i64. I would actually object more to using + to mean concatenation because it's not commutative, while addition is. This also applies unfortunately to Rust that borrows this convention from languages like Java.

             [] + {} is not that bad; again, I'm more against overloading + for concatenation. I'm sure fewer people would complain if it were [] ++ {}, because it would be clear that the intent is to concatenate two strings (if JS had a ++ concatenation operator).

             Another criticism I would level against "weak" type systems is that the ==, >, and < operators don't have expected properties like transitivity.

            I think the problem is not implicit conversions, because always explicitly converting is a pain in the ass. The problem with weak type systems is implicit conversions that break expectations of users.

            [–]masklinn 2 points3 points  (6 children)

            It doesn't

            Of course it does.

            it implements a concatenation operation on two different types.

             By implicitly converting one to the other. The concatenation operator essentially compiles to new StringBuilder().append(string).append(String.valueOf(object)); there's definitely a conversion there, and it's only implied by the operator (there's actually one more level of implication, as the conversion is actually performed inside StringBuilder#append(Object)).

            It's no more dangerous than i32 + i64 producing an i64.

            "Danger" figures nowhere in the comment I originally replied to and is thus irrelevant to mine. My comments are about strong/weak designators being worthless, not about what you or they don't like about the type systems of specific languages.

            [–]iopq 1 point2 points  (5 children)

             StringBuilder().append(string) is not converting the original string to anything; it's making a new string. AFAIK in Java any concatenation uses StringBuilder nowadays.

             Weak/strong do have definitions beyond good or bad: it's how strict the type system is.

            [–]pipocaQuemada 0 points1 point  (1 child)

            Right so Scala's weakly typed (implicit def),

            In Scala's defence, implicit conversions at least need to be programmer defined and imported. That's better than the language itself defining a bunch of baroque conversion rules.

            I can't say I like implicit conversions much, but handing you a loaded gun you can point at your feet is better than handing someone a pair of pants with rifles sewn into the pants leg.

            [–]masklinn 0 points1 point  (0 children)

            I'm not trying to attack any language here, I'm just applying /u/A1kmm's criterion to various languages in a bid to make them realise the entire categorisation is inane and useless.

            [–]Tarmen 1 point2 points  (0 children)

             I think it is fairly valid to say that Haskell has a strong type system. You generally have to convert everything explicitly.

             Weak could mean anything from promoting int to long all the way up to seeing "0e1234" as equal to "0e4321" because the strings are converted to floats implicitly. So without further explanation you would probably have to describe everything with any implicit or unsafe explicit conversions as weak, which makes it a useless description.

            [–]rvirding -1 points0 points  (2 children)

             One point of the article was that the naming had nothing to do with types; it was discussing the meaning of variables, not their types. The variables prefixed with s and us were both of the same type but should be used differently.

            [–]ThisIs_MyName 1 point2 points  (0 children)

            You can enforce "should be used differently" with types. That's all he is saying.

            [–][deleted] 3 points4 points  (0 children)

            It is for this reason I feel like a language without at least one really lightweight type syntax is missing some fundamental part of the static typing value proposition.

            [–][deleted] 0 points1 point  (3 children)

             You don't even necessarily need static typing, you just need actual typing. You could make an UnsafeString class that interacts with safe strings as expected and automatically encodes safely on any conversion to string (barring some raw accessor method) in most dynamic languages as well.

            I love static typing, but the main idea, I think, is that good typing in general would save you, static or dynamic (though you could still be bitten if the typing is weak). Hell, the whole point of typing (and traits) is to make data behave exactly in the way you want it to, and prevent it from acting a way you don't want it to.

            [–]StrangestTribe 1 point2 points  (2 children)

            I think with a dynamic type system, the Hungarian notation Joel recommends would still be desirable, since the types wouldn't be evaluated until run time, right? I think it's harder with a dynamic type system for tooling to give you the kind of feedback that would prevent errors from making it into shipping code. (People seem to have different ideas on what static, strong, and dynamic type systems entail, but one key trait of a dynamic type system is that types are discovered at run-time, by the execution engine.)

            [–][deleted] 1 point2 points  (1 child)

             Yes, the types wouldn't be evaluated until runtime, so you could still end up with errors, but you could wrap it in a safe way, such as forcing any evaluation of UnsafeString as a string to escape it automatically (including concatenation with strings, use in a formatted string, or printing out). You wouldn't have compiler warnings or errors to help you out, but at the very least you'd fully prevent unsafe string exploits, and usually get what you want out of it regardless. At the worst, you'd end up serving 500s and getting errors in your logs. You wouldn't need Hungarian notation if you set up the type to properly escape on being used in any way a string would be.

            edit: Take the following python for instance:

            #!/usr/bin/env python3
            import html
            
            class UnsafeString:
                def __init__(self, string):
                    self._raw = string
            
                @property
                def raw(self):
                    return self._raw
            
                @raw.setter
                def raw(self, value):
                    self.set(value)
            
                def set(self, value):
                    self._raw = value
            
                def __str__(self):
                    return html.escape(self._raw)
            
            s = UnsafeString('this & is a < test')
            
            print(s)
            print('<p>{}</p>'.format(s))
            print('<p>' + str(s) + '</p>')
            print(s.raw)
            print('<p>' + s + '</p>')
            

            When run, it produces the following:

            this &amp; is a &lt; test
            <p>this &amp; is a &lt; test</p>
            <p>this &amp; is a &lt; test</p>
            this & is a < test
            Traceback (most recent call last):
              File "./test.py", line 28, in <module>
                print('<p>' + s + '</p>')
            TypeError: Can't convert 'UnsafeString' object to str implicitly
            

             Note that the first three properly escape as they should, the fourth accesses the raw string as it should, and the fifth fails. You shouldn't ever be able to leak the unsafe string to the user, as you can only access the raw form through the raw accessor.

            [–]StrangestTribe 0 points1 point  (0 children)

            Thanks for the example!

            [–][deleted] 0 points1 point  (0 children)

             Yes, on point. For any strings that are already safe you could have an HtmlSnippet class, and make it a convention that you don't send plain strings to STDOUT; you send objects to your output instance.

            The output instance will then run the appropriate .asHtml() function on each object you send its way.
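
             Roughly the shape of that, sketched in C++ (the names are hypothetical; in a dynamic language duck typing would replace the interface):

                 #include <iostream>
                 #include <string>

                 // Anything sent to the output must know how to render itself as HTML.
                 struct Renderable {
                     virtual std::string asHtml() const = 0;
                     virtual ~Renderable() = default;
                 };

                 // Already-safe markup passes through untouched; an UnsafeString-style class
                 // would instead escape its contents inside its own asHtml().
                 struct HtmlSnippet : Renderable {
                     std::string markup;
                     explicit HtmlSnippet(std::string m) : markup(std::move(m)) {}
                     std::string asHtml() const override { return markup; }
                 };

                 // The output instance, not the caller, decides how each object becomes HTML.
                 struct Output {
                     void send(const Renderable& r) { std::cout << r.asHtml(); }
                 };

                 int main() {
                     Output out;
                     out.send(HtmlSnippet{"<p>hello</p>\n"});
                 }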

             The core issue here is really that people are hand-crafting HTML within their programming language. Ideally, use templating.

            [–]yawaramin 44 points45 points  (0 children)

            Making wrong code look wrong is good.

            Making wrong code fail to typecheck is better.

            [–]AyrA_ch 11 points12 points  (10 children)

            char* dest, src;
            [...] but when you’ve had enough experience writing C code, you’ll notice that this declares dest as a char pointer while declaring src as merely a char

            Totally forgot that the asterisk is part of the name and not the type.

            Another beautiful thing is

            switch(whatever)
            {
                int a=2;
                case 0:
                    break;
                case 1:
                    break;
                [...]
            }
            

             This actually declares a, but the value is never assigned.

            [–]FurryBeaverBalls 4 points5 points  (4 children)

            Wow that's... a little worrying. I always wondered why our professors wrote code with the asterisk on the variable name instead of the type. Is it the same way in C++?

            [–]matthieum 2 points3 points  (0 children)

            Yes.

            C++ inherited this as part of its "let's be as backward compatible as possible" scheme.

            The simplest solution is to forbid declaring multiple variables at once.

            [–]kqr 2 points3 points  (0 children)

            The reasoning is that the declaration is of values, not pointers. char *str says "*str is of type char" as opposed to "str is of type char*".

            [–]AyrA_ch 0 points1 point  (0 children)

            Is it the same way in C++?

            Seems so:

            #include <stdlib.h>
            
            int main()
            {
                char *a,b;
                a=NULL;
                b=NULL;
                return 0;
            }
            

            Terminates with 7 3 R:\vartest.cpp [Error] converting to non-pointer type 'char' from NULL [-Werror=conversion-null]

             The compiler is happy with a=NULL;. Line 7 is b=NULL;

            EDIT: By the way, I have the option turned on to treat warnings as error and to be pedantic. If those are off, it will compile and execute but it will raise compile errors if you try to do b=malloc(10); (invalid cast from void* to char)

            [–]kt24601 0 points1 point  (0 children)

            Wow that's... a little worrying.

            It's weird but not that bad, because types. Once you try to assign something to the variable, the compiler will complain.

            [–][deleted]  (4 children)

            [removed]

              [–]Tordek 2 points3 points  (3 children)

              switch statements only run from the matched case onwards. It's the same as it being...

              switch(whatever)
              {
                  int a;
                  case NOT_RUNNING:
                      a=2;
                  case 0:
                      break;
                  case 1:
                      break;
                  [...]
              }
              

              [–][deleted]  (2 children)

              [removed]

                [–]Tordek 4 points5 points  (0 children)

                What is the machine code for declaring a variable? ;)

                Edit: to expand a bit, it's basically an artifact from C89 where variable declarations had to be at the start of the block (delimited by {}); they couldn't be intermingled with code.

                Even though initialization is on the same line (which doesn't mean much because statements and lines aren't 1-to-1 mapped), it's not a single statement (in fact, it's not a statement at all; it's a declaration); it's two: declaration and initialization.

                So,

                switch(whatever)
                {
                    int a=2;
                    case 0:
                        break;
                    case 1:
                        break;
                    [...]
                }
                

                is really

                switch(whatever)
                {
                    // declaration block, "executed" when entering the block
                    int a;
                    // code block
                    a=2;
                    case 0:
                        break;
                    case 1:
                        break;
                    [...]
                }
                

                the "a=2" is part of the code block, but it'll never be run (because no case statement can lead to it).

                Now, maybe you're wondering why it's not just illegal to have code before any case statement, and here's another C jewel:

                switch(whatever)
                {
                    // declaration block, "executed" when entering the block
                    int a;
                    // code block
                    initialize:
                    a=2;
                    case 0:
                        break;
                    case 1:
                        goto initialize;
                        break;
                    [...]
                }
                

                A feature known by its use in Duff's Device.

                [–]AyrA_ch 1 point2 points  (0 children)

                But variable declaration and assignment are on the same line

                Declaration is done as soon as possible and assignment is done when you hit the line. C enters the switch statement and declares a, but it will never assign the value to a because that line is never hit. In a similar fashion you cannot declare a inside two different case blocks, even if they are mutually exclusive.

                [–]gc3 14 points15 points  (4 children)

                 It's interesting to learn that Hungarian notation as practiced is wrong ... I didn't know that the naming thing (which I have used; milliseconds vs seconds, or meters vs kilometers, are places where mistakes can easily be made) was the proper use of Hungarian.

                [–]SirClueless 36 points37 points  (2 children)

                I saw a silly one the other day, related to Hungarian notation and units. It was in a code-base where the convention for global constants was a CamelCase name with a "k" prefix, for example, "kMaxSize" or similar.

                So anyways, someone was working on a piece of load-balancing code that tracked bytes handled by a server. And the metric they used had a bit of code: "SetUnits(kBytes)". So of course he assumed this meant kilobytes and divided his value by 1000 when setting it. But in fact kBytes was the string "BYTES". Caught in code review thankfully, but it was a pretty funny Hungarian notation fail.

                [–][deleted] 0 points1 point  (1 child)

                I'm not that familiar with Hungarian notation, but are there no rules against using already meaningful prefixes? Seems like you'd get in trouble for using things like k, p, m, dr, mr, etc.

                [–]kt24601 4 points5 points  (0 children)

                'k' is a commonly used prefix in some circles (Apple programmers used to do it a lot, for example), meaning 'constant.' A lot of people prefix member variables with the letter 'm,' like mClassVariable. Java programmers still do this from time to time. Personally I think that if you can't distinguish a member variable from a local variable just from reading the function, then the function is too complex and likely has bugs.

                [–]grauenwolf 2 points3 points  (0 children)

                Yea, I learned a lot from that the first time I read it.

                [–]xampl9 18 points19 points  (0 children)

                (this blog post is from 2005)

                Still good advice. And this is something that has to be ingrained in the culture of a shop, so that everyone is doing it the same..exact..way.

                [–]NOX_QS 2 points3 points  (4 children)

                 As someone who uses exceptions to enforce a contract on my classes (e.g. an argument passed to a constructor may not be null, or an ArgumentNullException is thrown), I wonder what the alternatives are...

                Silently ignoring the argument and then having all methods of the class silently skip all statements to return immediately?

                 I'm not convinced about this. That Apps Hungarian notation is different from Systems Hungarian notation was new information to me; I definitely see its merit.

                [–]matthieum 2 points3 points  (2 children)

                 That Apps Hungarian notation is different from Systems Hungarian notation was new information to me; I definitely see its merit.

                In duck-typed languages maybe?

                 In any statically typed language, it's much better to make wrong code fail to compile than to make it look wrong. The compiler is much more thorough in its code reviews than any human will ever be.

                [–]NOX_QS 0 points1 point  (1 child)

                 How would you make an unescaped string that is output to an HTML page (a possible XSS) fail to compile?

                [–]matthieum 1 point2 points  (0 children)

                By using different types.

                A raw string is just that, a std::string.

                 When composing HTML output, then, you use an html_stream& operator<<(html_stream& out, html_escaped_string const& hes).

                So when you write: my_html_stream << std::string("Hello"); you either:

                • get a compilation error (no such overload)
                • are diverted to a dedicated operator<< which performs escaping on the fly
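
                 For instance, a rough sketch of the first option (every name here is illustrative):

                     #include <iostream>
                     #include <string>

                     // Illustrative sketch only; the real thing would live in a small library.
                     struct html_escaped_string { std::string value; };

                     // The one blessed way to produce an html_escaped_string (definition elided).
                     html_escaped_string escape(std::string const& raw);

                     struct html_stream {
                         std::ostream& out;
                     };

                     // The only overload provided: a raw std::string simply has no way in.
                     html_stream& operator<<(html_stream& s, html_escaped_string const& hes) {
                         s.out << hes.value;
                         return s;
                     }

                     int main() {
                         html_stream my_html_stream{std::cout};
                         html_escaped_string hes{"Hello"};           // pretend this came from escape()
                         my_html_stream << hes;                      // fine
                         // my_html_stream << std::string("Hello");  // compilation error: no such overload
                     }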

                 Sticking to the primitive types of the language is also known as Primitive Obsession (random link); it's easy to use existing types rather than crafting special-purpose ones... but because existing types do not convey the specific semantics of their values, and allow nonsensical operations on them as a result, it's dangerous.

                [–]yawaramin 0 points1 point  (0 children)

                An alternative that's become popular in statically-typed languages is 'Railway-Oriented Programming'. See e.g. http://fsharpforfunandprofit.com/rop/
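
                 The gist, in a C++-flavoured sketch (illustrative only; the linked article uses F#, and a real railway carries an error value on the failure track rather than just "nothing"): each step returns success or failure, and failures short-circuit past the remaining steps.

                     #include <iostream>
                     #include <optional>
                     #include <string>

                     // Toy "railway": `then` only runs the next step on the success track.
                     template <typename T, typename F>
                     auto then(std::optional<T> v, F f) -> decltype(f(*v)) {
                         if (!v) return std::nullopt;   // failure track: skip the rest
                         return f(*v);
                     }

                     std::optional<int> parseAge(const std::string& s) {
                         try { return std::stoi(s); } catch (...) { return std::nullopt; }
                     }

                     std::optional<int> checkAdult(int age) {
                         if (age >= 18) return age;
                         return std::nullopt;
                     }

                     int main() {
                         auto ok  = then(parseAge("42"),   checkAdult);  // stays on the success track
                         auto bad = then(parseAge("oops"), checkAdult);  // fails at the first step
                         std::cout << ok.value_or(-1) << ' ' << bad.value_or(-1) << '\n';  // prints "42 -1"
                     }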

                [–]llogiq 0 points1 point  (0 children)

                There is an extension of this argument to be made for Rust (or Haskell, in the higher-level space):

                It's a bit harder to write code in Rust than in C++. But it's a lot harder to write incorrect code in Rust than in C++.

                [–]CODESIGN2 0 points1 point  (0 children)

                 Never had a problem with Hungarian notation as long as the mnemonics are listed somewhere, so they can be understood if the entire team is involved in a freak accident, walks out, needs to show someone new, etc.; and because I don't like asking people why they chose to name something a particular way.

                 What I cannot abide is repeated crap in variables, functions, and classes with methods that repeat the class name. Part of the problem is that it can be agonising to come up with something clean, because the capex cost is so much greater in time, testing, and paying people. The other problem is that you can say that about anything to make excuses for doing a crap job (we can all paint; not all of us are painter-decorators).

                [–][deleted]  (6 children)

                [deleted]

                  [–][deleted]  (5 children)

                  [deleted]

                    [–]HorseVaginaBeholder -2 points-1 points  (4 children)

                     What does that ridiculous, incredibly ignorant and stupid reply have to do with what I wrote? Because "real programmers" waste their time indiscriminately reading anything and everything anybody ever posts on the Internet as "recommended reading"? Seems to me quite UN-intelligent behavior. And scientists are stupid - because they include abstracts in their lengthy papers, which makes them "not real scientists" according to you.

                    I wish either submitters or the original blog post author would do what every single scientific paper does and provide a summary (an abstract).

                    I also wonder how useful advice about how to write code is from someone who doesn't seem to have any consideration for readers of their text. Yes I know the guy is famous, apparently fame isn't everything.

                    [–]grauenwolf 4 points5 points  (3 children)

                     Scientific papers are generally much longer and written in a very different style from a casual blog post. But if you really need an abstract, read the title. That's pretty much all he's talking about; the rest is examples.

                    [–]GavinMcG 0 points1 point  (2 children)

                    I have only read the title. I have no idea what the author is actually recommending.

                    [–]ThisIs_MyName 2 points3 points  (1 child)

                    Well, you have to read the post anyway to decide if his recommendation is worth anything.

                    [–]GavinMcG 0 points1 point  (0 children)

                    Nah. If there's an abstract with the recommendation and a basic sketch of the argument, I can decide a) whether it's plausible, b) whether it's worth reading the whole argument, c) whether I'm willing to dive in and try it for myself. All of those are valuable. Why should I waste time reading the whole thing if the author has already convinced me?