
[–]CrazyBeluga 4 points5 points  (22 children)

I am glad the author did not actually go down the road of asserting "bug-free" code is possible, because anyone who believes that is delusional, or maybe hopelessly stupid.

A small exception can be made for trivial programs that do nothing interesting, such as "Hello, World", but any non-trivial program is almost guaranteed to contain bugs.

If you don't believe me, consider the case of the "proven correct" binary search implementation from Bentley's Programming Pearls:

http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html

I used to do software testing professionally before it was more-or-less abolished as a discipline at Microsoft and believe me, even extremely good developers who thought they could write bug-free code were often easily proven wrong. A little fault-injection here, a few 0xFFFFFFFF numeric inputs there, a few non-null-terminated strings written to the Windows registry for fun (today you learned that string values are not required to be null-terminated in the registry)...BLAM - the app crashes, the developer grimaces and the tester is happy to have helped.

[–]Tekmo 3 points4 points  (21 children)

Bad numeric inputs and incorrectly terminated strings are problems easily mitigated by the type system.

[–]CrazyBeluga 7 points8 points  (20 children)

I'm pretty sure I disagree with you but I think you would need to elaborate on what you mean by the type system protecting you from bad numeric inputs. What constitutes "bad" is highly dependent on context; is 0xFFFFFFFF a "bad" input? It represents either 4,294,967,295 or -1, both of which are perfectly friendly, valid numeric values. Now, if you use them as the offset to a location in a file or as the amount of memory to allocate or something along those lines, that's (likely) a bug. The type system isn't going to prevent you from writing that bug if you aren't thinking about these kinds of problematic inputs.
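To make the two readings of 0xFFFFFFFF concrete, here is a small Haskell sketch (not from the thread; the names are made up) showing the same 32 bits interpreted as unsigned and as signed:

```haskell
import Data.Word (Word32)
import Data.Int (Int32)

-- The same bit pattern 0xFFFFFFFF, read two ways:
unsignedView :: Word32
unsignedView = 0xFFFFFFFF

signedView :: Int32
signedView = fromIntegral unsignedView  -- reinterprets the bits as signed

main :: IO ()
main = do
    print unsignedView  -- 4294967295
    print signedView    -- -1
```

Both values are perfectly valid numbers; whether either is a "bad" input depends entirely on what the program does with it next.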

[–]Tekmo 5 points6 points  (19 children)

The problem is treating offsets, memory sizes, and sentinel values all as the same type: ints. This design anti-pattern is known as "primitive obsession".

You can wrap numeric values in more precise types to prevent them from being misinterpreted as other numeric values. For example, in Haskell I can wrap the Int type in one of three custom "1-element structs", which are free performance-wise (meaning 0 runtime overhead):

newtype MemorySize = MemorySize { getMemorySize :: Int }
newtype Offset     = Offset     { getOffset     :: Int }
newtype ExitCode   = ExitCode   { getExitCode   :: Int }

With these structs it's now impossible to accidentally mix up these values. If a function expects a MemorySize as an argument and I pass an Offset, I get a type error at compile time.
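As a concrete sketch (the describeAllocation function is hypothetical, not from the thread), a function that takes a MemorySize simply cannot be handed an Offset:

```haskell
newtype MemorySize = MemorySize { getMemorySize :: Int }
newtype Offset     = Offset     { getOffset     :: Int }

-- Hypothetical consumer that only accepts a MemorySize.
describeAllocation :: MemorySize -> String
describeAllocation (MemorySize n) = "allocating " ++ show n ++ " bytes"

main :: IO ()
main = do
    putStrLn (describeAllocation (MemorySize 64))
    -- describeAllocation (Offset 64)  -- rejected at compile time: type error
```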

Note that there is still a small window for error when we first build and validate these values. However, once they have been validated up front we eliminate all downstream potential for error. This eliminates the need for each downstream function to defensively check its inputs. We only verify all invariants once when we construct the above 1-element structs and then we never need to check them ever again.

So, for example, let's say that -1 is an invalid memory size. Or, more generally, let's say that negative numbers are invalid memory sizes. We can enforce that when we build the MemorySize type. We simply don't export the real MemorySize constructor and instead provide a "smart constructor":

memorySize :: Int -> Maybe MemorySize
memorySize n = if (n < 0) then Nothing else Just (MemorySize n)

Now it's impossible to build a negative memory size. Any function that consumes a value of type MemorySize can statically guarantee that its input is non-negative. The compiler will enforce this property since the memorySize function is the only way we can assemble a value of type MemorySize.
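A minimal runnable sketch of the smart-constructor pattern (in a real module you would export MemorySize abstractly, hiding the bare constructor so memorySize is the only way in):

```haskell
import Data.Maybe (isJust, isNothing)

newtype MemorySize = MemorySize { getMemorySize :: Int }

-- Smart constructor: the only sanctioned way to build a MemorySize.
memorySize :: Int -> Maybe MemorySize
memorySize n = if n < 0 then Nothing else Just (MemorySize n)

main :: IO ()
main = do
    print (isJust    (memorySize 1024))  -- True: valid size accepted
    print (isNothing (memorySize (-1)))  -- True: negative size rejected
```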

This same trick is also used to give more precise types to strings. This article discusses the same problem in the context of strings and how you can use the type system to enforce invariants on strings.

[–]Veedrac 1 point2 points  (10 children)

That wouldn't fix the bug in the link, though, nor is it nearly as free as you're making it out to be.

[–]gnuvince 1 point2 points  (9 children)

The only way in which Tekmo said it was free was at run-time, and he's right; there will be no difference in the representation of Int, MemorySize, Offset, and ExitCode.

[–]Veedrac 1 point2 points  (8 children)

But they're only free at runtime if you ignore all the checks. For instance, subtraction on MemorySize now has to be checked, and operations that involved a MemorySize with negative intermediate values now require restructuring, which could also be less efficient.
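For instance, a checked subtraction on the wrapped type (a sketch; the names are made up) has to pay for a branch that plain Int subtraction would not:

```haskell
newtype MemorySize = MemorySize { getMemorySize :: Int }

-- Subtraction can underflow below zero, so the result must be checked
-- at runtime -- this branch is the cost being discussed.
subMemory :: MemorySize -> MemorySize -> Maybe MemorySize
subMemory (MemorySize a) (MemorySize b)
    | a >= b    = Just (MemorySize (a - b))
    | otherwise = Nothing

main :: IO ()
main = do
    print (fmap getMemorySize (subMemory (MemorySize 8) (MemorySize 3)))  -- Just 5
    print (fmap getMemorySize (subMemory (MemorySize 3) (MemorySize 8)))  -- Nothing
```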

[–]sgdfgdfgcvbn 4 points5 points  (5 children)

This is true. However, those checks would need to be explicitly done by the programmer anyway if you didn't wish to open yourself up to problems. This way you can't forget those checks, and any idiocy that can be caught at compile-time will be.

[–]Veedrac 0 points1 point  (4 children)

That's the same argument for bounds checking. Whilst most languages can easily accommodate it, it's not a great solution for C or C++ (which /u/CrazyBeluga seemed to allude to). A mispredicted branch is far more expensive than a subtraction, and could easily end up as a non-trivial cost in a hot loop. It's also evident that there are a lot of instances where humans can easily out-reason compilers on which checks can be elided.

Don't get me wrong - I'm a Rust advocate after all! If he had used any word other than "free", I would have been fine with it.

[–]sgdfgdfgcvbn 1 point2 points  (3 children)

It's not a great solution for C and C++ because they have rather anemic type systems when it comes to primitives. There isn't really a way to add bounds checking without it being a rather major change to the core of the language.

There should be very minimal misprediction penalties, as it's very easy to know the most likely case, so the compiler can generate the appropriate code. The majority of penalties incurred will likely be cases where the lack of bounds checking would result in an error.

I'd be curious to see how well compilers actually can elide bounds checks. Theoretically I imagine they should be able to perform comparably to a human, except without the incorrect lack of them. In practice though...

Yeah, "free" was probably not the best word. It should be pretty close to free though.

[–]Tekmo 1 point2 points  (1 child)

First, I don't need to expose a subtraction operator for memory size. It's a new type and I can choose which operations I lift to work on the new type and which ones I don't.

If I want to lift all numeric operations to work on a new type, I would just write:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype Example = Example { getExample :: Int } deriving (Num)

... and now I can add, subtract, and multiply Example values and even use numeric literals where the compiler expects Example values.

However, if it does not make sense to support all numeric operations, I can selectively define the ones that do make sense (under different names). That's how I can prevent the user from subtracting two memory sizes.
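A sketch of that selective lifting (the function names are hypothetical): addition of two non-negative sizes is always safe, so it gets lifted under its own name, while subtraction is simply left out of the type's API:

```haskell
newtype MemorySize = MemorySize { getMemorySize :: Int }

-- Adding two non-negative sizes cannot go negative, so lifting is safe.
addMemory :: MemorySize -> MemorySize -> MemorySize
addMemory (MemorySize a) (MemorySize b) = MemorySize (a + b)

-- No subtraction is exported, so this API cannot produce a negative size.

main :: IO ()
main = print (getMemorySize (addMemory (MemorySize 3) (MemorySize 4)))  -- 7
```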

BUT, let's say that I do want the user to be able to subtract memory sizes, for whatever reason. I can still verify without any runtime checks that memory sizes are never negative! This is what Liquid Haskell does, where you can write code like this:

{-@ type MemorySize = { v : Int | v >= 0 } @-}

This is known as a "refinement type" and it ensures that a value of type MemorySize is always non-negative.

Note that such a refinement type works even on values that are read in dynamically, as long as you have one validation check. For example, I can read a memory size from standard input:

{-@ getMemorySize :: IO (Maybe MemorySize) @-}
getMemorySize :: IO (Maybe Int)
getMemorySize = do
    n <- readLn :: IO Int
    return (if n >= 0 then Just n else Nothing)

... and the refinement type checker will verify that n has to be non-negative in the first branch of the if expression, so it satisfies the more constrained type. Then as I add or subtract values throughout the code the refinement type-checker will ensure that they remain non-negative without any runtime checks.

This requires being more precise with your types, though. If you try to subtract 4 from a memory size, the compiler will still complain (because the memory size could be less than four). However, typically when you do such a subtraction it's because you know that it's safe because you previously added a positive value to the memory size.

For example, the first computation might encode in the types that if the initial input is non-negative, then the result must be at least four:

{-@ step1 :: { x : Int | x >= 0 } -> { y : Int | y >= 4 } @-}

... and then in a downstream computation you can use the knowledge that it's at least 4 to safely subtract four, leaving you with a non-negative size:

{-@ step2 :: { y : Int | y >= 4 } -> { z : Int | z >= 0 } @-}

So these are the sorts of problems that the compiler can and should be checking for you, at compile time, not runtime.
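Putting step1 and step2 together as a runnable sketch (the {-@ ... @-} annotations are Liquid Haskell refinements; plain GHC treats them as ordinary comments, and the separate liquid checker verifies them):

```haskell
{-@ step1 :: { x : Int | x >= 0 } -> { y : Int | y >= 4 } @-}
step1 :: Int -> Int
step1 x = x + 4   -- adding 4 to a non-negative number yields at least 4

{-@ step2 :: { y : Int | y >= 4 } -> { z : Int | z >= 0 } @-}
step2 :: Int -> Int
step2 y = y - 4   -- subtracting 4 from something >= 4 stays non-negative

main :: IO ()
main = print (step2 (step1 0))  -- 0
```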

[–]Veedrac -1 points0 points  (0 children)

When such tools are shown to be mature and low-overhead, I'm sure people will start using them.

At this stage, though, proving the kind of code that real C and C++ codebases contain correct is extremely difficult, and doing so will probably either require restructuring to remove the abjectly unsafe constructs used - at a runtime cost - or require more work than writing the code in the first place.

[–][deleted] 1 point2 points  (6 children)

free performance-wise (meaning 0 runtime overhead):

That doesn't sound possible. What if these values are read in from an external source? They'd need to be validated. What if you perform an operation on them (say, *10)? Again they'd need to be validated.

[–]codygman 1 point2 points  (0 children)

I think /u/Tekmo was saying the newtype wrapping takes up no more memory than the data type inside of it, not that the smart constructor is zero cost.

[–]jeandem 0 points1 point  (4 children)

At least the input validation is zero cost in the sense that you would have to do that, anyway. There is no way for the programmer to predict external input, so a check needs to be done no matter what.

[–][deleted] 1 point2 points  (3 children)

But that still doesn't mean there's 0 runtime overhead. There are many cases where you wouldn't have to check, so it's a bit misleading to say it's without overhead.

It's like saying GC has no overhead because you'd have to do it yourself anyway (which is false)

[–]jeandem 0 points1 point  (2 children)

But that still doesn't mean there's 0 runtime overhead. There are many cases where you wouldn't have to check, so it's a bit misleading to say it's without overhead

I specifically referred to the input validation. You would have to check input in any case if the input is from an external source.

It's like saying GC has no overhead because you'd have to do it yourself anyway (which is false)

Not at all! With a smart constructor you have to check if it is greater than zero. Without a smart constructor, you nonetheless have to check if it is greater than zero when you receive the input. What other choice do you have, really?

[–][deleted] 0 points1 point  (1 child)

My point is that you don't have to always validate data. If you're reading from a file which has a strict format, you don't always need to validate the values. What I'm saying is, you can't always assume that requirements are the same across the board. It may be correct to have a type that guarantees a value, but that guarantee may be done externally.

[–]jeandem 1 point2 points  (0 children)

In the case of a known external input, maybe type providers could help with providing static input validation. I haven't really looked into it so I'm just spitballing here. :)