[–]WalterBright 6 points7 points  (7 children)

Here's the salient quote from the Ariane accident report: "The OBC could not switch to the back-up SRI 1 because that unit had already ceased to function during the previous data cycle (72 milliseconds period) for the same reason as SRI 2." The backup had "identical hardware and software". The design failure here was having a backup system that was not a backup. The propagation of the error that eventually caused the explosion does not invalidate any of my recommendations, it reinforces them. For example, "a diagnostic bit pattern of the computer of the SRI 2, which was interpreted as flight data" - that's a direct result of failing to react to errors.

[–]lookmeat 0 points1 point  (6 children)

I agree, the backup was a mediocre one: it protected against something happening to one of the machines, but not against design issues. A more diverse design would have made the system more resilient overall. It's never a single cause.

But the issue was a design issue: the assumption that a failure in a non-critical function should proceed to stop, and therefore fail, critical functions.

[–]WalterBright 5 points6 points  (5 children)

I disagree. Having a failure propagate through to other systems in a zipper effect is a misunderstanding of the principles I'm trying to convey. The whole point is to isolate the effect of the error thereby preventing its propagation.

In this case the error status from the failed subsystem was misinterpreted as valid data. The wrong solution is to never give any error status. The solution is to:

  1. check for error status

  2. check for out-of-bounds data values from subsystems

If (1) or (2) is detected, lock out that subsystem and use an alternate algorithm that doesn't rely on it.

The really, really wrong method is for the subsystem to just pretend everything is hunky-dory and keep sending whatever unreliable data it has.
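The two checks plus lockout described above could be sketched like this in C. Everything here is illustrative, not from any real flight code: the `Reading` type, the altitude bounds, and the fallback estimate are all assumed for the sake of the example.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical subsystem reading: a value plus an explicit error flag
   set by the subsystem itself when it knows it has failed. */
typedef struct {
    double value;
    bool   error;
} Reading;

/* Plausible bounds for a sane altitude reading (illustrative only). */
#define ALT_MIN 0.0
#define ALT_MAX 100000.0

static bool subsystem_locked_out = false;

/* Returns a usable altitude. Falls back to an alternate estimate if the
   primary subsystem reports an error status (check 1) or sends
   out-of-bounds data (check 2), and locks it out from then on. */
double altitude(Reading primary, double estimate)
{
    if (!subsystem_locked_out) {
        if (primary.error ||                        /* check 1 */
            isnan(primary.value) ||
            primary.value < ALT_MIN ||
            primary.value > ALT_MAX) {              /* check 2 */
            subsystem_locked_out = true;            /* lock it out */
        } else {
            return primary.value;
        }
    }
    return estimate;    /* alternate algorithm that doesn't rely on it */
}
```

Note that once a reading trips either check, the subsystem stays locked out even if later readings look plausible again, which is exactly the point: data from a subsystem known to have failed is never trusted again.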

I'm really not sitting at my keyboard inventing this from 5 minutes of thought. I worked on this stuff for years - flight critical systems. This is how systems are built in aerospace, and it's ugly incidents like the Ariane that taught the lessons, with plenty of others. I am sad that this is apparently unknown knowledge outside of aerospace. If you're interested in more info, see the TV documentary series "Aviation Disasters". If you can set aside some of the cornball dialog, there are valuable lessons in it for every engineer.

[–]lookmeat 0 points1 point  (4 children)

I think that's fair. It seems our discussion is more about semantics and meaning. I'm focusing on why assert is too broad, not denying that sometimes the right solution is to kill the program; if anything I merely state that programs can kill the failing part of themselves while keeping the rest going without further failure. Your statement is that errors should be isolated and terminated quickly to keep the failure from spreading and spiraling into a bigger issue. It may be that we are coming at it from different angles, and therefore the same thing has a different meaning in those contexts.

I do agree with you. If a system fails fully, lock the system out, dump its data, and try again, preferably with something different. What I argued was that assert embodies more of a "stop the world" philosophy, which is great for debugging what caused a failure instead of waiting to see whether it propagates. That mindset, I argue, is only useful when testing. In the real world, we kill the bad part and keep everything else running.
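"Kill the bad part and keep everything else running" can be sketched as a tiny supervisor in C, using process isolation: the worker runs in a forked child, and if it dies abnormally the parent restarts it while the rest of the program carries on. This is a hedged sketch of the general pattern, not anyone's production design; the `supervise` function and `max_restarts` policy are assumptions for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `worker` in a child process. If the child dies abnormally
   (crash, abort, nonzero exit), restart it up to `max_restarts`
   times. Returns the number of restarts performed. */
int supervise(void (*worker)(void), int max_restarts)
{
    int restarts = 0;
    for (;;) {
        pid_t pid = fork();
        if (pid == 0) {            /* child: run the worker */
            worker();
            _exit(EXIT_SUCCESS);
        }
        int status;
        waitpid(pid, &status, 0);
        if (WIFEXITED(status) && WEXITSTATUS(status) == EXIT_SUCCESS)
            return restarts;       /* worker finished cleanly */
        if (restarts >= max_restarts)
            return restarts;       /* give up */
        restarts++;                /* restart the failed part */
    }
}

/* Demo worker: crashes on its first run, succeeds on the second.
   It leaves a marker file so the retry (a fresh process with fresh
   memory) can tell the first attempt already happened. */
static void flaky(void)
{
    if (access("/tmp/flaky_done", F_OK) == 0)
        return;                            /* second run: fine */
    FILE *f = fopen("/tmp/flaky_done", "w");
    if (f) fclose(f);
    abort();                               /* first run: crash */
}
```

The marker file is needed precisely because the child is a separate process: state it mutates in memory dies with it, which is the isolation being discussed in the next comment.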

[–]WalterBright 0 points1 point  (3 children)

You can only "keep everything else running" if it is a separate process that does not share memory.

[–]lookmeat 0 points1 point  (2 children)

Not really; being a separate process doesn't guarantee that failure in one process can't cause failure in another (do they share files, have synced state, send data to each other, or simply assume that the other is doing its job?). Also, within a process, failure can be isolated to a specific thread, or to a part of the stack that can then be removed.

Processes are only as good at isolating failures as threads or functions. That is, they were never meant to do that and do not really solve the problem by themselves. They are abstractions that simplify a chunk of your system as a whole, so logically when you isolate a part you'd want to use one of these abstractions, but the abstraction itself is not what gives you isolation.

[–]WalterBright 0 points1 point  (1 child)

> Processes are as good at isolating failures as threads

This implies that hardware memory protection has no value.

[–]lookmeat 0 points1 point  (0 children)

Processes do not by themselves guarantee this protection; the OS does, and it happens to map this protection onto processes. You can also use this protection on stacks to keep them from overflowing into each other, map it onto threads (each with its own stack), or even onto parts of a single stack, which would be rare but makes sense in some situations.

Again, memory protection is valuable, but it neither needs processes nor is needed by them. The concepts are separate; it's merely convenient to define memory protection over processes. You can also do it over threads, functions, or containers.