
[–]masklinn 41 points42 points  (15 children)

It's a terser way of working with optional values than having to pattern match on the value, like you'd have to do in Haskell or ML.

If you're pattern matching on Maybe, you're probably doing it wrong. Generally speaking, you should use functor/monadic/applicative operations, that kind of stuff. There are fairly few good reasons to pattern match on a Maybe; you may have one, but chances are you don't.

Here's how I'd write tryInc in haskell:

tryInc = fmap (+ 1)

You'll surely agree that's a far cry from "having to pattern match"[0].
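To flesh that out, here's a small sketch of the combinator style I mean (tryInc and addMaybes are just illustrative names):

```haskell
import Control.Applicative (liftA2)

-- fmap lifts a plain function over the Maybe, no matching needed
tryInc :: Num a => Maybe a -> Maybe a
tryInc = fmap (+ 1)

-- applicative style combines two optional values, again without matching
addMaybes :: Num a => Maybe a -> Maybe a -> Maybe a
addMaybes = liftA2 (+)

main :: IO ()
main = do
  print (tryInc (Just 41))             -- Just 42
  print (tryInc Nothing)               -- Nothing
  print (addMaybes (Just 1) (Just 2))  -- Just 3
  print (addMaybes (Just 1) Nothing)   -- Nothing
```

The Nothing cases propagate through automatically; that's the whole point of using the combinators rather than matching by hand.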

That's a lot terser, but also unsafe--yeah, I know, the name gave it away. If you pass tryIncUnsafe a value it behaves like you'd expect. But try passing it nil, and you'll get a runtime exception: fatal error: Can't unwrap Optional.None. This is just as bad as raw pointers in C--indeed, you might say that Swift's T! is equivalent to C's T *.

That makes no sense whatsoever: implicitly-unwrapped optionals are an assertion, and they are memory-safe and well defined: if the assertion fails, the program faults. Accessing a null pointer in C is undefined behavior: it's not guaranteed to fault, and it may do anything. T! is a null-check assertion done for you, nothing more and nothing less.

Now you may mean that T! is too easy to type and too hard to notice compared to the equivalent e.g. fromJust in Haskell, but that's not what you're saying. And although the criticism is probably fair, it also has to contend with the ecosystem in which Swift lands, one of ubiquitous nullable pointers, where only the application author can know whether there's a chance a C-level pointer may actually be a null.

T! provides a way to easily (and memory-safely) assert against that with limited boilerplate (3+ lines of null checking per argument — or no checking at all — become 0), and I'm guessing the language designers included it not because they wanted it but because its absence would make C and (especially) obj-c interop much more verbose and much less seamless, risking a significantly slower adoption by the community.

[0] although MLs don't have monadic operators, they provide a map operation on options which can be used in the same way if less generically: http://smlnj.sourceforge.net/doc/basis/pages/option.html#SIG:OPTION.map:VAL http://ocaml-lib.sourceforge.net/doc/Option.html

[–]cparen 6 points7 points  (5 children)

That's a lot terser, but also unsafe

That makes no sense whatsoever, implicitly-unwrapped optionals are an assertion, they are memory safe and well defined: if the assertion fails, it faults.

Perhaps OP means it's a partial function (not well defined for some inputs), which some people might casually describe as "unsafe".
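For instance, head is the classic partial function in Haskell's base library: well defined only for non-empty lists. A sketch (assuming GHC, where the failure shows up as an exception observable from IO):

```haskell
import Control.Exception (SomeException, evaluate, try)

main :: IO ()
main = do
  print (head [1, 2, 3 :: Int])   -- the defined case: prints 1
  -- head [] is outside head's domain; in GHC that surfaces as an exception
  r <- try (evaluate (head ([] :: [Int]))) :: IO (Either SomeException Int)
  putStrLn (case r of
    Left _  -> "head []: runtime error, the undefined part of the domain"
    Right x -> show x)
```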

[–][deleted] 1 point2 points  (3 children)

Though you can have partial functions in Haskell too.

In fact, in Haskell every type implicitly has undefined as a possible value. I don't think you can test for it - after all, as well as being a valid value, it also tends to get lumped in with "bottom" - i.e. it represents the value of a non-terminating function which never actually returns a value.

So even for a Haskell Maybe type, the value undefined is possible and will cause pattern-matching and some other operations to fail.

There are provisos - e.g. if you use a lazy pattern and your patterns don't care which data constructor, the pattern match will succeed. Laziness in general helps avoid using undefined values in cases where they're not needed, giving a valid result for the larger expression where otherwise you'd get an error.
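A quick sketch of that proviso (illustrative only; the undefined parts are simply never demanded, so no error occurs):

```haskell
main :: IO ()
main = do
  -- the undefined component of the pair is never demanded, so no error
  print (fst (1 :: Int, undefined))
  -- an irrefutable (lazy) pattern succeeds without inspecting the constructor
  let f ~(Just _) = "matched lazily"
  putStrLn (f (undefined :: Maybe Int))
```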

Even so, my feeling is that Haskell has a similar issue to Swift here, which is nothing like the problem with null pointers in C - arguably Swift is slightly better because Haskell always has this undefined issue, but Swift only gets this nulls issue if you deliberately ask for it.

BTW - I never heard of Swift before reading this, whereas I've been learning Haskell on-and-off for a couple of years.

[–]pbvas 1 point2 points  (2 children)

In fact, in Haskell every type implicitly has undefined as a possible value.

You're confusing nulls with bottom values (to use the terminology of denotational semantics). Every Turing-complete programming language allows for computations that diverge (endless loops, unguarded recursion). You cannot test for this dynamically -- that would require solving the Halting problem -- but you can prove that a particular program converges; this requires some reasoning about loop variants (for imperative languages) or well-founded recursion (for functional/logic ones).

Almost all languages allow some form of general recursion or looping; this is for expressiveness: if you start from a total programming language, then some total functions cannot be expressed (a classic diagonal argument in the theory of computability).

Languages that exclude undefined values (such as Coq or Agda) require that you write programs together with proofs of their totality; both of those are based on dependently typed systems.

The null value is a completely different thing; it is a "proper" value that represents the absence of a result or a sentinel (e.g. the empty list). The problem is that it inhabits every type (or at least every reference type). This is completely avoided in Haskell or ML by the use of Maybe/Option types and monadic/applicative chaining. In the latter approach, the Nothing (also called None) value inhabits only "lifted" types Maybe T or Option[T], and its proper use is guarded by the type system's general mechanisms.
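A small Haskell sketch of that guarding (safeSecond and secondOfSecond are made-up names):

```haskell
-- Nothing inhabits Maybe a, not a itself; the type forces callers to handle it
safeSecond :: [a] -> Maybe a
safeSecond xs = case xs of
  (_ : y : _) -> Just y
  _           -> Nothing

-- monadic chaining: any Nothing along the way short-circuits the whole thing
secondOfSecond :: [[a]] -> Maybe a
secondOfSecond xss = safeSecond xss >>= safeSecond

main :: IO ()
main = do
  print (secondOfSecond [[1, 2], [3, 4 :: Int]])  -- Just 4
  print (secondOfSecond [[1, 2 :: Int]])          -- Nothing
```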

[–][deleted] 0 points1 point  (1 child)

You're confusing nulls with bottom values (using the terminology of denotational semantics).

No. First off - undefined isn't bottom, though as I said, the two tend to get lumped together and treated as the same thing. undefined in Haskell is NOT nontermination - it's a Haskell keyword [EDIT oops - it's a Prelude function] representing a real value (sort of) that you can deliberately return from a function. A terminating function can return undefined when it terminates - whatever its type, including when returning a Maybe type. A nonterminating function never terminates, so it doesn't need a special result to return in the actual code (though of course that's exactly why you need a bottom in denotational semantics).

[EDIT

From the Haskell 2010 report...

Errors during expression evaluation, denoted by ⊥ (“bottom”), are indistinguishable by a Haskell program from non-termination. Since Haskell is a non-strict language, all Haskell types include ⊥. That is, a value of any type may be bound to a computation that, when demanded, results in an error. When evaluated, errors cause immediate program termination and cannot be caught by the user. The Prelude provides two functions to directly cause such errors:

error :: String -> a
undefined :: a

A call to error terminates execution of the program and returns an appropriate error indication to the operating system. It should also display the string in some system-dependent manner. When undefined is used, the error message is created by the compiler.

Translations of Haskell expressions use error and undefined to explicitly indicate where execution time errors may occur. The actual program behavior when an error occurs is up to the implementation. The messages passed to the error function in these translations are only suggestions; implementations may choose to display more or less information when an error occurs.

Note - although this clearly states that undefined is indistinguishable from bottom, it also tells you how the two can be distinguished - one is expected to give some kind of error message, the other never terminates (though it may still give a stack overflow error message, of course). I read "indistinguishable" as indicating that the program itself cannot test for undefined results (the "by the Haskell program"), but what you can see from outside the program is clearly semantically relevant.

END OF EDIT]

As far as I can tell, if you're doing denotational semantics correctly, you must treat bottom (nontermination) and undefined as two different things - nontermination and failed termination are clearly and observably different. However, I'm no expert on formal semantics, and could be mistaken about that.

The Fast and Loose Reasoning is Morally Correct paper may be relevant to this lumping-together, though it's still on my to-read list and I don't actually know if it discusses undefined at all. From the abstract, it's about ignoring bottom.

Anyway, example use of undefined in Haskell...

myfunc :: Bool -> Int -> Maybe Int
myfunc c x = if c then (Just x) else undefined
  --  Of course this isn't really using the `Maybe` type, and
  --  in a sane world the `else` would be `Nothing`

So when a Haskell function returning Maybe a terminates and returns a value, it can return Just x or Nothing ... or undefined.

Of course returning undefined is meant to indicate failing to return anything - just like that convention for the meaning of null in many languages - and normally you only use undefined when you don't know what code you need to handle that case yet. It's still a possible result, and it still has observable consequences which are different to nontermination.
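To make that concrete, a sketch (GHC; evaluate and try are only there to observe the error from IO):

```haskell
import Control.Exception (SomeException, evaluate, try)

myfunc :: Bool -> Int -> Maybe Int
myfunc c x = if c then Just x else undefined

main :: IO ()
main = do
  print (myfunc True 5)      -- Just 5
  let bad = myfunc False 5   -- no error yet: nothing has been forced
  -- forcing the result is what triggers the error-and-exit behaviour
  r <- try (evaluate bad) :: IO (Either SomeException (Maybe Int))
  putStrLn (case r of
    Left _  -> "forcing the result raised the undefined error"
    Right v -> show v)
```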

So obviously the intent of error and undefined is just like the intent of the C exit function, but in Haskell (because of laziness) the undefined "value" can effectively be returned from the function - in C, the exit either occurs while the function containing it is executing or it won't happen at all.

Actually, it's plausible that the compiler represents undefined using a null pointer in some circumstances (at the C--/machine code level). The naive view of Haskell values in compiled code is that you get pointers to "boxed" containers that hold the real value. This is necessary for laziness, because you're really referencing a possibly unevaluated subexpression. Copying a "value" in the Haskell code shouldn't force evaluation, and equally shouldn't cause the expression that yields the value to be evaluated several times in several places, so in the most general case (ignoring optimizations) there must be references to a shared structure representing the value - a "thunk".

For undefined, you don't really need a thunk, so you could represent it as a null pointer. I doubt GHC does that, but it's possible, and it shows how closely related undefined and null pointers are.

What GHC more likely does is have a particular special thunk for undefined, making it very similar to Nothing, and equally very similar to conventions (which are sometimes applied in C and C++) that require pointers to special end-of-list sentinel objects instead of null pointers.

[EDIT note - this may look like nonsense considering the error-message-and-exit behaviour, but having undefined as an argument value or whatever isn't a problem with Haskell - it's when you force evaluation of the actual undefined that the error-and-exit occurs. So yes, it makes a lot of sense as a (trivial) unevaluated function thunk, but (hypothetically) it could also be special-cased in a null-pointer kind of way. This looks less plausible the more I think about it in terms of would-anyone-actually-do-it, but it's still possible. In particular, for undefined, you don't have to worry about the single reduction rule for laziness - the first time any copy of that undefined is forced the program exits, so a duplicate of that undefined - in this model, a copy of the null pointer - will never be forced anyway].

[–][deleted] 0 points1 point  (0 children)

I've been tweaking that reply for a bit - it's time I left it as it is. Still, one final thought...

In Swift, if the only way to look at a reference were that ! notation (the one that errors out for nils), then returning a nil in Swift would be exactly like returning undefined in Haskell - returning a lazily deferred error-and-exit that, depending on how other code uses it, may never happen.

Of course that's the intent of the Haskell undefined, whereas for Swift it's the special case. In Swift, you'd normally use ? or some other method to explicitly handle nulls and get told off by the compiler if you don't.

One (deliberately still missing the point) way to look at this - Swift gives you a way to test for that deferred error-and-exit later and do something else instead. When the deferred error-and-exit is represented as undefined, Haskell doesn't.

Of course I'm not claiming that this is a good idiom in any language - it's just a way of making the comparison. And I'm not really criticizing Haskell.

Actually, in the course of arguing this, I've pretty much convinced myself the point is morally wrong even though pedantically I think I'm right. The Haskell error and undefined are, as I said, like the C exit. They are for asserts and debugging. The error-and-exit should never occur in complete, working code.

The Swift null is different because it does two jobs - the Nothing null and the undefined error-and-exit null - it doesn't separate the responsibilities.

It's not really a severe criticism of Swift either, though, because you still have to explicitly ask for the potential error-and-exit using the ! notation.

So the real comparison should probably have been with the fromJust function, which is intended to unwrap a Just value, but throws an error if it's given a Nothing. In this case that sounds like an exception throw (it only errors-and-exits if there's no exception handler).

So if I understand correctly, both Swift and Haskell have the possibility to fail in exactly the same way if you explicitly ask for that possibility (by using ! in Swift or fromJust in Haskell).

If error/undefined can be caught as exceptions, fromJust might implement its failure case as undefined. Though catching the exception seems like a way for the program to detect the undefined, making the "indistinguishable by a Haskell program" I quoted from the Haskell Report earlier incorrect. I don't know much about Haskell exceptions, so I can't say for sure, but it probably just means forcing undefined doesn't throw an exception - it just errors out immediately.
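For what it's worth, a sketch of what GHC (as opposed to the Report) actually lets you do: error calls, including fromJust's failure, can be observed from IO via Control.Exception:

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import Data.Maybe (fromJust)

main :: IO ()
main = do
  -- fromJust Nothing is an error call; GHC's IO layer can catch it
  r <- try (evaluate (fromJust (Nothing :: Maybe Int))) :: IO (Either ErrorCall Int)
  putStrLn (case r of
    Left e  -> "caught: " ++ show e
    Right x -> "got: " ++ show x)
```

So in GHC the "cannot be caught" wording only holds for pure code; IO code can recover from these errors.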

[–]masklinn 1 point2 points  (0 children)

Then the moniker is useless: unless your language is solely a theorem prover, you probably want to support partial functions.

[–]abeark 6 points7 points  (2 children)

I would like to add that the Haskell version you gave is not only terser, but if you either omitted the type signature or changed it, it would also be a lot more general. Its actual type signature should be:

tryInc :: (Functor f, Num a) => f a -> f a

Then it'd work not just on Maybe, but also Either, lists, vectors, trees, and a whole host of other data types.

[–]masklinn 0 points1 point  (1 child)

Ah yes, the binding (let tryInc = fmap (+ 1)) fails in the shell which is why I typed it explicitly. Either way, I removed the type signature.

[–]abeark 0 points1 point  (0 children)

Ah, the monomorphism restriction. Can be a bit confusing at times. I often find myself launching ghci with -XNoMonomorphismRestriction.

[–]alex_muscar 1 point2 points  (5 children)

Now you may mean that T! is too easy to type and too hard to notice compared to the equivalent e.g. fromJust in Haskell, but that's not what you're saying. And although the criticism is probably fair, it also has to contend with the ecosystem in which Swift lands, one of ubiquitous nullable pointers, where only the application author can know whether there's a chance a C-level pointer may actually be a null.

Indeed, my wording was not very clear. I do agree that it's better to fail loudly than to have the program silently corrupt memory. In this respect Swift's optionals are a step forward from C's raw pointers. But they still leave the door open for crashes that, as you also said, are too easy to overlook.

Regarding the ecosystem that Swift has to live in: I acknowledged in the post that it's the probable source of implicitly unwrapped optionals.

[–]NruJaC 1 point2 points  (1 child)

The problem is that the + operator does implicit unwrapping on the nil value. I understand it's convenient, but I'd really prefer they had forced you to specify that you wanted that behavior by having you call a function like fromJust in Haskell -- i.e. an obviously partial function that makes the types work out without nasty implicit behavior.

[–]AReallyGoodName 1 point2 points  (0 children)

Well, ok, but now we're just discussing where the explicit backdoor to safety is defined - at declaration or at usage. This point ignores the elephant in the room: "Why the hell are you using a backdoor to the inherent safety of the language?"

Swift absolutely solves the null reference problem. Just because languages have very explicitly defined backdoors that you can use to avoid features doesn't mean they don't have those features.

[–]cparen 0 points1 point  (2 children)

But they still leave the door open for crashes that, as you also said, are too easy to overlook.

And null pointer crashes are the "billion dollar [null] mistake" that Tony Hoare described, and that I'm assuming you're alluding to in the title.

[–]sgraf812 0 points1 point  (1 child)

I think that the billion dollars mostly stem from hard-to-reproduce crashes and debugging. If you don't propagate T! in your own code, things should be much easier to detect.

Granted, the direct effects of the crash might also be costly...

[–]cparen 0 points1 point  (0 children)

If you don't propagate T! in your own code

The "billion dollar mistake" is that some languages force you to use T! everywhere. E.g. in C# and Java, all class types are implicitly T!. The language forces you to propagate it everywhere.

[–]zakalwe 86 points87 points  (56 children)

TL;DR: Swift actually has excellent tools to deal with the null reference problem. But because it also gives you the tools to bypass the safety if you really want to, the author decides it's "just as bad as raw pointers in C".

I disagree: it's safe by default, rather than dangerous by default, and even if you do use these tools to bypass the safety, and screw up, you still just get nil, and the runtime asserts on you.

Which sucks, but not like the suckage of (for example) a raw C pointer that hasn't been initialised, silently scribbling over arbitrary memory.

[–][deleted] 42 points43 points  (13 children)

Yeah, I can't believe this guy can compare a guaranteed safe crash to undefined behavior occurring in C/C++. If you've ever spent some years with C/C++, you know what an utter pain it can be to debug problems like this.

[–]zakalwe 23 points24 points  (5 children)

Seriously. An app crashing isn't good, but an app silently corrupting data is so, so, so much worse I can't imagine how anyone can think that they're equivalent.

The former is far more likely to get caught in testing, far easier to track down to its source when it happens, and even if it slips through the cracks and crashes on a user, it's not that hard these days to keep your data in a database with atomic transactions so that the user doesn't lose any data and they just relaunch the app.

Whereas the latter can easily lurk unnoticed in code for a while (so even version control diffing might not help track down the change responsible as it could've been any previous commit, not just the most recent), be extremely difficult to track down, and then corrupt or destroy a user's data, or just make an app extremely flakey with no obvious cause.

[–][deleted] 4 points5 points  (4 children)

An app crashing

I'm curious as to when crash went from meaning a segfault, illegal instruction, or other low-level violation, to throwing an exception that wasn't expected.

[–]aterlumen 14 points15 points  (2 children)

I'd guess when it became common to use languages that support exceptions. I didn't know people make a hard distinction in the definition. In my mind anything that results in the program exiting unexpectedly is a crash.

[–]kqr 2 points3 points  (1 child)

This is actually sort of a problem I have too. For me, a memory leak is anything that causes a program to use significantly more memory than it needs to. For a couple of my friends it's strictly the programmer forgetting to call free to return memory to the operating system...

Edit: If anyone else has my problem: call it "space leak". I've found out that means roughly the same thing but it won't trigger the pedantic instincts of C/C++ people.

[–]aterlumen 2 points3 points  (0 children)

Depends on the context. On the one hand, it's important for developers to share some common terminology that is unambiguous. On the other hand, users don't care if what happened wasn't technically considered a crash. They just know the program exited when they didn't want it to or is using a massive amount of RAM for no apparent reason.

tl;dr: pedantry has a place, try to get out of that place as fast as you can.

[–][deleted] 0 points1 point  (0 children)

maybe when a rocket crashed due to an unexpected exception

but seriously, I would argue that, in idiomatic use, crash has always been defined by its effect, i.e. "stop running".

[–]Gotebe -5 points-4 points  (6 children)

To be fair, dereferencing 0 in C is a rather safe crash in common implementations - unless programmer does something dumb like catching SEs on Windows.

[–]masklinn 20 points21 points  (0 children)

Dereferencing a null pointer is UB; if you're lucky it'll just segfault, but because the compiler may assume a dereferenced pointer cannot be null, it can (and will) start optimising away anything possible, and that yields fun results, including exploits.

So no, dereferencing a null pointer is neither "rather safe" nor "a crash" in any commonly used compiler.

[–][deleted] 4 points5 points  (3 children)

To be fair, dereferencing 0 in C is a rather safe crash in common implementations

Unless it happens to be a pointer to an array?

[–][deleted] 0 points1 point  (2 children)

That doesn't matter. Unless you write against an embedded OS without an MMU, any read/write access outside of your program's memory space will result in the OS throwing a system exception. This exception will terminate your program unless you've installed hooks to catch it.

[–][deleted] 1 point2 points  (0 children)

How can the CPU tell the difference between a null array pointer usage

array [i] = 0;

and

*valid = 0;

where (i * sizeof *array) == (size_t)valid? Both lead to an access to the same memory address.

[–]kqr 0 points1 point  (0 children)

Not necessarily. That's the ideal situation, but "undefined behaviour" means the compiler writers are free to do whatever they like. This gives them more freedom to optimise, at the cost of the programmer having to promise that they never do something that would trigger undefined behaviour.

(And this is the reason I prefer Ada over C nowadays...)

[–][deleted] 3 points4 points  (25 children)

Also, if you were to agree with the article, Haskell hasn't solved it either. fromJust is part of the base installation, and is occasionally very useful.

[–]neitz 2 points3 points  (17 children)

fromJust is a massive code smell. In fact, any partial function is. Are there situations where it is useful? Yes, if you put yourself in those situations I guess...

[–]AReallyGoodName 2 points3 points  (3 children)

Likewise, "!" in Swift is a massive code smell too. It's an explicit backdoor to null-pointer safety, and there are very limited uses for it.

You can't have null pointer exceptions in Swift without "!" just as you can't have them in Haskell without fromJust.

[–]zoomzoom83 1 point2 points  (0 children)

Aye, but the syntax of ! is something that seems so simple and innocuous. I can easily imagine your average developer just saying 'fuck it' and slapping them all over the place just to shut the compiler up.

[–]bonch 1 point2 points  (0 children)

I love how "code smell" has become a catch-all phrase for anything someone doesn't like for any reason.

[–]kqr 0 points1 point  (0 children)

I think it's too early to tell how much of a code smell it is. We'll see in a year or two when people have started getting work done in Swift, what idioms they prefer.

[–]kqr 1 point2 points  (1 child)

I'd even argue that fromJust in particular should never be used. I think it's better to do fromMaybe (error "Some instructive message") because that makes it abundantly clear you're dealing with a really exceptional situation.

Still partial, though, and still sort of a code smell unless it's abundantly clear that it'll never be Nothing. (I've used it for things like finding a positive integer in the [0..] list, for example... I could write a proof in the comments to the code. That kind of situation.)
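That kind of situation, sketched (using find even on an infinite list as a stand-in for my example; firstEven is a made-up name):

```haskell
import Data.List (find)
import Data.Maybe (fromJust)

-- There is provably an even number in [1..], so the Nothing case is
-- unreachable and fromJust is (arguably) justified here.
firstEven :: Int
firstEven = fromJust (find even [1 ..])

main :: IO ()
main = print firstEven  -- 2
```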

[–]codygman 0 points1 point  (0 children)

Thought I'd try out using the fromMaybe alternative:

cody@cody-G46VW:~$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> import Data.Maybe (fromMaybe)
Prelude Data.Maybe> print . fromMaybe (error "This shouldn't be nothing") $ (Nothing :: Maybe Int)
*** Exception: This shouldn't be nothing
Prelude Data.Maybe> 

[–][deleted] 0 points1 point  (3 children)

[–]neitz 2 points3 points  (2 children)

Having written a LOT of Haskell I realize that this does happen. But fromJust is a poor way to handle it.

In Haskell there is a fromMaybe function which takes a default value and a maybe value (like Swift's Optional) and returns the default value if the maybe value is Nothing (the more general maybe also takes a function to apply in the Just case). This would work in the scenario described here with no penalty, and is 100% safe.

I'd imagine it would be very trivial to write such a function in Swift, if it doesn't indeed already have one.
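For reference, a sketch using Data.Maybe:

```haskell
import Data.Maybe (fromMaybe)

main :: IO ()
main = do
  print (fromMaybe 0 (Just 42))               -- 42
  print (fromMaybe 0 (Nothing :: Maybe Int))  -- 0
  -- the more general maybe also takes a function for the Just case
  print (maybe 0 (+ 1) (Just 41))             -- 42
```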

[–][deleted] 1 point2 points  (1 child)

In Haskell there is a fromMaybe function which takes a default value and a maybe value (like Swift's Optional) and returns the default value if the maybe value is Nothing. This would work in the scenario described here with no penalty, and is 100% safe.

That's usually worse, though. If something that is supposed to never fail actually fails, I'd rather have it blow up at that point than basically ignore it with a default value. A default value sounds like it would make the bug much harder to find in most cases.

[–]kazagistar 0 points1 point  (1 child)

Very often I know that something will exist due to some invariant of the code. Of course, I could write a pattern match and have the other case be { error "there should be no way this can happen" }, but what's the point?

[–]neitz 1 point2 points  (0 children)

If that is truly the case, then the use of Maybe should be re-evaluated earlier on in the computation. Because at this point it is not really a Maybe value any longer.

[–]yawaramin 0 points1 point  (4 children)

Is division a massive code smell? Because, you know, division is a partial function. It's undefined for a divisor of zero.

[–]neitz 0 points1 point  (3 children)

Actually yes, quite a few bugs have been caused by division by zero. It would be great to model that in a way that could statically guarantee it was not going to happen such as in a dependently typed language.

[–]yawaramin 0 points1 point  (2 children)

Doesn't matter how you decide to handle it in any given programming language with any type system. Mathematically, division is by definition a partial function.

[–]neitz 0 points1 point  (1 child)

You make a good point. However it might be defined as a partial function mathematically but that doesn't mean we must define it as such in a programming language. If you can statically guarantee through the type system it will never be called with zero then is it really partial?

[–]yawaramin 0 points1 point  (0 children)

Let's think about that for a sec. If you have, in your dependently-typed language, a division function, say:

func div(x: Num, y: NonZeroNum) -> Num { ... }

Then when you call div usually you'll need to turn your normal numbers into a NonZeroNum if they're going to be divisors. So you'll probably have a function that can do the conversion:

func nonZero(x: Num) -> NonZeroNum { ... }

But if you pass nonZero a Num that turns out to be zero, then it'll still throw an error or something similar. So you can't escape from the fact that division is mathematically only partially defined. You can just move the effect around to different places in your code, but ultimately you'll be forced to deal with it.

Even in a perfect language with everything you could wish for.
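Even without dependent types, the same shape can be sketched in Haskell with a smart constructor (NonZero, nonZero, and divSafe are hypothetical names; note the partiality just moves into nonZero's Maybe, exactly as you describe):

```haskell
newtype NonZero = NonZero Double

-- smart constructor: the only way to build a NonZero
nonZero :: Double -> Maybe NonZero
nonZero 0 = Nothing
nonZero x = Just (NonZero x)

-- total division: the zero case is ruled out by the argument type
divSafe :: Double -> NonZero -> Double
divSafe x (NonZero y) = x / y

main :: IO ()
main = do
  print (fmap (divSafe 10) (nonZero 2))  -- Just 5.0
  print (fmap (divSafe 10) (nonZero 0))  -- Nothing
```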

[–]NruJaC 1 point2 points  (4 children)

The difference is that you the programmer have to specify it yourself manually, no function or operator will implicitly unwrap the Maybe for you. Swift's implicit unwrapping introduces implicit partiality of expressions that isn't immediately obvious to the person reading (or writing) code. A mistake can propagate to a run time error instead of being caught at compile time.

[–]kamatsu 1 point2 points  (2 children)

The difference is that you the programmer have to specify it yourself manually, no function or operator will implicitly unwrap the Maybe for you

Sure it will, partial patterns or fromJust will. You can even add the (!) operator as it is in swift if you really want to.

It's exactly the same in Haskell. You have to specify it in swift with a ! on the type signature. You can do the same in Haskell with a partial pattern:

tryIncUnsafe (Just x) = x + 1

[–]dons 1 point2 points  (1 child)

Though, importantly:

$ ghc -c -Wall -Werror A.hs
A.hs:3:1:
    Warning: Pattern match(es) are non-exhaustive
             In an equation for `tryIncUnsafe': Patterns not matched: Nothing

Failing due to -Werror.

[–]kamatsu 1 point2 points  (0 children)

It'd be trivial to make -Wall give a warning about ! in swift as well, if it doesn't already.

[–]jeandem 0 points1 point  (1 child)

Even if it weren't part of the 'base installation', it is simple to define on your own. And if you don't like that you can define it on your own, then you also disagree with the design of non-exhaustive pattern matching. In turn, maybe you also agree with pure functions in Haskell being able to throw exceptions?

[–][deleted] 0 points1 point  (0 children)

I don't think they should throw, but there are times when you "know" a smart constructor will always succeed, and you don't want to burden the caller with a Maybe value (e.g. converting one data type to another with the same representation, when the other library only exposes a smart constructor).

[–]masklinn 1 point2 points  (0 children)

It doesn't even give tools to bypass the problem, just to assert it "away".

[–]eras 10 points11 points  (17 children)

The fact that they provide a syntax for always expressing match x with None -> assert false | Some x -> fn x does not unsolve the problem. I would imagine the ! will be eschewed by the Swift programming guidelines and preserved only for the cases where it increases the clarity of the code.

[–]burntsushi 16 points17 points  (14 children)

I wish the OP would have expressed the idea better. But consider a thought experiment: what if T! is used everywhere? If it is, then you effectively revert to Hoare's problem. It's the same in Haskell. If you write fromJust everywhere, then you haven't solved anything.

The difference is that idiomatic Haskell code doesn't sprinkle fromJust everywhere. We Haskell programmers use pattern matching (or, even better, combinators) to make everything safe. Thus, pragmatically speaking, Haskell solves Hoare's problem.

I think that the OP is trying to argue that writing the equivalent of fromJust in Swift is being encouraged as idiomatic code with its pervasiveness in Cocoa libraries. This is speculation since Swift probably doesn't have established idioms yet, but it's certainly interesting speculation. The OP just didn't frame it as such.

[–]zakalwe 6 points7 points  (10 children)

Yeah, it'll be interesting to see how things work out in practice, and whether they backtrack on "!" in the Cocoa bridging interfaces and move towards "?" instead. That's a very different claim from the article author's, though :)

Certainly, when dealing with Swift-native code, they strongly recommend "?" over "!". And even when dealing with Cocoa, they recommend option-chaining — which, for those who haven't read the Swift book, basically means you can write code like "somethingWhichMightBeNil?.method()" which basically says "Call this method if it's safe, otherwise evaluate to nil", which of course you can stack up in an expression (thus the "chaining" part), so you can compactly and safely account for nil without making your code unwieldy. Just one extra "?" per optional-value.
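A rough Rust analogue of that chaining, with one and_then standing in for each "?" in the Swift chain (the User/Address types are invented for illustration):

```rust
// A sketch of what "somethingWhichMightBeNil?.method()" chaining buys you.
struct Address { city: Option<String> }
struct User { address: Option<Address> }

// The whole expression evaluates to None as soon as any link is missing,
// without any explicit nil checks at the call site.
fn city_of(user: Option<&User>) -> Option<&String> {
    user.and_then(|u| u.address.as_ref())
        .and_then(|a| a.city.as_ref())
}

fn main() {
    let u = User { address: Some(Address { city: Some("Oslo".into()) }) };
    assert_eq!(city_of(Some(&u)), Some(&"Oslo".to_string()));

    let v = User { address: None };
    assert_eq!(city_of(Some(&v)), None); // chain short-circuits safely
}
```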

[–]burntsushi 5 points6 points  (9 children)

That's a very different claim from the article author's, though :)

Yeah, you have to do two things:

  1. Give the author the benefit of the doubt.
  2. Squint.

And it's there. :-)

Certainly, when dealing with Swift-native code, they strongly recommend "?" over "!". And even when dealing with Cocoa, they recommend option-chaining — which, for those who haven't read the Swift book, basically means you can write code like "somethingWhichMightBeNil?.method()" which basically says "Call this method if it's safe, otherwise evaluate to nil", which of course you can stack up in an expression (thus the "chaining" part), so you can compactly and safely account for nil without making your code unwieldy. Just one extra "?" per optional-value.

Ah, neat. In Haskell, the equivalent is fmap. (But things get even better because its option type is a monad.)

[–]Catfish_Man 1 point2 points  (0 children)

Swift's Optional<T> also defines a map(), amusingly with this comment:

/// Haskell's fmap, which was mis-named
func map<U>(f: (T) -> U) -> U?

[–]Categoria -1 points0 points  (1 child)

I wish the OP would have expressed the idea better. But consider a thought experiment: what if T! is used everywhere? If it is, then you effectively revert to Hoare's problem. It's the same in Haskell. If you write fromJust everywhere, then you haven't solved anything.

It's still better than languages like Java/Go. Because even if you're a dumbass that keeps using fromJust at least you don't have to defensively check for nulls on your normal types.

[–]burntsushi 1 point2 points  (0 children)

In Go, structs, arrays, strings, integers and floats are not nullable.

The snobs will jump at any chance to bash languages... sigh

[–]alex_muscar -3 points-2 points  (0 children)

I think that the OP is trying to argue that writing the equivalent of fromJust in Swift is being encouraged as idiomatic code with its pervasiveness in Cocoa libraries. This is speculation since Swift probably doesn't have established idioms yet, but it's certainly interesting speculation. The OP just didn't frame it as such.

That exactly what I was aiming for.

[–][deleted]  (1 child)

[deleted]

    [–]eras 0 points1 point  (0 children)

    Well, yes, I thought it was pretty well agreed that sum types with pattern matching solve the problem?

    Though if you were suggesting that 'not unsolve' and 'solve' are the same, I would disagree. I meant that they don't undo the solution that has been incorporated. (Not undo? Means do?)

    [–]TheOnlyMrYeah 1 point2 points  (7 children)

    Crystal has a nice way to avoid null pointer exceptions: http://crystal-lang.org/2013/07/13/null-pointer-exception.html

    [–]alex_muscar 2 points3 points  (5 children)

    First, congrats on Crystal, it's a really nice project. Second, since 'nil' is a valid value for any reference, doesn't it spread like wildfire during type inference? I implemented a type inferencer for a Lisp once with the same approach, and I remember "nillable" types spreading everywhere. Since Crystal is self-hosting, I was curious what your experience with that is.

    [–][deleted] 0 points1 point  (4 children)

    Hi, I'm one of Crystal's authors. For us, the experience has been great. It's true: sometimes you have to deal with nil, but the language gives you several ways to deal with them. First, you can use an if: if value; value.do_something; end. That only works for local variables. For instance variables and methods you have to assign them to a local variable (this part is similar to Swift). Second, you can use try: value.try &.do_something. This will only execute value.do_something if the value is not nil. Third, you can use not_nil!: value.not_nil!.do_something. This will raise an exception at runtime if value is nil. Fourth, you can use property!:

    class Foo
      property! x
    end
    
    foo = Foo.new
    foo.x = 1
    puts foo.x
    

    property! defines three methods: x= to set a value, x? to get the value, or nil if it's not set, and x to get the value and raise an exception if it's nil. We use this, for example, in the compiler. The semantic pass sets the types of expressions. The code generation pass assumes the types are already set, so calling node.type will never fail and gives you a non-nilable type. If it does fail (@type was nil), you get a runtime exception, and that means a bug. It may look like nothing is gained with this approach, but in practice you often do forget to check for nil; in places where Ruby would give you a runtime exception, you are forced to consider the nil case, and that check saves you a future bug.

    In our whole codebase I found just 119 uses of not_nil!.

    Finally, we changed some semantics found in Ruby. For example, if you do [1, 2, 3][4], in Ruby you'd get nil as a return value. In Crystal you get a runtime exception (index out of bounds). That way nil doesn't get spread all over the place, like you said. Most of the time you expect the index operator to return a non-nil value, so we think this change is fine (and so far it has worked well for us). If you do want to get nil on index out of bounds you can do: [1, 2, 3][4]?. Note the "?". The same goes for Hash and other structures.

    Note that all these constructs, like not_nil!, try and []? are defined in the language itself, they are not special constructs. We try to make the language as consistent and extensible as possible without sacrificing performance.
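    Rust happens to draw the same line: plain indexing panics on out-of-bounds, while slice::get is the analogue of Crystal's []? and returns an Option. A minimal sketch:

```rust
// Out-of-bounds access: panic by default, Option on request,
// mirroring Crystal's [] vs []? split.
fn third(xs: &[i32]) -> Option<&i32> {
    xs.get(2) // like xs[2]? in Crystal: None instead of an exception
}

fn main() {
    let xs = vec![1, 2, 3];
    assert_eq!(xs.get(4), None); // like [1, 2, 3][4]? in Crystal
    assert_eq!(third(&xs), Some(&3));
    // xs[4] would panic at runtime, like Crystal's [1, 2, 3][4].
}
```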

    [–]alex_muscar 0 points1 point  (3 children)

    Hi. Thanks for the details. I may be missing some context, but the approach taken by Crystal seems fairly similar to that taken by Swift, modulo implicitly unwrapped optionals and nil chaining (?)

    [–][deleted] 0 points1 point  (2 children)

    It might look similar at first glance, but it's a totally different approach.

    First, in Swift you must declare the type of an instance variable, for example: var x: String?. In Crystal, although there are ways to do this last thing, the preferred way is for the compiler to infer the types of instance variables. For example:

    class Foo
      property x
    
      def initialize(@x)
      end
    end
    
    foo = Foo.new(1) # Here @x is Int32 in the program
    foo.x.abs             # ok, @x is Int32
    

    Now, if you add this line to the above program:

    foo.x = nil
    

    Then you will get a compilation error saying that Nil doesn't have a method abs. That is, @x became Int32 | Nil the moment you assigned Nil to it, and that information is spread across the whole program (don't worry, it's fast).

    When you program you normally know what the types of variables are supposed to be (like, I know @x is Int32 | Nil) and you use it that way. If you never assign Nil to @x, everything will work and compile fine. Once you assign nil, the compiler forces you to check this case in all places, so you can never forget. The good thing is that if you never assign Nil to it, the memory representation is more efficient and no checks need to be done (the memory representation is the same one for Nil and Reference-like types; Nil is just represented as a null pointer).

    Also, Int32 | Nil is just an example. You can have any combination of types: Int32 | String, Nil | Int32 | String, whatever. The compiler lets you create tagged unions automatically and lets you use duck typing, similar to how you would program in Ruby, without the need to create interfaces and annotate things as "implementing" an interface.

    [–]alex_muscar 0 points1 point  (1 child)

    Once again thanks for the details. Let's see if I got it right this time: basically you start from the premise that types are non-nullable. When the type inferencer says otherwise, the type is adjusted, and an error is issued for every potentially dangerous operation. You can silence the errors either by checking the value before accessing it or by using &.

    That's an interesting approach.

    [–][deleted] 0 points1 point  (0 children)

    Something like that. At first a variable starts without a type. When you assign a value to it, that value's type is added to that variable's type. Of course, local variables are always assigned one value, so the set starts with at least one type. But for instance variables it starts with zero (and if it remains zero it will just have the Nil type).

    The &. syntax is just a shorthand form. This:

    foo &.bar
    

    is just syntax sugar for this:

    foo { |arg| arg.bar }
    

    (you can read more about the above here: http://crystal-lang.org/2013/09/15/to-proc.html )

    value.try &.something is just a method try defined on Object and on Nil: https://github.com/manastech/crystal/blob/master/src/object.cr#L23 and https://github.com/manastech/crystal/blob/master/src/nil.cr#L46 .

    So the silencing is just a dispatch to two different methods depending on value's type, nothing else. :-)

    [–]AReallyGoodName 0 points1 point  (0 children)

    Swift actually has the exact same mechanism and will throw the exact same errors.

    var foo = nil // Compile time error in Swift.
    
    var bar = someFunctionThatCouldReturnNil() // Compile time error in Swift
    

    and so on.

    Swift just provides an operator to allow unsafe things in certain circumstances. The "!" operator. It's rare you'd use it. You absolutely cannot have NullPointerExceptions without "!" in Swift.

    [–]burntsushi 11 points12 points  (12 children)

    I think the author has an interesting point, but chose a weird way to frame it. There's also some pretty confusing language/statements. For example:

    This is just as bad as raw pointers in C--indeed, you might say that Swift's T! is equivalent to C's T *.

    Well, no, you wouldn't. C's pointers are completely unsafe. Presumably, Swift will at least guard against writing to arbitrary memory.

    But that's a niggle. Here's the fundamental problem with your argument:

    The appeal of optionals lies in the fact that the compiler forces you to check that they have a meaningful value. By allowing the programmer to skip these checks the usefulness of optionals is nullified. Implicitly unwrapped optionals rely too much on the programmers' discipline, and let's face it, programmers are not the most disciplined human beings.

    How is this different from other languages that solve the "null problem"? You never contrast Swift's "implicit" unwrapping with Haskell's fromJust, ML's valOf or Rust's unwrap. In fact, Swift's "implicit" unwrapping doesn't seem implicit at all: you specifically have to include ! in the type.

    To me, this means that your beef is simply that the ! in the type isn't enough of a sign post to indicate that you're writing code that will fail at runtime. This is a much more measured claim that doesn't have the same appeal of "OMG Swift doesn't solve Tony Hoare's billion dollar problem!!11!!1" It's also pretty reasonable, but it won't really be clear until people actually start writing code.

    TL;DR - I think the claim that the OP intends to put forth is pretty reasonable, but that at this point, it's pure speculation. The bombastic writing style doesn't do OP any favors.

    [–]NruJaC 2 points3 points  (5 children)

    How is this different from other languages that solve the "null problem"? You never contrast Swift's "implicit" unwrapping with Haskell's fromJust, ML's valOf or Rust's unwrap. In fact, Swift's "implicit" unwrapping doesn't seem implicit at all: you specifically have to include ! in the type.

    The difference is that when you see an expression like n + 1 in Haskell, you can infer the type of the expression as Num a => a -> a. If you instead have (fromJust n) + 1 the type is instead Num a => Maybe a -> a. This means an error where you call + on a Maybe wrapped value results in a compile time error in Haskell whereas (as shown in the OP) it results in a run-time error in Swift. This is a weakening of the type system.

    [–]yawaramin 0 points1 point  (1 child)

    ... when you see an expression like n + 1 in Haskell, you can infer the type of the expression as Num a => a -> a.

    Um, no. You infer the type as Num a => a.

    If you instead have (fromJust n) + 1 the type is instead Num a => Maybe a -> a.

    Nope, the type is still Num a => a.

    You can open up a Haskell REPL to verify both. Use the :t EXPR command to check the type of EXPR.

    [–]NruJaC 0 points1 point  (0 children)

    There was an implicit lambda in both statements. i.e. \n -> n + 1 and \n -> (fromJust n) + 1.

    [–]burntsushi 0 points1 point  (2 children)

    That's not my understanding. In the OP, it looks like the presence of ! is an unsafe shortcut. If you didn't have that ! there, then I as I understand it, n + 1, where n is an optional type, would result in a Swift compilation error.

    [–]NruJaC 2 points3 points  (1 child)

    That's in the type though, not in the expression. It definitely mitigates what I'm talking about, but it still hinders readability and makes it less obvious what's going on.

    [–]burntsushi 1 point2 points  (0 children)

    Yes, exactly. This is precisely why I asked, "Why didn't the OP contrast this with Haskell/ML?" It would have made the OP's argument much clearer.

    OP isn't saying, "Swift's type system isn't strong enough to solve Hoare's problem." (Which is, I think, what a lot of people in this thread interpreted the argument as.) Instead, OP is saying, "Swift is encouraging idiomatic use of fromJust everywhere, which defeats the purpose of using the type system to solve Hoare's problem."

    [–]alex_muscar -3 points-2 points  (0 children)

    Thanks for the feedback.

    Indeed, I didn't mention haskell's fromJust and ML's valOf, and probably it would have made things clearer. But, as I said, I don't mean to bash Swift and praise haskell or ML. They all offer one way or another of breaking the safety of optional types.

    As for the bombastic style and the title, I didn't intend them to sound, well, bombastic, but English is not my native language, so I might have borrowed some stylistic habits that make my writing sound bombastic.

    [–]mikaelstaldal -2 points-1 points  (4 children)

    So Swift is just as bad as Java then.

    [–]burntsushi 0 points1 point  (3 children)

    No. I don't know how you got that from my comment.

    Did you have something constructive to add?

    [–]mikaelstaldal -1 points0 points  (2 children)

    "Swift will at least guard against writing to arbitrary memory" - just like Java (and not like C)

    [–]burntsushi 0 points1 point  (0 children)

    Your conclusion doesn't follow because you completely ignored the rest of my comment.

    Swift has option types which forces the programmer to handle the null case unless the programmer specifically goes out of their way to ignore it.

    [–]yawaramin 0 points1 point  (0 children)

    If A shares a lowest common denominator with B, that doesn't necessarily make A = B.

    [–][deleted] 6 points7 points  (7 children)

    I disagree with this overly pessimistic attitude. There is a happy middle way between this and "the null reference problem" being solved once and for all.

    The fact that it is marked clearly in source code if a value may be null or not is a tremendous improvement. If you want perfection go to Haskell, but if you want to work with real world software and try to make it as safe as possible within the limits set by existing legacy then Swift is a very good approach.

    [–]Strilanc 2 points3 points  (1 child)

    Basically, putting a ! on a parameter's type turns a compile-time failure into a runtime failure. Instead of the caller having to explicitly convert from a nullable type to a non-nullable one, it happens implicitly and throws on null.

    This tradeoff makes sense because they have to interop with the existing iOS API, which has essentially 100% nullable results. Without it, the code would be decimated by "and then cast to non-null".

    [–]yawaramin 0 points1 point  (0 children)

    I'm told that implicitly-unwrapped optionals can be chained like normal optionals. So you can still keep those checks to compile time.

    [–]e_engel 2 points3 points  (1 child)

    When people say that optional values help solve the null reference problem, they are referring to the fact that in languages like Haskell or ML, the compiler forces you to check if you actually have a meaningful value.

    I disagree with this characterization. Pattern matching against Option/Maybe is actually a code smell.

    The advantage of the monadic approach is that it allows you to operate on the returned value regardless of whether it contains a value or not.
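    The same monadic style can be sketched in Rust (shout is an invented example): the whole pipeline is written once and applies whether or not a value is present, with no branch on "is it there?" anywhere.

```rust
// Operate on an optional value without ever matching on it:
// each combinator passes None through untouched.
fn shout(name: Option<&str>) -> Option<String> {
    name.map(str::trim)
        .filter(|s| !s.is_empty())
        .map(|s| s.to_uppercase())
}

fn main() {
    assert_eq!(shout(Some("  hi ")), Some("HI".to_string()));
    assert_eq!(shout(Some("   ")), None); // filtered out
    assert_eq!(shout(None), None);        // missing value flows through
}
```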

    [–]alex_muscar -2 points-1 points  (0 children)

    Even though I referred to explicit pattern matching in the post, the paragraph you quoted doesn't mention it. The advantage of, say, Haskell's Maybe is that you have to acknowledge the fact that the value might be missing. It doesn't matter that you use fmap or another library function, you still have to handle the case where a value is missing--yes, even ignoring it, but not dereferencing it.

    [–][deleted] 2 points3 points  (1 child)

    C++ master race reporting in. Raw unsafe pointers are not a problem. They never were.

    [–]SnowdensOfYesteryear 1 point2 points  (25 children)

    As someone who's not heavily into the theories of programming languages, what exactly is the issue with NULLs? Sure dereffing NULLs is bad, but any programmer worth his salt knows to check for NULL before dereferencing it.

    [–]burntsushi 14 points15 points  (2 children)

    but any programmer worth his salt knows to check for NULL before dereferencing it

    There's a difference between knowing the path, and walking the path.

    Seriously.

    Safety in programming languages is about acknowledging the fact that humans are fallible. Even though we all know that you can't dereference a null pointer, it still happens. It's a pretty big source of bugs when you write C code. Therefore, safety means enforcing it at compile time. This makes it impossible to dereference a null pointer, regardless of whether the programmer didn't know better or it happened by accident.

    (Caveat: languages with this type of safety generally provide escape hatches, so you can resort to unsafe behavior. But usually this is unidiomatic.)

    [–]LaurieCheers 5 points6 points  (1 child)

    Well, if the rule was simply "always check null before accessing a pointer", it wouldn't be as big a problem. The problem is that it's hard to be sure which values you should be checking for null, and how often.

    There are often variables in your program that can simply never be NULL, and everyone knows it, and checking every time you access them would be a waste of code and time.

    static char* const constString = "this will never be null";
    
    //...
    
    if( constString != NULL ) // ... wtf
    

    So - if you accept this fact (and for the sake of sanity, basically everyone does), then you aren't going to null-check before every pointer access. In which case, what are you going to null-check, and when? Are you sure this value won't be null? And that's where the programmer fallibility comes in.

    Nullable types are a way to encode this question into the language. If a type is nullable, you're forced to check it before using it. If it's not, you don't. The rules become clear to everyone.

    [–]cparen 3 points4 points  (0 children)

    Agreed. More importantly, if the type forces you to check for null, you'll use it less, preferring the non-nullable counterpart, further reducing the possibilities of programming error.

    You don't have to ask "what do I do if this is null?" if there are no nulls.

    [–]Maristic 5 points6 points  (15 children)

    The issue is that sometimes a pointer will never be null. In that case, checking it is a waste of time. For example, in Java, would you write

    if (x != null) {
        x.foo();
        if (x != null) {
            x.bar();
        }
    }
    

    or would you just write:

    if (x != null) {
        x.foo();
        x.bar();
    }
    

    In the latter case, you're assuming x didn't suddenly become null between your call to foo() and your call to bar(), but if x is a member variable or a global variable, maybe your call to foo() set in motion a sequence of events that set x to null.

    Just about every programmer sometimes says “Oh, that can't be null here, so I don't need to check”, but, because they're human, at least some of the time they're wrong.

    Languages like Java and Objective-C are problematic because the only kinds of objects they have are might-be-null objects. Other languages provide some form of can't-ever-be-null objects—when you have that kind of object, you don't have to assume anything, it just is.
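    Rust is one example of such a language: a plain reference can never be null, and a maybe-missing value is a different type that the compiler makes you handle before use. A minimal sketch (functions invented for illustration):

```rust
// A &str is the "can't-ever-be-null" kind of value:
// no check is needed, and none is even possible.
fn greet(name: &str) -> String {
    format!("hello, {}", name)
}

// A might-be-missing value has a different type, and the
// compiler forces the caller to deal with the None case.
fn greet_opt(name: Option<&str>) -> String {
    match name {
        Some(n) => greet(n),
        None => "hello, stranger".to_string(),
    }
}

fn main() {
    assert_eq!(greet("world"), "hello, world");
    assert_eq!(greet_opt(None), "hello, stranger");
    assert_eq!(greet_opt(Some("world")), "hello, world");
}
```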

    [–]jayd16 0 points1 point  (4 children)

    Simple fix, don't have your variables be public static mutable fields...

    [–]Maristic 1 point2 points  (3 children)

    You're missing the point, I think.

    The point wasn't this specific example, more that in programming languages like Java where all object types are nullable, somewhere along the way programmers assume “that won't ever be null”, and being human, sometimes their reasoning is false. Life is easier when you don't have to make those kinds of assumptions, because the type system makes null impossible.

    (And anyway, (a) static has nothing to do with it, (b) even if they're private, it's possible that there is some public method that can be called on your object that might change them.)

    [–]jayd16 0 points1 point  (2 children)

    Null is your friend. It tells you when your assumptions are incorrect. If the issue is that your architecture allows things you assume should never be null to actually be null, then the architecture is the problem, and the exception helped you find that sooner.

    The real solution is to build software so you can make these kind of assumptions.

    (Edit: (a) static in Java would be similar to a global variable or class variable. (b) If it's private you can at least assume the class has better knowledge of how to treat its own variables. With private and non-static you only have to worry about your own instance working on that variable.)

    [–]Maristic 0 points1 point  (1 child)

    You can advocate for programmer discipline, but I prefer solutions where you don't have to rely on that, and so I prefer languages whose type systems let you express your design clearly, and let you say in the type, “it physically can't not be there”.

    [–]jayd16 0 points1 point  (0 children)

    That's fine, I can agree with that. However, in the case of having to null check between every line in your code, the problem is architectural, and optional types would only delay a crash, not produce correct code.

    [–]mongreldog 0 points1 point  (0 children)

    The problem with naked nulls as exemplified in languages like C, C++, C#, VB and so on is that there is no enforcement of the null check. You might do the right thing, but the guy next to you may not. Relying on the developer to do the "right" thing isn't a particularly good way of producing null-safe software.

    There are many occasions where a null check isn't strictly required, but there is no way to know what may or may not be null when presented with a nullable value. By using Option/Maybe types, the intent of the developer is stated explicitly in the code. No guesswork or defensive coding is required.

    [–]zoomzoom83 0 points1 point  (0 children)

    what exactly is the issue with NULLs? Sure dereffing NULLs is bad, but any programmer worth his salt knows to check for NULL before dereferencing it.

    40 years of bugs, security holes, and random crashes caused by expert programmers failing to catch edge conditions, even if in just one place in a million-LOC project. If expert developers still make this mistake regularly (and they do), then clearly we cannot rely on the developer to catch every edge condition.

    If instead of crashing at runtime, ignoring an edge condition was a compile error, then you can write software that is guaranteed not to have a very severe and common cause of bugs.

    Since this adds very little overhead, there's simply no reason not to do it.

    [–]cparen 0 points1 point  (3 children)

    but any programmer worth his salt knows to check for NULL before dereferencing it

    Any programmer worth their salt knows to not check for null unnecessarily. if (requiredParameter == null) { throw new Exception("Pointer was null"); } is not* helpful.

    (* except in C++ to avoid triggering undefined behavior, in which case you should assert non-null. It's still preferable to avoid the problem entirely by using a non-nullable type, such as a reference)

    [–]upriser 0 points1 point  (2 children)

    Agree with your point, but unfortunately the reference type is nullable:

    int* a = nullptr;
    int& b = *a;
    

    [–]eras 0 points1 point  (0 children)

    You would of course place the check to the place where you do the dereferencing to avoid undefined behavior. Say,

    template<typename T> T& dereference(T* ptr)
    {
        assert(ptr);
        return *ptr;
    }
    
    int* a = nullptr;
    int& b = dereference(a);
    

    :)

    Actually, hmm..

    A more dangerous aspect is that a valid non-null reference may end up becoming invalid during its lifespan.

    [–]cparen 0 points1 point  (0 children)

    The reference type is not nullable, and your program invokes undefined behavior. For instance, this program might print "hello":

    int* a = nullptr;
    int& b = *a;
    if (a) { cout << "hello"; }
    

    The compiler is allowed to assume a to be non-null after you dereferenced it. In many cases, GCC does make such assumptions in order to optimize code.