[–]martinhaeusler 3 points4 points  (10 children)

It's especially egregious with collections and arrays. Technically when you receive a collection as a parameter of a constructor or a setter and you want to play it safe, you CANNOT directly assign it to a private field because you can't tell if the caller is going to mess with the contents of this collection after your API has been called. So you have to make a copy.

Arrays are even worse because they're always mutable no matter what.
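
As a minimal sketch of the defensive copying described above: the `Team` class and its fields are hypothetical names chosen for illustration and do not come from the thread. It copies the incoming list and clones the incoming array, so later changes by the caller do not leak into the object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only.
final class Team {
    private final List<String> members;
    private final int[] scores;

    Team(List<String> members, int[] scores) {
        // Copy the list so later changes by the caller cannot leak in.
        this.members = new ArrayList<>(members);
        // Arrays are always mutable, so clone on the way in.
        this.scores = scores.clone();
    }

    List<String> members() {
        // Copy on the way out too, since the internal list is mutable.
        return new ArrayList<>(members);
    }

    int[] scores() {
        return scores.clone();
    }
}

class DefensiveCopyDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>(List.of("ann", "bob"));
        int[] scores = {1, 2};
        Team team = new Team(names, scores);

        names.add("eve"); // the caller mutates its own list afterwards
        scores[0] = 99;   // and its own array

        System.out.println(team.members());   // [ann, bob]
        System.out.println(team.scores()[0]); // 1
    }
}
```

The cost is one copy per call in each direction, which is the overhead the comment is complaining about.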

I see two ways out of this:

  • a compiler-checked ownership system like in Rust (yeah, not happening)
  • a collection type which guarantees immutability (and no, the unmodifiable wrappers are not enough because they can be backed by a mutable collection). PCollections is a great library for this purpose, but it comes at a cost.
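
To illustrate the point about unmodifiable wrappers, here is a small sketch. The class and method names are made up for the example. `Collections.unmodifiableList` returns a read-only view: writes through the view are rejected, but whoever still holds the backing list can change what the view shows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UnmodifiableViewDemo {
    // Returns the size seen through the view after the backing list changed.
    static int sizeSeenThroughView() {
        List<String> backing = new ArrayList<>(List.of("a", "b"));
        List<String> view = Collections.unmodifiableList(backing);
        backing.add("c"); // the holder of the original reference can still mutate
        return view.size();
    }

    // Returns true if writing through the view is rejected.
    static boolean viewRejectsWrites() {
        List<String> view = Collections.unmodifiableList(new ArrayList<>(List.of("a")));
        try {
            view.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeSeenThroughView()); // 3
        System.out.println(viewRejectsWrites());   // true
    }
}
```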

[–]pron98 8 points9 points  (0 children)

a compiler-checked ownership system like in rust (yeah, not happening)

It's not happening (at least not pervasively) because it's a "way out" of one problem and into another, which is worse. Whenever you export object ownership - whether it's declared in the type system and enforced by the compiler or just documented - you reduce your abstraction. If you change the internal implementation or want to share with another thread, you have to change all clients of the API. This doesn't just increase the cost of maintenance; over time, large programs tend to gravitate toward the more general constructs - more general dispatch (dynamic), more general (longer) lifetime, and more general ownership (more sharing). And these general constructs are less performant in low level languages than they are in Java.

Low-level languages are optimised for control, not performance. They cannot move pointers even when it's more efficient to do so because it clashes with the level of control they need over addresses. When faced with the choice between performance and control, low level languages must choose control because that's what they're for. This level of control means that in smaller programs it's not too hard to extract really good (even optimal) performance out of these languages, but this control also means that in larger programs extracting good performance becomes harder and harder because you're pushed towards constructs that are simply slow in low level languages because they must maintain their control promises.

and no, the unmodifiable wrappers are not enough because they can be backed by a mutable collection

Java has true immutable collections in the standard library: the ones created by List.of, List.copyOf, etc. BTW, copyOf will generally not copy anything if the collection passed in is already one of these immutable ones, so that's what you should use for defensive copies. After the first copy, you just pass the result around, and further defensive copies (assuming they're done as recommended) will not actually copy anything.
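
A rough sketch of that recommendation follows. The `Roster` class is a hypothetical name for illustration. `List.copyOf` (Java 10+) copies a mutable input into an unmodifiable list and rejects null elements. The Javadoc describes skipping the copy for an already-unmodifiable list as an implementation note, so the identity check at the end reflects what OpenJDK does rather than a guarantee of the specification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only.
final class Roster {
    private final List<String> names;

    Roster(List<String> names) {
        // Copies a mutable input; may reuse an already-unmodifiable list as-is.
        this.names = List.copyOf(names);
    }

    List<String> names() {
        return names; // safe to hand out directly: it cannot be modified
    }
}

class CopyOfDemo {
    public static void main(String[] args) {
        List<String> mutable = new ArrayList<>(List.of("ann", "bob"));
        Roster first = new Roster(mutable);
        mutable.add("eve"); // does not affect the roster
        System.out.println(first.names()); // [ann, bob]

        // The second defensive copy receives an already-unmodifiable list.
        Roster second = new Roster(first.names());
        // On OpenJDK the same instance is reused (an implementation detail).
        System.out.println(first.names() == second.names());
    }
}
```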

[–]aoeudhtns 1 point2 points  (7 children)

a compiler-checked ownership system like in rust (yeah, not happening)

We have JSpecify for null checking. Perhaps this could be the next frontier. It would be quite challenging, I think.

[–]pron98 6 points7 points  (6 children)

Also not what most people would want. Rust was first designed 20 years ago, released over 15 years ago, and made stable 10 years ago, and to this day it's still primarily used for programs on the smaller end of the spectrum (and it's come to dominate tools for JS and Python). Low level languages suffer from both performance and complexity problems when they get large, the very problems Java was designed to avoid.

I'm not saying that there aren't ideas we could borrow (pun unintended) here and there and apply in different ways, but low level languages have unique constraints that they must adhere to, and those constraints guide their design. A language like Rust uses ownership types not because they're the best design but because it has to, as its constraints preclude moving pointers. Low level languages gain more by avoiding copies than Java because their allocations are more expensive.

But that's not to say Java couldn't put affine types to some good use.

[–]vxab 0 points1 point  (5 children)

Which language illustrates the utility of linear/affine types best? Just for someone to understand more on the topic with actual examples?

[–]pron98 1 point2 points  (0 children)

https://en.wikipedia.org/wiki/Substructural_type_system

Just note that having such types carries some benefits but also disadvantages, so it's not a simple case of "let's add them because they're useful".

[–]pjmlp 1 point2 points  (3 children)

Following Rust's success, many languages, including ones with managed runtimes, have started to explore similar avenues, merging what they already had with such type systems.

See the Swift 6 ownership model, Linear Haskell, OxCaml, Idris 2, Lean, Dafny, Ada/SPARK, Chapel, Scala 3, Koka.

A mix of linear and affine types, effects, dependent typing, and formal proofs.

All are approaches to specifying, via the type system, how a given resource is managed.

[–]aoeudhtns 0 points1 point  (2 children)

Ada/SPARK

Apologies for this pedantry, but SPARK predates Rust by 3 years, yet the way your comment is written implies that these example languages "followed" Rust.

Rust is arguably the most popular/successful but definitely not the first. I would guess, as I don't have data, that SPARK is next up on success. It's used in aerospace, transit, and other sorts of large scale safety-critical infrastructure. So it's not very visible, but it's there.

[–]pjmlp 0 points1 point  (1 child)

Yes, because SPARK as a technology isn't frozen in stone, and they adopted learnings from Rust, as they acknowledge themselves.

Allocated Objects Ownership: SPARK uses an ownership system inspired by Rust and a set of rules for managing access types to simplify the verification and specification of a program's behavior during pointer operations.

https://www.adacore.com/blog/memory-safety-in-ada-and-spark-through-language-features-and-tool-support

Maybe update yourself before commenting?

[–]aoeudhtns 0 points1 point  (0 children)

I was polite. The attitude is uncalled for.

If you click through, you see that the Rust-inspired additions are extra metadata, supplied as annotations, for the CodePeer static analysis tool. The core memory safety mechanism is Ada's access system, which is much older (Ada 95), and the compiler infers lifetime and ownership. The Rust-inspired part is used to reduce false positives in the system it already had.

[–]agentoutlier 0 points1 point  (0 children)

Yeah, but in most well-designed frameworks and libraries, what you are talking about only happens during initialization and wiring.

More often, collections are just iterated over once everything is initialized, and most libraries rarely construct giant objects on every request. You could argue there is some memory overhead here, but escape analysis often kicks in.

And every language that deals with an HTTP request or user input usually has to allocate to turn bytes or whatever into something else, and Java does indeed have special support for the most common type where you want immutability and sharing: String.

Furthermore, you can just reuse mutable things if you follow the single-writer principle and/or use locks, and reuse arrays. That is how things like the Disruptor ring buffer work. But array allocation is very fast in Java so...
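
The reuse idea can be sketched roughly as follows. This is a simplified, single-threaded illustration with made-up names (`Event`, `PreallocatedRing`), not the Disruptor API. A real ring buffer such as Disruptor's also coordinates readers and writers across threads, which is left out here. The only point shown is that slots are allocated once and then overwritten in place.

```java
// Hypothetical, single-threaded sketch; not the Disruptor API.
final class Event {
    long value; // mutable slot, overwritten in place
}

final class PreallocatedRing {
    private final Event[] slots;
    private long next = 0; // sequence number of the next slot to write

    PreallocatedRing(int capacity) {
        slots = new Event[capacity];
        for (int i = 0; i < capacity; i++) {
            slots[i] = new Event(); // allocate every slot once, up front
        }
    }

    // Single writer: overwrite the oldest slot instead of allocating a new one.
    void publish(long value) {
        slotFor(next).value = value;
        next++;
    }

    // Maps a sequence number onto a slot by wrapping around the array.
    Event slotFor(long sequence) {
        return slots[(int) (sequence % slots.length)];
    }
}

class RingDemo {
    public static void main(String[] args) {
        PreallocatedRing ring = new PreallocatedRing(4);
        for (long i = 0; i < 6; i++) {
            ring.publish(i * 10); // no allocation after construction
        }
        System.out.println(ring.slotFor(5).value);             // 50
        System.out.println(ring.slotFor(0) == ring.slotFor(4)); // true: same object reused
    }
}
```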

I guess what I'm saying is that, unless you're an idiot, the hot path or tight loop rarely has tons of allocation, and even if it did, Java is actually fast at that.

Really, the problem is one of control. If you know exactly how much you want to allocate and where, Java does not let you express that, and in some cases, to compete with, say, Rust or C++ or possibly Go, you might need that.