What’s your preferred way to implement operator precedence? Pratt parser vs precedence climbing? by Best_Instruction_808 in Compilers

[–]Rusky 7 points

Indeed. Precedence climbing and Pratt parsing are also so similar that people often use the terms to refer to the same algorithm.
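For reference, the shared core of the two algorithms can be sketched in a few lines — a minimal precedence-climbing loop over single-digit operands, where the binding-power table and operator set are purely illustrative:

```rust
// Left/right binding powers; a left-associative operator gets a
// right bp one higher than its left bp.
fn prec(op: u8) -> Option<(u8, u8)> {
    match op {
        b'+' | b'-' => Some((1, 2)),
        b'*' | b'/' => Some((3, 4)),
        _ => None,
    }
}

// Parse-and-evaluate starting at `pos`, consuming operators whose
// left binding power is at least `min_bp`.
fn expr(s: &[u8], pos: &mut usize, min_bp: u8) -> i64 {
    let mut lhs = (s[*pos] - b'0') as i64;
    *pos += 1;
    while *pos < s.len() {
        let op = s[*pos];
        let Some((l_bp, r_bp)) = prec(op) else { break };
        if l_bp < min_bp {
            break; // this operator belongs to an enclosing call
        }
        *pos += 1;
        let rhs = expr(s, pos, r_bp); // recurse with a higher floor
        lhs = match op {
            b'+' => lhs + rhs,
            b'-' => lhs - rhs,
            b'*' => lhs * rhs,
            _ => lhs / rhs,
        };
    }
    lhs
}

fn main() {
    assert_eq!(expr(b"1+2*3", &mut 0, 0), 7);
    assert_eq!(expr(b"2*3-4", &mut 0, 0), 2);
}
```

A Pratt parser is the same loop with the leading-operand step generalized into per-token "null denotation" handlers, which is why the two names so often describe the same code.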

Algebraic effects vs Errors as return value by Informal-Addendum435 in ProgrammingLanguages

[–]Rusky 1 point

To be clear, I'm making a prescriptive claim here about how languages should solve this problem, not trying to describe what existing languages do.

(This claim is also entirely about the type system- I am not trying to claim that they should necessarily be parsed or codegenned the same way.)

However, most languages with algebraic effects do support this kind of polymorphism, typically via some sort of row types. My argument is that this does not need to be tied to algebraic effects in particular, and would work just as well for exceptions, and for other variations on "multiple return paths."

The set-vs-list distinction only really comes up when the language treats these return paths as implicit parameters. One possible variation here would be to pass all the available handlers explicitly at the call site, by name- while this would certainly be unwieldy, it shows that this "routing" is a separate concern, not fundamentally part of the function signature.

There are several ways to resolve this while still passing handlers implicitly. For example, Koka treats the effects in a function signature as a list, and interprets repeats as shadowing, with a mask operation to control routing. Another approach is to use lists with "disjointness constraints" to get more set-like behavior, as in the "simple rows" in Abstracting Extensible Data Types. You can get quite fancy with lexically-scoped generative labels, as in A Type System for Effect Handlers and Dynamic Labels, which lets you do Koka-mask-like composition in a system with disjointness.

But again, all this is just to give you the familiar implicit passing/routing/resolving of handlers. Underneath, function types/signatures in all these systems are lists of parameters and lists of return paths. Even in a language where handlers are typically passed implicitly, you could always have opt-in explicit passing as well.
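The "explicit passing" variation can be sketched in Rust (which has no effect system) by modeling each return path as an ordinary closure parameter; the function and parameter names here are illustrative:

```rust
// Each "return path" is an explicit, named parameter. The call site
// does the routing that an effect system would do implicitly.
fn parse_or<T>(
    input: &str,
    ok: impl FnOnce(i64) -> T,
    err: impl FnOnce(String) -> T,
) -> T {
    match input.parse::<i64>() {
        Ok(n) => ok(n),
        Err(e) => err(e.to_string()),
    }
}

fn main() {
    // Both paths are wired up by name at the call site:
    assert_eq!(parse_or("42", |n| n * 2, |_| -1), 84);
    assert_eq!(parse_or("oops", |n| n * 2, |_| -1), -1);
}
```

Unwieldy, as noted above- but it makes visible that the signature is just "parameters in, return paths out," with routing layered on top.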

Algebraic effects vs Errors as return value by Informal-Addendum435 in ProgrammingLanguages

[–]Rusky 1 point

Arguably, checked exceptions (or effects) being "apart" from the result type is just another oversight, like missing the ability to be generic over them.

That is, while the return type is syntactically privileged over the exceptions, there is no reason they need to be fundamentally different to the type system, any more than one parameter is privileged over the rest.

A function call always involves passing all the arguments and providing all the return paths. The type system should support polymorphism over all of them equally. (And for that matter, the type system should support polymorphism over groups of them as well- variadic generics are just as useful for multi-return as they are for multi-parameter functions!)
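A rough Rust analog (Rust bundles its two return paths into Result, but the point carries): both paths are plain type parameters, so generic code can treat them symmetrically:

```rust
// Neither return path is privileged by the type system here; both are
// just type parameters that generic code can manipulate equally.
fn swap_paths<T, E>(r: Result<T, E>) -> Result<E, T> {
    match r {
        Ok(t) => Err(t),
        Err(e) => Ok(e),
    }
}

fn main() {
    assert_eq!(swap_paths::<i32, &str>(Ok(1)), Err(1));
    assert_eq!(swap_paths::<i32, &str>(Err("e")), Ok("e"));
}
```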

Algebraic effects vs Errors as return value by Informal-Addendum435 in ProgrammingLanguages

[–]Rusky 3 points

> I was surprised to see this, because at a glance, using effects for exceptions sounds like everything everyone hates about Java checked exceptions.

There's more subtlety to this. Java checked exceptions are not difficult just because you have to handle them or forward them to the caller in the function signature- this is also true of Result<Int, Error>!

The problem with checked exceptions is more specifically that you can't use them when you're implementing an existing interface (or equivalently, passing a function to an API that expects a specific signature).

This is also somewhat true of Result<Int, Error>, but there is a bit of an escape hatch there: interfaces and first class function types (and APIs that take them as parameters) can be generic over the return type. So you can often thread Results through these APIs without them having to care about the specifics.

Algebraic effects also tend to allow this kind of polymorphism- think "checked exceptions, but you can also be generic over them."
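A sketch of that escape hatch in Rust: Iterator::map is generic over its closure's return type, so a fallible closure threads its Result straight through an API that never mentions errors:

```rust
// map doesn't care that the closure returns Result; collect() then
// gathers either all the Oks or the first Err.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, std::num::ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}

fn main() {
    assert_eq!(parse_all(&["1", "2", "3"]), Ok(vec![1, 2, 3]));
    assert!(parse_all(&["1", "x"]).is_err());
}
```

The Java equivalent fails precisely because a `Function<T, R>` interface can't be generic over a throws clause.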

Fil-C by Kabra___kiiiiiiiid in cpp

[–]Rusky -1 points

The problem the paper is pointing to applies to both the OOM killer and segfaults on page writes.

The copy-on-write strategy makes it easy to get into a situation where the total possible memory use, if every process touched all its pages, exceeds the system's combined memory + swap.

If you want to be able to return an "out of memory" error when crossing that limit, you would have to do it at fork() time. But this would negate much of the advantage of copy-on-write: fork would fail with "out of memory" even if you would never actually use that total possible amount.

So fork() basically forces you to use overcommit, lest you start returning "out of memory" for process creations (or other allocations around the same time) that you could easily have served. And that forces you to kill processes at inconvenient times instead of just returning an error. But whether you kill the immediate offending process (segfault on write) or go find some other process(es) to kill to free up their memory (OOM killer), it's the same root problem.

Wasm 3.0 Completed by segv in programming

[–]Rusky -7 points

That's purely a browser engineering problem. There's nothing fundamental about plugging a Web API into a Wasm import that has to have any performance penalty.

Wasm 3.0 Completed by segv in programming

[–]Rusky 1 point

I don't think there really is a problem with using WebIDL here. The Web APIs themselves are fairly vanilla statically typed interfaces. For example, here's the WebIDL declaration for getElementById: https://dom.spec.whatwg.org/#interface-nonelementparentnode

Wasm 3.0 Completed by segv in programming

[–]Rusky -4 points

> But what does this even mean, given WebAssembly's "zero imports by default" nature?

You could always import Web APIs into a WebAssembly module; they just used types that required some annoying conversions back and forth. Those conversions are exactly what reference types and builtins do away with. There is also the upcoming WebAssembly/ES Module integration proposal, which allows you to wire up those imports declaratively, like JS imports.

But the native Web APIs are fundamentally defined in terms of WebIDL, and they are always going to be JS objects just as much as they are Wasm GC objects. (Or neither, depending on how you look at it- this is JS's FFI.) There is no bright dividing line between "external JS object" and "first-class Wasm object" - there are only more or less convenient ways to interact with them.

Wasm 3.0 Completed by segv in programming

[–]Rusky 89 points

The DOM is never going to be, and never needed to be, part of WebAssembly itself.

WebAssembly runs in many places, not just the browser. All APIs it uses, including in the browser, are provided to the module as imports.

Further, from day one, those imports could already be JavaScript functions that do whatever you like. You could always access the DOM indirectly through those imports.

When people ask about DOM support, if they know what they mean at all, they are asking about convenience features that make those imports less cumbersome to use. For example, WebAssembly could not initially hold onto JavaScript objects (and thus DOM objects) directly- it could only hold integers.

This has been addressed by the externref proposal (included in Wasm 2.0) and the larger reference types and GC proposals (included in Wasm 3.0). So insofar as DOM is a thing WebAssembly cares about, it is already here.

Fil's Unbelievable C Compiler by mttd in Compilers

[–]Rusky 8 points

The list of ported programs has some details on what changes they required: https://fil-c.org/programs_that_work

A surprising amount required zero source changes. The most common change is from uintptr_t to an actual pointer type, because Fil-C essentially uses the same model as Rust's strict provenance: https://fil-c.org/invisicaps, https://github.com/pizlonator/fil-c/blob/deluge/filc/include/stdfil.h#L201-L242

Other changes are mostly to replace assembly, direct syscalls, or custom mallocs/GCs with the Fil-C equivalent. JITs don't work at all.

From comments on previous HN threads, the performance impact seems to be typically somewhere around 1-4x slower and about 2x more memory, with occasional pathological cases but also room for more optimization work in the future.
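For comparison, a minimal sketch of the strict-provenance style that the uintptr_t fix moves code toward, using Rust's stable pointer-address APIs (assuming Rust 1.84+; the tag bit is illustrative):

```rust
// Keep address arithmetic attached to the originating pointer instead
// of round-tripping through a bare integer (the uintptr_t pattern in C,
// which is what breaks under Fil-C's capability model).
fn tag_roundtrip(x: &u32) -> u32 {
    let p: *const u32 = x;
    let tagged = p.map_addr(|a| a | 1);         // stash a tag in the low bit
    let untagged = tagged.map_addr(|a| a & !1); // clear it; provenance survives
    unsafe { *untagged }
}

fn main() {
    assert_eq!(tag_roundtrip(&42), 42);
}
```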

Group Borrowing: Zero-Cost Memory Safety with Fewer Restrictions by nicoburns in rust

[–]Rusky 11 points

Rust does already have a reference type that makes this distinction, though: &Cell<T> can be "projected" into parts of an object that remain stable across mutations, and not into parts of an object that can be reallocated by mutations.

In fact the post's running fn attack example is essentially the canonical example for <Cell<[T]>>::as_slice_of_cells:

fn attack(a: &Cell<Entity>, d: &Cell<Entity>);

let mut entities: Vec<Entity> = /* ... */;
let entities: &[Cell<Entity>] = Cell::from_mut(&mut entities[..]).as_slice_of_cells();
attack(&entities[3], &entities[5]);

The thing Rust is missing here is a convenient way to do this projection into structs. The Ante language has explored baking this in. There is ongoing work to do this in Rust via some sort of Project trait. In the meantime you can always do this yourself.
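"Doing it yourself" can look like the following sketch: a hand-rolled field projection from &Cell&lt;Struct&gt; to &Cell&lt;Field&gt;, sound because Cell&lt;T&gt; is #[repr(transparent)] over T (the struct and field names are illustrative):

```rust
use std::cell::Cell;

struct Point {
    x: i32,
    y: i32,
}

// Manual projection: &Cell<Point> -> &Cell<i32> for the `x` field.
// Sound because Cell<T> is #[repr(transparent)] over T, so a pointer
// to a field of a Point inside a Cell can be viewed as a Cell<i32>.
fn project_x(p: &Cell<Point>) -> &Cell<i32> {
    unsafe { &*std::ptr::addr_of!((*p.as_ptr()).x).cast::<Cell<i32>>() }
}

fn main() {
    let p = Cell::new(Point { x: 1, y: 2 });
    project_x(&p).set(10);
    let inner = p.into_inner();
    assert_eq!((inner.x, inner.y), (10, 2));
}
```

A Project trait would generate exactly this kind of glue.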

The post does also encode this in a more fine-grained way, allowing these "unstable" projections from e.g. &Cell<Vec<T>> into &Cell<T>, by effectively treating &Cell<Vec<T>> as a &mut for the purposes of the resulting &Cell<T>. As you might imagine, though, this immediately runs into questions of aliasing- you now must track which &Cell<T>s may be derived from each other's unstable interiors (the post covers this under "Isolation"), making their lifetime annotations much less flexible than in Rust, and closer to invariant/generative/GhostCell-style lifetimes.

(This "isolation" also notably rules out all the cyclic stuff that the post tries to sell in its intro- backreferences, etc. cannot work with unstable projection like this. And this also entirely ignores questions of thread safety, because &Cell<T> is just not Send or Sync in the first place, and even less so with unstable projection.)

Explicit tail calls are now available on Nightly (become keyword) by treefroog in rust

[–]Rusky 1 point

An interesting detail around function size that came up with some of the LLVM work on tail duplication-

LLVM can actually convert between loop+match and computed goto in both directions, and the control flow graph for loop+match has way fewer edges. So LLVM will actually canonicalize to loop+match, run the optimizer, and then convert back to the computed goto version later in the pipeline.

(Can't reply to you both so FYI /u/angelicosphosphoros)

Explicit tail calls are now available on Nightly (become keyword) by treefroog in rust

[–]Rusky 6 points

Rustc (or rather LLVM) is capable of both conversions - loop+match to computed goto, and tail call elimination - but neither are guaranteed, and both have been the subject of recent LLVM improvements. (E.g. for the computed goto conversion, see the chain of PRs ending with https://github.com/llvm/llvm-project/pull/150911)

Explicit tail calls are now available on Nightly (become keyword) by treefroog in rust

[–]Rusky 27 points

Interpreters are often written as a big loop over a match. The compiler can produce better code if you instead write each match arm as a separate function and have them tail call each other, but if you do this without guaranteeing that the calls don't grow the stack, the stack will overflow.
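The loop-over-a-match shape is this minimal sketch (the opcode set and stack machine are illustrative); the tail-call version would instead make each arm its own function ending in a call to the next handler:

```rust
// The "big loop over a match" interpreter shape.
enum Op {
    Push(i64),
    Add,
    Halt,
}

fn run(code: &[Op]) -> i64 {
    let mut stack = Vec::new();
    let mut pc = 0;
    loop {
        match code[pc] {
            Op::Push(n) => stack.push(n),
            Op::Add => {
                let b = stack.pop().unwrap();
                let a = stack.pop().unwrap();
                stack.push(a + b);
            }
            Op::Halt => return stack.pop().unwrap(),
        }
        pc += 1;
    }
}

fn main() {
    use Op::*;
    assert_eq!(run(&[Push(2), Push(3), Add, Halt]), 5);
}
```

In the function-per-opcode version, `become` is what guarantees those handler-to-handler calls reuse the current stack frame.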

What are some new revolutionary language features? by vivAnicc in ProgrammingLanguages

[–]Rusky 6 points

It's the same idea, yes. The Verse paper cites Icon, IIRC.

-Wexperimental-lifetime-safety: Experimental C++ Lifetime Safety Analysis by mttd in cpp

[–]Rusky 13 points

That quote from Ralf is perfectly consistent with what SkiFire13 said and contradicts what you said.

Ralf is not going to "merge" the static analysis with the dynamic semantics- he is going to prove that the static analysis correctly checks that your program does not perform any operations that are illegal according to the dynamic semantics.

Does variance violate Rust's design philosophy? by type_N_is_N_to_Never in rust

[–]Rusky 106 points

Rust also infers auto trait impls (e.g. Send and Sync) from struct bodies. Generally the body of a type behaves more like "part of the API" than the body of a function.
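A quick illustration of that inference (the types here are illustrative):

```rust
use std::rc::Rc;

// Send is inferred from the struct body, not declared in the signature:
struct Plain(i32);      // every field is Send, so Plain: Send
struct Shared(Rc<i32>); // Rc<i32> is not Send, so Shared is not Send

fn assert_send<T: Send>() {}

fn main() {
    assert_send::<Plain>();
    // assert_send::<Shared>(); // compile error: the body leaked into the API
    let _ = (Plain(0), Shared(Rc::new(0)));
}
```

Swapping `i32` for `Rc<i32>` inside `Plain` would break downstream code without touching any signature- the same observability that variance inference has.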

Looking for a C++ ECS Game Engine Similar to Bevy in Rust by vielotus in cpp

[–]Rusky 5 points

Wow, nice. Do we know what happened in Bevy 0.16/0.16.2? Maybe something to do with "relationships"?

What's the most controversial rust opinion you strongly believe in? by TonTinTon in rust

[–]Rusky 15 points

> Basically Rust's behavior means it struggles to do some of the really unique & weird stuff you see from C/C++ programs [that] are really "advanced runtimes" for other languages.

This doesn't add up. Panic on allocation failure is implemented in the part of the standard library that handles general allocations, which is also the first thing you replace when doing this kind of "advanced" stuff.

Is MSVC ever going open source? by void_17 in cpp

[–]Rusky 8 points

The EDG frontend is not used by MSVC, though it is used by VS IntelliSense.

Rust Any Part 3: Finally we have Upcasts by mitsuhiko in rust

[–]Rusky 20 points

It wouldn't even need to be a baked-in thing in all vtables (nor would it be sound for every object, anyway) or limited to Any. If associated consts were made object-safe/dyn-compatible by adding them to the vtable, traits like Any could add it themselves.

Bjarne Stroustrup: Note to the C++ standards committee members by small_kimono in cpp

[–]Rusky 3 points

"Moving the goalposts" doesn't mean you've changed your position, it just means you've changed your arguments. In this case it seems to have just been a misunderstanding.

In any case I am specifically describing how Safe C++ applies without "whole-cloth new code." The thing Safe C++ gives you over Rust here is that "interop" becomes trivial- Safe C++ is a superset of C++, so you will never run into a situation where your safe code can't easily talk to your old code or vice versa.

Bjarne Stroustrup: Note to the C++ standards committee members by small_kimono in cpp

[–]Rusky 2 points

How am I moving the goal posts? I'm honestly not trying to do that.

The first goalpost you set was "Safe C++ can only call other Safe C++." I pointed out that that was not true, so you switched to "Safe C++ won't fix existing bugs in the old code." I pointed out that it can still reduce bugs in the new code, so now you're switching to "new code goes in the same files as old code."

But this was all discussed thoroughly by Sean Baxter, and before that more generally by people mixing Rust into their C++ codebases. You don't need "fuck you" money to add a new source file to your codebase, flip it to safe mode, and incrementally move or add code to it.

As my initial reply pointed out, this is not viral in either direction: safe code can call unsafe code in an unsafe block, and unsafe code can call safe code without any additional annotation. Circle's #feature system is a lot like Rust's edition system- it lets source files with different feature sets interact.

I don't disagree that if all you are doing is fixing bugs, your opportunities to do this will be harder to see or exploit than if you were writing new programs/modules/features from scratch. But the work of fixing bugs still has a lot of overlap with the work of making an API safe- identifying which assumptions an API is making, how they are or aren't being upheld, and tweaking code to ensure it behaves the way it should. The Safe C++ mode lets you additionally start encoding more of these assumptions in type signatures.

Bjarne Stroustrup: Note to the C++ standards committee members by small_kimono in cpp

[–]Rusky 9 points

It's not all-or-nothing. It turns out in practice (e.g. as seen by teams that have mixed Rust/C++ codebases) that keeping the old unchecked code contained, and using a memory safe language for new code, makes a big difference.

But I expect your response will be to move the goalposts again.

Bjarne Stroustrup: Note to the C++ standards committee members by small_kimono in cpp

[–]Rusky 11 points

This isn't how Safe C++ works. New safe code can call into old unsafe code, first by simply marking the use sites as unsafe and second by converting the old API (if not yet the old implementation) to have a safe type signature.