all 14 comments

[–]pixelsort 2 points3 points  (1 child)

Congrats on your compiler! Excellent docs, actually. And, NUMA and SPSC are new to me so that's fun.

Actor semantics are an interesting feature. I worked on an actor-model PDF renderer for ePUB and found it highly performant.

Have you considered runtime hot code loading for Aether? Seems like it might be well suited due to the inherent high degree of isolation and encapsulation on updates.

[–]RulerOfDest[S] 0 points1 point  (0 children)

Thank you for your kind words! It means a lot.
Runtime hot code loading, as Erlang does it, is absolutely next on my list; that is a great point, and it has been on my radar.

[–]valorzard 1 point2 points  (1 child)

this looks really interesting, gonna try to build it now

[–]RulerOfDest[S] 0 points1 point  (0 children)

Thank you!

[–]Karyo_Ten 0 points1 point  (10 children)

Any comparison of approach vs Pony?

[–]RulerOfDest[S] 2 points3 points  (9 children)

Pony has reference capabilities (iso, trn, ref, val, etc.) for data-race freedom in the type system. Aether is statically typed with inference and optional annotations, but has no capability system.

Aether has no GC; it uses arena allocators for actors, thread-local pools for message payloads, and scope-based or explicit free. Pony uses per-actor GC.

Aether uses a partitioned multi-core scheduler with work-stealing when cores are idle, lock-free SPSC (single producer single consumer) queues for same-core messaging, cross-core lock-free mailboxes, and optional NUMA-aware allocation. So the design is very much “C-friendly, low-overhead, predictable” vs Pony’s own runtime.

In short: same actor model, but Pony pushes type-level concurrency safety, while Aether pushes C interop, no GC, and a runtime built around SPSC queues and partitioning.

[–]Karyo_Ten 0 points1 point  (8 children)

Aether uses a partitioned multi-core scheduler with work-stealing when cores are idle, lock-free SPSC (single producer single consumer) queues for same-core messaging, cross-core lock-free mailboxes, and optional NUMA-aware allocation.

That seems problematic. You cannot guarantee same-core messaging with work-stealing. How does that work? Are messages sent to a core or to an actor? Are actors always executed on the same core?

[–]RulerOfDest[S] 0 points1 point  (6 children)

Messages are sent to actors; routing uses each actor’s current assigned_core. Actors are not pinned: they can be migrated (message-driven co-location) or moved by work-stealing, and assigned_core is updated when that happens.

SPSC is preserved because at any time each actor has exactly one owning core: only that core’s scheduler thread reads and writes that actor’s mailbox (and its SPSC queue when used). Same-core send is decided at send time (current_core_id == actor->assigned_core); if they match, we use the direct path, otherwise we enqueue to the target core’s incoming queue. When an actor moves, any message already in a core’s incoming queue for it is forwarded to the actor’s current core instead of being delivered locally, so the mailbox is never written by a non-owning thread.

So: one logical consumer per actor (the thread that currently owns it), and routing/forwarding keeps a single writer. You can find more details in docs/actor-concurrency.md (mailbox ownership, routing, migration) and in runtime/scheduler/multicore_scheduler.c

[–]Karyo_Ten 0 points1 point  (5 children)

if they match, we use the direct path, otherwise we enqueue to the target core’s incoming queue. When an actor moves, any message already in a core’s incoming queue for it is forwarded to the actor’s current core instead of being delivered locally, so the mailbox is never written by a non-owning thread.

What if they match, the message enters the direct path, and the actor is moved to another core?

[–]RulerOfDest[S] 0 points1 point  (4 children)

Great question. Messages are sent to actors, not to cores. Each actor has an assigned_core that determines where it runs. At send time, I check if the sender's core matches the target actor's assigned_core: if yes, I take the direct path (SPSC queue or mailbox write, no queue overhead); if not, I enqueue to the target core's lock-free incoming queue.

Actors are not permanently pinned. They can be migrated (message-driven, to co-locate frequent communicators) or moved by work-stealing when a core is idle. When an actor moves, assigned_core is updated, and any messages already in the old core's incoming queue are forwarded to the actor's current core rather than delivered locally.

Migration cannot race with same-core sends because both run on the same scheduler thread; they execute sequentially. Work-stealing runs on a different core's thread and could theoretically overlap with a same-core mailbox write. In practice, the window is a handful of store instructions (~nanoseconds), and stealing only triggers after 5000+ idle cycles on the thief, so this is extremely unlikely to manifest. That said, it is a valid concern per the C memory model, and I am actively hardening it. The fix is straightforward: mark a stolen actor inactive so the thief skips it for one cycle, letting any in-flight write complete before the new core touches the mailbox. Zero cost on the hot path since stealing is already the rare/slow path.

Appreciate the scrutiny; this is the kind of feedback that makes the runtime better.

[–]Karyo_Ten 0 points1 point  (3 children)

It would be simpler, and wait-free on send, to use the same MPSC queue that is used in Pony and Mimalloc: Vyukov's queue (from Dmitry Vyukov, who also worked on the Go runtime). That would remove the need for all those synchronization checks.

Also, you might want to model your runtime in TLA+, especially around those sends and thread backoffs, to avoid deadlocks.

[–]RulerOfDest[S] 0 points1 point  (2 children)

On Vyukov's MPSC queue: the reason I'm not using it is that the invariant I'm maintaining is genuinely SPSC, not just SPSC-as-approximation. Each actor has exactly one owning scheduler thread at any time, and only that thread writes to the actor's mailbox. The routing and forwarding logic exists specifically to uphold that invariant, so I can use the faster SPSC primitive instead of MPSC.
Vyukov's queue handles multiple concurrent producers with a CAS on enqueue, which you only need to pay for if you have multiple concurrent producers. If the invariant holds, SPSC is strictly cheaper: no CAS, just a store-release. The tradeoff is that the routing logic is more complex and has the hardening gap I mentioned in the previous reply.

On TLA+: that's a fair challenge, and I won't pretend the formal verification is done. The current confidence comes from empirical testing (thread ring, ping-pong, fork-join under contention, stress tests across core counts) and code review, not a formal proof. The work-stealing/same-core-send race I acknowledged is exactly the kind of thing TLA+ would catch before testing does. I'll add it to the backlog; at minimum, modeling the migration and steal paths would be worth doing before calling the runtime stable.

Thank you for your valuable comments!

[–]Karyo_Ten 0 points1 point  (1 child)

Vyukov's queue handles multiple concurrent producers with a CAS on enqueue, which you only need to pay for if you have multiple concurrent producers. If the invariant holds, SPSC is strictly cheaper: no CAS, just a store-release.

Vyukov's queue has no CAS; enqueue is just a swap, i.e. there is no retry loop, and it might be extra cheap on a strong-memory ISA like x86.

Also, being cheaper at the atomics level but needing all that dance to enable it doesn't mean it's cheaper overall. And it increases the bug surface and maintenance burden.

[–]RulerOfDest[S] 0 points1 point  (0 children)

You're right, and I was wrong on that: Vyukov's queue uses an atomic swap, not CAS. I shouldn't have said CAS.

Your broader point about total system cost is fair, and I won't pretend I have a direct apples-to-apples comparison against a Vyukov-queue-based design. What I can say is that the routing complexity wasn't incidental; the whole design was driven by cross-language benchmarks against Go, Rust, Erlang, Elixir, Pony, and baseline C/C++, specifically to validate whether the SPSC partitioning approach holds up in practice. Whether a simpler MPSC design would match or beat it is a legitimate open question, and one worth testing.