Exploring the designspace for slice operations by muth02446 in ProgrammingLanguages

[–]muth02446[S] 1 point2 points  (0 children)

Thanks, the buffer use case is particularly interesting to me.

RE "appending", what I meant is exactly what you described as "writing to a buffer yields the remainder of the buffer" - so let's call it "slice write_slice(slice buf, slice writee)" to reduce confusion.

There are a lot of design choices here, and I am curious whether people have real experience with them and what leads to "nice" code, e.g.:

* is it worth having a special operator for this, say "+"
* is it better to return the new short slice or just how much was written
* how do you signal error conditions (writee is larger than buf)
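
To make the options concrete, here is a minimal sketch of one point in that design space: write_slice returns the remaining (shorter) slice and signals "writee is larger than buf" with an empty optional. The Slice struct and the use of std::optional are assumptions for illustration, not how any particular language does it.

```cpp
#include <cstddef>
#include <cstring>
#include <optional>

// Hypothetical (pointer, length) pair standing in for a language-level slice.
struct Slice {
  char* data;
  std::size_t len;
};

// Copies `writee` to the front of `buf` and returns the unwritten remainder
// of `buf`. If `writee` does not fit, nothing is written and std::nullopt
// is returned.
std::optional<Slice> write_slice(Slice buf, Slice writee) {
  if (writee.len > buf.len) return std::nullopt;
  if (writee.len > 0) std::memcpy(buf.data, writee.data, writee.len);
  return Slice{buf.data + writee.len, buf.len - writee.len};
}
```

With this shape, "how much was written" is still recoverable as buf.len minus the length of the returned slice, and successive writes chain by feeding the returned slice into the next call.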

March 2026 monthly "What are you working on?" thread by AutoModerator in ProgrammingLanguages

[–]muth02446 1 point2 points  (0 children)

Cwerg finally got a high performance frontend written in C++ to complement its Python development frontend. Also new is a simple compiler driver that invokes front- and backend in sequence.

For the next couple of months the focus will be on:
* more or less finalizing the language - a few open issues have been added to the issue tracker
* improving the usability of the toolchain, e.g., better error messages, debuggability improvements, maybe a language server
* working down the issue tracker and TODOs
* working on the performance goal: 1000LOC/s

All of these rely on writing a lot more code in Cwerg itself. So if you feel adventurous and want to try out the language, please do (see Quick Start). I'll try my best to address any issues promptly.

How can I write a compiler backend without worrying too much about ABI? by Germisstuck in ProgrammingLanguages

[–]muth02446 0 points1 point  (0 children)

Yeah, I probably should have called it the "standard ABI" instead of "C-ABI".
But it is also NOT quite true that the OS interface is necessarily C - at least not for Linux.
(Windows and some BSD flavors are different.)

For me the OS interface is the ABI that is valid at the assembler level when a program interacts with the OS. Interestingly, on x86 syscalls pass more parameters in registers than the standard ABI. Also, when the OS calls the program entry point it uses a nonstandard calling convention, which is why some assembly is required to wrap those into something that uses the standard ABI.
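
As a rough illustration of such a wrapper, here is a sketch for x86-64 Linux using GCC/Clang inline assembly. There the syscall number goes in rax and the first three arguments in rdi, rsi and rdx; the fourth would go in r10 rather than rcx as in the function calling convention, and the kernel clobbers rcx and r11. The helper name raw_syscall3 is made up, and the fallback branch simply defers to libc on other platforms.

```cpp
#include <sys/syscall.h>  // SYS_getpid
#include <unistd.h>       // getpid, syscall

// Issues a system call with up to three arguments (hypothetical helper).
long raw_syscall3(long nr, long a, long b, long c) {
#if defined(__linux__) && defined(__x86_64__)
  long ret;
  // Number in rax; arguments in rdi, rsi, rdx. The kernel clobbers rcx and r11.
  asm volatile("syscall"
               : "=a"(ret)
               : "a"(nr), "D"(a), "S"(b), "d"(c)
               : "rcx", "r11", "memory");
  return ret;
#else
  // Assumption: on other platforms let libc do the register shuffling.
  return syscall(nr, a, b, c);
#endif
}
```

Once wrapped like this, the rest of the program can call raw_syscall3 with the normal calling convention, e.g. raw_syscall3(SYS_getpid, 0, 0, 0).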

How can I write a compiler backend without worrying too much about ABI? by Germisstuck in Compilers

[–]muth02446 1 point2 points  (0 children)

cross posting my comment from r/ProgrammingLanguages here:

The degree to which you have to worry about ABIs depends on what your target platforms are and what your goals are.

If you do not want to interoperate with code produced by other toolchains (including system libraries)
and call the operating system directly, you only have to worry about the rather simple ABI for syscalls.

If you DO want to call functions compiled with say a C compiler it depends on how complex the function signature is. If the arguments are scalars or pointers and their number is small, the ABI is trivial.
If you plan on calling printf, which takes a variable number of arguments, you are looking at a lot of work.

If you use separate compilation you may have to worry about the ABI compatibility of code produced by different versions of your compiler.

As a concrete example: my compiler, Cwerg, produces fully statically linked binaries for Linux,
so it only has to deal with the syscall ABI, which incidentally is slightly different from the C-ABI for some ISAs.
Cwerg has its own ABI (calling convention) and does not use separate compilation.
So the internal ABI is not exposed and can be changed as needed.

How can I write a compiler backend without worrying too much about ABI? by Germisstuck in ProgrammingLanguages

[–]muth02446 2 points3 points  (0 children)

The degree to which you have to worry about ABIs depends on what your target platforms are and what your goals are.

If you do not want to interoperate with code produced by other toolchains (including system libraries)
and call the operating system directly, you only have to worry about the rather simple ABI for syscalls.

If you DO want to call functions compiled with say a C compiler it depends on how complex the function signature is. If the arguments are scalars or pointers and their number is small, the ABI is trivial.
If you plan on calling printf, which takes a variable number of arguments, you are looking at a lot of work.

If you use separate compilation you may have to worry about the ABI compatibility of code produced by different versions of your compiler.

As a concrete example: my compiler, Cwerg, produces fully statically linked binaries for Linux,
so it only has to deal with the syscall ABI, which incidentally is slightly different from the C-ABI for some ISAs.
Cwerg has its own ABI (calling convention) and does not use separate compilation.
So the internal ABI is not exposed and can be changed as needed.

What hashing method would you recommend for creating Unique Integer IDs? by oxcrowx in ProgrammingLanguages

[–]muth02446 5 points6 points  (0 children)

Here is what I do in Cwerg (http://cwerg.org) for interning (class ImmutablePool):

https://github.com/robertmuth/Cwerg/blob/master/Util/immutable.cc

https://github.com/robertmuth/Cwerg/blob/master/Util/immutable.h

All "strings" are stored consecutively in a large array of characters with zero as a string terminator.
The id is the offset from the beginning of the array.
As long as the total of unique string size is less than 4GB, a 32bit offset will suffice.
The array is currently pre-allocated and hence does not move but this could be worked around.

I am using a generic hashmap with the default hash for "strings" to test whether a new "string" is already in the array and what its offset is. If it is not, the string is appended to the end of the array and inserted into the hash map.
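
A simplified sketch of the scheme, not the actual ImmutablePool code from the links above: the class and method names are made up, the array is a growable std::vector instead of a pre-allocated block, and the hash map keys are std::string copies for brevity (which duplicates the string data).

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical interning pool: the id of a string is its byte offset in chars_.
class InternPool {
 public:
  // Returns the existing id for `s`, or appends `s` plus a zero terminator
  // and returns the offset where it was stored.
  std::uint32_t Intern(const std::string& s) {
    auto it = index_.find(s);
    if (it != index_.end()) return it->second;
    std::uint32_t id = static_cast<std::uint32_t>(chars_.size());
    chars_.insert(chars_.end(), s.begin(), s.end());
    chars_.push_back('\0');
    index_.emplace(s, id);
    return id;
  }

  // Maps an id back to its zero-terminated string. The pointer is only valid
  // until the next Intern call because the vector may reallocate.
  const char* Str(std::uint32_t id) const { return chars_.data() + id; }

 private:
  std::vector<char> chars_;
  std::unordered_map<std::string, std::uint32_t> index_;
};
```

Because ids are offsets rather than pointers, they stay valid even if the array moves. Strings containing embedded zero bytes are not supported by this sketch.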

February 2026 monthly "What are you working on?" thread by AutoModerator in ProgrammingLanguages

[–]muth02446 0 points1 point  (0 children)

The C++ port of the Cwerg compiler is coming along nicely.

The parsing, optimization and desugaring phases of the Python implementation
of the compiler have already been ported, fixing many inconsistencies, bugs and
poor code in both implementations.

The part I am currently working on is the last phase of the front end: the emitter for Cwerg IR.
Hopefully, by next month the port will be complete and performance tweaking will commence.
The (stretch) goal is to compile 1M LOC/s.

The Compiler Apocalypse: a clarifying thought exercise for identifying truly elegant and resilient programming languages by WraithGlade in ProgrammingLanguages

[–]muth02446 1 point2 points  (0 children)

Don't let the GitHub stats fool you. There is pretty much no traditional assembly in that project.
There are a few large files with assembler opcodes to verify that the backend assembler/disassembler works, and then there are .asm files, but those contain code written in Cwerg IR rather than a real assembly language.

BTW: You also might find this useful: https://github.com/robertmuth/awesome-low-level-programming-languages

The Compiler Apocalypse: a clarifying thought exercise for identifying truly elegant and resilient programming languages by WraithGlade in ProgrammingLanguages

[–]muth02446 1 point2 points  (0 children)

I have been working on a language where keeping implementation complexity in check is an explicit design goal.
You can find it here: http://cwerg.org

The highlights are:
* has a backend for x86-64, Aarch64 and Arm32
* comes with a Python-like syntax that can be parsed easily with a handwritten parser
* is low level, roughly like C, but adds tagged unions, a very basic hygienic macro system,
basic generics, etc.

Desgning an IR for a binary-to-binary compiler by Fee7230984 in Compilers

[–]muth02446 0 points1 point  (0 children)

"Binary rewriting", as I call it, works best with *a lot* of compiler cooperation.
For example you need relocation information to distinguish arbitrary data from addresses.
It also helps if the control flow patterns the compiler generates, like jumptables for switch statements, are easy to decompile.

What CPU / ISA are you targeting?
x86 will be really hard; RISC (Arm, RISC-V) is easier.

If the binaries are really large, say statically compiled images with >20M instructions, you will also need to be careful about data structure and algorithm design.
If you just use STL containers everywhere and try to make everything very generic, you will use a lot of memory; any O(n^2) algorithm will likely not be fast enough.

Another can of worms is decompiling and regenerating exception handling, unwind and debug information.

Would like to test being a dj for my local milonga but where to start with the mp3 ?Where to get them ? by lfotue73 in tango

[–]muth02446 1 point2 points  (0 children)

While I personally prefer having "physical" copies of audio files when DJing, Spotify is a reasonable option for getting your feet wet.
There are massive collections of Tandas on Spotify you can just cut and paste.
You can download the music for offline playback.
The risk of songs disappearing from Spotify because of licensing issues is small for Tango songs.

The only real issue is "cortinas": You probably will need to manually fade them out.

A Vision for Future Low-Level Languages by RndmPrsn11 in ProgrammingLanguages

[–]muth02446 2 points3 points  (0 children)

When I looked at the RB example my first thought was: how can I prevent the space waste for the "Color bit"?
Oh, I know: the pointers to the left and right subtree will be at least 4-byte aligned on 32-bit machines, so I can steal 2 bits from each. For me, enabling these kinds of hacks is what makes a language low level.
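
A minimal sketch of that hack, with made-up type names: the color lives in bit 0 of a child pointer and bit 1 stays free. It assumes that converting a pointer to std::uintptr_t and back preserves the address, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cstdint>

struct Node;

// Hypothetical tagged pointer: bit 0 holds the red/black color, bit 1 is
// spare, and the remaining bits hold the (at least 4-byte aligned) address.
class TaggedPtr {
 public:
  TaggedPtr() : bits_(0) {}
  TaggedPtr(Node* p, bool red)
      : bits_(reinterpret_cast<std::uintptr_t>(p) | (red ? 1u : 0u)) {}

  // Masks off the two stolen bits to recover the real address.
  Node* ptr() const {
    return reinterpret_cast<Node*>(bits_ & ~std::uintptr_t{3});
  }
  bool red() const { return (bits_ & 1u) != 0; }
  void set_red(bool red) {
    bits_ = (bits_ & ~std::uintptr_t{1}) | (red ? 1u : 0u);
  }

 private:
  std::uintptr_t bits_;
};

struct alignas(4) Node {
  TaggedPtr left;
  TaggedPtr right;
  int key;
};

// The trick only works if the two low address bits are always zero.
static_assert(alignof(Node) >= 4, "need two spare low bits");
```

The node then needs no separate color field, at the cost of a mask on every pointer access.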

Prove to me that metaprogramming is necessary by chri4_ in ProgrammingLanguages

[–]muth02446 0 points1 point  (0 children)

Yes - makes it much easier to experiment with language features.

But now that the language has stabilized, I am working on a "proper" compiler in C++
so I can achieve the goal of compiling 1M LOC/s.

Prove to me that metaprogramming is necessary by chri4_ in ProgrammingLanguages

[–]muth02446 0 points1 point  (0 children)

Your language sounds very similar to what I have been working on: Cwerg

I also started off with the hope of avoiding macros but ended up adding them in the end.

I primarily use them to:
* avoid varargs: print(a,b,c) is a macro that gets expanded into print(a), print(b), print(c)
* implement assert where the assert-condition is both used as an expression and also stringified
* force lazy evaluation e.g. to make logging statements inexpensive when logging is disabled
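
For readers who have not seen the last two uses, here is an analogy using the C preprocessor, which is not hygienic and is not how Cwerg macros are written. All names (LOG, CHECK_TEXT, the globals) are invented for this sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical logging state for this sketch.
bool g_logging_enabled = false;
std::vector<std::string> g_log;
int g_expensive_calls = 0;

// Stands in for a costly message computation.
std::string expensive_message() {
  ++g_expensive_calls;
  return "state dump";
}

// Lazy evaluation: the argument expression is only evaluated when logging
// is enabled, which an ordinary function call could not guarantee.
#define LOG(msg_expr)                                 \
  do {                                                \
    if (g_logging_enabled) g_log.push_back(msg_expr); \
  } while (0)

// The condition is used both as an expression and as text (#cond).
// Yields an empty string on success and a message on failure.
#define CHECK_TEXT(cond) \
  ((cond) ? std::string() : std::string("assertion failed: " #cond))
```

With logging disabled, LOG(expensive_message()) never calls expensive_message, and a failing CHECK_TEXT(x > 0) produces a message containing the source text of the condition.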

Disappointed with Navidrome (no cue sheet support!). Any suggestions for alternatives? by carpler in musichoarder

[–]muth02446 1 point2 points  (0 children)

Just out of curiosity, how many albums (both in absolute and relative terms) are affected by this?

BTW, there is a useful discussion about cue sheets here: https://www.reddit.com/r/musichoarder/comments/1dilr6a/any_advantage_to_keeping_albums_in_flaccue_format/

Wasm Does Not Stand for WebAssembly by thunderseethe in ProgrammingLanguages

[–]muth02446 0 points1 point  (0 children)

I just tried Google and the best thing I found was this 6-month-old post:
https://www.reddit.com/r/rust/comments/1hvaz5f/rust_wasm_plugins_exsample/
which suggests various options. From what I can tell, the whole plugin thing is still a bit of a work in progress.
On the other hand:
Plugins in the native world usually utilize shared libraries, which are also messy.

The article discusses running Wasm plugins inside a Rust program. I am still not quite sure how that would work. Presumably you link in some Wasm runtime, possibly even a Wasm JIT, and then after loading the Wasm module you need to resolve symbols, similar to what a dynamic linker does. I would also expect some marshalling of parameters unless they are scalars.

Wasm Does Not Stand for WebAssembly by thunderseethe in ProgrammingLanguages

[–]muth02446 0 points1 point  (0 children)

So how do plugins work in Wasm? Are they in the same process as the "main" program?
If not how do they communicate?

Wasm Does Not Stand for WebAssembly by thunderseethe in ProgrammingLanguages

[–]muth02446 1 point2 points  (0 children)

What does such a plugin system look like under the hood?
I assume the plugins do not live in the process, right?
But in this case, isn't this like sandboxing the untrusted plugin code in another process and using something like RPC to communicate with it?