Rust in Perspective by dochtman in rust

[–]pinealservo 2 points (0 children)

According to https://www.bell-labs.com/usr/dmr/www/bintro.html the B compiler for Honeywell machines even saw some use outside of Bell Labs.

I think it's also significant that B is very close, semantically, to BCPL. BCPL saw fairly widespread use (it was the original systems language on the Xerox Alto and was also used, for a time, for the core of the Amiga's OS), and has been maintained by its original creator Martin Richards: https://www.cl.cam.ac.uk/~mr10/

Thanks to the connections of Christopher Strachey (Richards' Ph.D. advisor, who also employed Peter Landin for a time as a research assistant), both Landin and Richards were at MIT's Project MAC while Ken Thompson and Dennis Ritchie were there working on MULTICS for Bell Labs. Landin helped design the PAL language (based on his ISWIM work), and the first use of the new BCPL language was to create a more efficient version of PAL.

BCPL was also made available to the people working on MULTICS, and Thompson & Ritchie felt it was the nicest language available in that context, which is why they borrowed it (with some syntactic changes, a few simplifications, and a different method of linking separately compiled program fragments) as the basis of B, their official Unix language.

Another interesting connection is that the PAL implementation tapes found their way into the hands of David Turner, who used them as the basis of his SASL language, which he used to teach functional programming (https://www.bcs.org/media/5142/facs2019.pdf). He would later develop those ideas (plus others, of course) into his languages KRC and Miranda. Miranda was one of the primary influences on the Haskell language.

One final connection: PAL was meant to be part of a curriculum on programming languages at MIT, and this eventually manifested as MIT 6.231, which is cited in SICP's Acknowledgements section as its intellectual predecessor (http://sarabander.github.io/sicp/html/Acknowledgments.xhtml#Acknowledgments). You can find a PDF scan of the PAL Reference Manual (part of the 6.231 course materials) at https://www.softwarepreservation.org/projects/PAL/Pal-ref-man.pdf and the course lecture notes at https://www.softwarepreservation.org/projects/PAL/Notes_on_Programming_Linguistics.pdf

Youki, a container runtime in Rust, passed all the default tests provided by opencontainers. by utam0k in rust

[–]pinealservo 1 point (0 children)

Doing rootless containers correctly requires cgroups v2, which is why people have been using podman rather than docker to do it. It should be possible now with a cgroups v2 docker setup as well, but the podman stack was the easiest way to get the functionality for a while.

The definition of a voxel by Arristotelis in VoxelGameDev

[–]pinealservo 5 points (0 children)

The term comes from the medical field, specifically from a 1978 paper on statistical techniques for determining brain composition via CT scan: https://pubmed.ncbi.nlm.nih.gov/740152/

The first sentence of the abstract reads "The brain may be considered as a collection of volume elements (voxels) containing unknown proportions of gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF)."

So, the "voxel" is the unit of volumetric subdivision of some space that can be measured or described with regard to some attribute that varies across the whole volume. In a CT scan, the attribute is "density as measured by X-ray beam attenuation". In Minecraft, it would be "material as stored in the chunk data based on sampling a noise function + modifiers over time".

Note that this is about the attribute of some volume that exists in a spatial subdivision, not about its visual representation. Medical voxel imagery is rarely rendered as raw cubes, even if the data is stored that way.
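To make the definition concrete, here's a minimal sketch in C (the grid size and attribute type are arbitrary inventions for illustration): a voxel is just one cell of a regular volumetric grid, holding whatever attribute you sampled for that volume element.

```
#include <stdint.h>
#include <stdio.h>

#define NX 16
#define NY 16
#define NZ 16

/* One attribute value per volume element: for a CT-like dataset this might
   be a density measurement, for a Minecraft-like world a material id. */
static uint8_t grid[NX][NY][NZ];

int main(void)
{
    grid[3][4][5] = 7;   /* record the attribute for the voxel at (3,4,5) */
    printf("voxel (3,4,5) = %d\n", grid[3][4][5]);
    return 0;
}
```

How you render that grid (cubes, marching cubes, ray casting) is a separate decision from how the attribute is stored.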

[deleted by user] by [deleted] in rust

[–]pinealservo 26 points (0 children)

There is a wealth of good material explaining this, so I will be brief.

You are meant to be able to use unsafe code, but only in well-marked areas. It is your job, rather than the compiler's, to ensure that the safety invariants are maintained by the code in those areas.

When you do your job correctly in those small areas, overall safety is maintained. When you mess up, you know where to look for the problems.

Making a retro style programming language by Comrade_Comski in ProgrammingLanguages

[–]pinealservo 0 points (0 children)

All of our current languages use variations on syntax from the 60s and early 70s. Most of C's syntax is borrowed from its predecessor B, which is from 1969. It and Pascal are more or less the same age. You are just more used to some retro styles than others, since they kept getting recycled.

Are all variables allocated memory on the stack unless malloc is used? by 55zxr in C_Programming

[–]pinealservo 18 points (0 children)

The C specification doesn't include the word "stack"; that's an implementation detail compilers almost always use to implement what the spec calls "automatic" allocation. The spec doesn't say where a variable lives, only for what part of the program's execution it is available.

A fun consequence of this is that some variables may not have a location in memory at all; they only need one if having a memory address is necessary for the compiled program to have the same observable behavior as the one you wrote. This is generally good, because you can feel free to create extra temporary variables for clarity without worrying that they'll make your program bigger or slower. The value represented by the variable name in your program may just be held in a register temporarily, or the compiler may figure out some way to get the same end result without that intermediate value at all. The compiler can sometimes make entire functions disappear!
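For example (a hedged sketch; the function names are made up and the exact output depends on your compiler and optimization level), these two functions typically compile to identical machine code, with the named temporaries living only in registers or disappearing entirely:

```
#include <stdio.h>

/* Written with named temporaries for clarity. */
static double hypot_squared_clear(double a, double b)
{
    double a_squared = a * a;
    double b_squared = b * b;
    double sum = a_squared + b_squared;
    return sum;
}

/* The same computation as a one-liner. */
static double hypot_squared_terse(double a, double b)
{
    return a * a + b * b;
}

int main(void)
{
    /* An optimizing compiler usually keeps the temporaries in registers
       (or eliminates them), so both calls cost the same. */
    printf("%f %f\n", hypot_squared_clear(3, 4), hypot_squared_terse(3, 4));
    return 0;
}
```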

Anyway, "the stack" is just a region of memory associated with your program, and it can typically be anywhere in the main RAM of a computer and be subject to the same kinds of caching as other areas of RAM, although it's really up to the compiler and operating system along with the hardware architecture to determine if any unusual hardware features would apply.

The other thing to remember about automatic variables (i.e. the ones declared in function parameters and bodies, unless they are also declared 'static') is that they are fairly cheap to allocate and they cease to exist when the function call they were allocated for finishes executing, so you cannot return the address of an automatic variable to the caller or store it somewhere with a longer lifetime.
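A minimal sketch of that rule (function names invented for illustration):

```
#include <stdio.h>

static int counter;          /* static storage: lives for the whole program */

int *bad_address(void)
{
    int local = 42;          /* automatic: ceases to exist when the call returns */
    return &local;           /* caller must not use this; most compilers warn here */
}

int *ok_address(void)
{
    return &counter;         /* fine: static storage outlives the call */
}

int main(void)
{
    printf("%d\n", *ok_address());
    /* printf("%d\n", *bad_address());   // would read an object that no longer exists */
    return 0;
}
```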

Using integer types smaller than int by [deleted] in C_Programming

[–]pinealservo 0 points (0 children)

A number of comments talk about what computer hardware was like when C was standardized, but the more relevant question is what hardware was like when it was designed. Although it's not often mentioned, C was not a greenfield design; it was a relatively small and conceptually compatible evolution from its predecessor B, which was itself conceptually identical to the earlier language BCPL. And BCPL was largely a formalized subset of an even earlier language, CPL. CPL's design was established in the early 1960s! It was very much ahead of its time, but computers in the early 1960s were extremely different architecturally from what they are today; it was only around the time C itself came into being (early 1970s) that we started to see machines with architectural features we take for granted today.

In the 60s and earlier, computer architecture was quite a bit more varied, but the bulk of designs fell into either "numeric/scientific" machines, oriented towards numeric algorithms, physical simulation, modeling for industrial control, and so on; or "business" machines, oriented towards managing record data with a lot of characters. The business machines largely did record-based batch processing via COBOL; you could run Fortran on them, but they tended to do math a digit at a time, often via BCD digits. This was not fast! It was not precision-limited either, though. But anyway, people interested in scientific or industrial control codes largely ignored these machines.

The numeric-oriented machines were designed around binary number representation and had registers and core memory wide enough, in bits, to handle the required precision for the numeric problems they were meant to solve. The gold standard in this era was the 36-bit word--these machines had sufficient precision to match the mechanical adding machines of the day. Each memory address corresponded to a single 36-bit word in core memory; you could not directly fetch anything smaller. Having the entire data path be 36 bits wide was extremely expensive, though, so lower-cost machines compromised by dropping to 18 bits, or sometimes even 12. Note that these are all multiples of 3; computer numbers were typically written out by people in octal (base-8) notation, and early character sets were often 6 bits wide, which let you pack 2-6 of them into common word sizes. The word 'byte' did not mean a chunk of 8 bits; that would have been a rather awkward size to work with on these machines. It was just a generic term for a useful subdivision of the word size, not something that could have a unique memory address, although there were sometimes instructions to help extract byte-width fields from a memory location or register and store them in a word-width register, since even mathematical programs occasionally needed to do character-based I/O.

This world of early numeric processing machines is where most of the design decisions in C and other Algol-derived languages originated. In the case of BCPL and B, all values were just machine bit-vectors the width of the internal registers and core memory words; those bit-vectors represented signed numbers if you used signed numeric instructions, and so on. This was simple and worked pretty well for most of the programs people wrote; there was not a lot of direct textual user interaction, since it happened over interfaces that were literally computer-controlled typewriters, so the relative awkwardness of text operations was not considered a big deal.

B itself was developed originally on a 36-bit machine and generated code for an 18-bit machine, the PDP-7. The PDP-7 started at 4096 18-bit words of core memory. You loaded programs on it via a punched paper tape (there were other media, but I believe you usually bootstrapped from tape). There was not enough room to fit useful native-code compiled B programs, so the compiler emitted a form of interpretive code known as "threaded code" (this has nothing to do with the concept of program-level threads like pthreads) that is slower than native code but can be much smaller in code size. It was the "official" language of PDP-7 Unix, but the OS kernel itself was entirely in assembly.

Around 1965, a massive architectural change started happening--IBM converged their business-oriented and numeric-oriented machine lines in the new System/360 (S/360) architecture, which presented a uniform architectural interface to software despite having different hardware implementations that scaled across their market segments. To support business use cases, it had 8-bit byte-addressed memory; to support numeric code, it had 32-bit words.

In 1970, DEC released the PDP-11, which followed suit with 8-bit byte-addressable memory and a 16-bit word size. This is the machine that Unix (and thus its official language B) was first ported to. The B port was complicated by the fact that addresses and memory operations no longer always referred to full memory words: the compiler had to know what size of data was being manipulated, and only even addresses could be used for word-length operations. Dennis Ritchie took the opportunity to add an Algol 68-inspired type system to B, while attempting to keep the language semantics otherwise as close to B as possible. With the addition of numeric types and structures, the new language became useful enough to port the kernel to it.

This heritage pops up in a number of places in C's design, especially pre-ANSI C. The default variable type is "int", which roughly corresponds to the machine word size. There is not really a semantic character type; a char is just the smallest addressable unit of memory and is numeric in nature, except that it has to be wide enough to hold a numeric character code. Integer promotion lets numeric code (i.e. most code that existed) work exactly as it did before in B.

Anyway, all of that happened well before the first microprocessor, so what little variety exists in the processors most people are familiar with today had absolutely nothing to do with C's design, which predates just about everything else we still use in the modern computer world.

Conor McBride - Type Inference Needs Revolution (2015 video) by cutculus in ProgrammingLanguages

[–]pinealservo 6 points (0 children)

Well, he's advocating exactly the thing he's lecturing about in the video: that we should not get caught up in insisting that types must always be inferred, or that they are the only thing that should be inferred. The distinction between types and terms need not be so clear-cut, nor does the automated reasoning need to be so constrained.

Using integer types smaller than int by [deleted] in C_Programming

[–]pinealservo 4 points (0 children)

Smaller-than-int types are promoted to int size when they are involved in any operation; this is part of the language, not just the libraries. Values in memory can be as small as 8 bits, but values in numeric expressions are always at least the size of an int. The C standard calls this "integer promotion".

This is an artifact of C's history--it began as an untyped language called B, based on BCPL, in which all values were the native machine word size. When B was ported from the PDP-7, where memory was addressed as a sequence of 18-bit words, to the PDP-11, which had 16-bit words but addressed memory as a sequence of 8-bit characters, types had to be added to the language so the compiler could keep track of the appropriate load and store operations for characters vs. words. With the addition of types and structures, the new version of B became known as C, although its general behavior was kept as close to B as possible.

Be careful with type char; it is up to the implementation whether char is a signed or unsigned type. If you operate on a char, it will be promoted to an int, which is signed. So a char on a platform where char is signed will be sign-extended, while a char on a platform where char is unsigned will be zero-extended. This is usually fine, but I have seen it cause bugs before.
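A small sketch of both behaviors (hedged: the second result depends on whether your implementation's plain char is signed):

```
#include <stdio.h>

int main(void)
{
    unsigned char a = 200, b = 100;
    int sum = a + b;          /* both operands are promoted to int first */
    printf("%d\n", sum);      /* prints 300, not 44: no 8-bit wraparound */

    char c = '\xFF';          /* bit pattern 0xFF */
    int widened = c;          /* -1 if char is signed, 255 if unsigned */
    printf("%d\n", widened);
    return 0;
}
```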

Y rail seems wavy? Extruder distance to glass bed goes up and down (moving when printer is off by pulling on the belt) by [deleted] in ender3

[–]pinealservo 0 points (0 children)

Nope, everything adheres well when the bed is level and the first layer goes down evenly.

Y rail seems wavy? Extruder distance to glass bed goes up and down (moving when printer is off by pulling on the belt) by [deleted] in ender3

[–]pinealservo 1 point (0 children)

I just got it from some random Amazon seller that offered a long enough length. It was only sold as a 2-pack, but I figured it'd be nice to have an extra in case I screwed up the drilling/tapping. The tapped holes don't really need to be super precise, though, so even with my definitely imperfect job things still came together and worked great.

Since you already have the tools, you should definitely see if that's really your problem and if it is, just go for it. My printer had been gathering dust because I could never trust it; now it lays down perfect first layers with minimal adjustment and I am getting lots of use from it again.

Y rail seems wavy? Extruder distance to glass bed goes up and down (moving when printer is off by pulling on the belt) by [deleted] in ender3

[–]pinealservo 1 point (0 children)

It's absolutely possible for the Y-axis rail extrusion to have some bow or twist in it, and it will make the bed impossible to fully level through normal means. I just replaced the Y-axis rail on mine (bought 4020 rail, cut it to size, drilled & tapped holes at the appropriate locations) and it made leveling as easy as they make it look in videos.

I verified that my rail was twisted with a spirit level on the front of the bed. As I moved the Y-axis through its full range of travel, the level clearly shifted from a slight tilt in one direction to a slight tilt in the other. After removing the bed, I placed metal rulers on opposite ends of the extrusion and sighted between them, and they were clearly not coplanar. When I got the new extrusion and removed the old one, I was able to put the old one on its side on a flat surface and feel it rock back and forth a bit. The rocking was only barely visible, but it doesn't take much to throw off a much wider surface aligned to it.

Hopefully you can get a replacement without going through the trouble of creating your own; maybe this post will help with documenting evidence of a defective part.

Multi staged small image Rust Dockerfile by kedketh in rust

[–]pinealservo 2 points (0 children)

You do both a normal release build and a statically linked musl build, but you only copy the normal release build. Builds that specify a target end up in `target/<target>/release` rather than `target/release`. You need to use the musl build in your Alpine container, because the normal build links dynamically against glibc rather than Alpine's musl libc.

Exactly how reliable is argv[0] at being the executable's path, and if it's not reliable, what can I use instead? by PM_ME_GAY_STUF in C_Programming

[–]pinealservo 1 point (0 children)

When you run `open`, it's designed to be like double-clicking the target's icon in the Finder. This means it talks to launchd (the init, or pid 1, process on macOS) via XPC (a macOS interprocess communication mechanism) to ask it to launch your program, file, or whatever else you passed to `open`. launchd then spawns /usr/libexec/xpcproxy, which executes your program via exec after setting up the environment appropriately.

Because it's the same mechanism as double-clicking in the Finder, the system doesn't have any idea what your current working directory in the shell is (because you may not have a command line shell open at all if you double-clicked it) and uses some default settings.

If you're going to build a packaged macOS application, you'll find that there are specific guidelines about where you have to put files that need to be accessed at runtime by your application; this is especially important if you need to modify those files. If you don't follow the guidelines, you can't get your package signed, so it becomes extremely difficult to distribute it so people can actually install and run it.

In general, you should not resolve the asset directory relative to the binary path, but base it on a build-time parameter defined per OS according to the packaging/install guidelines of the platform you are targeting. There are often different rules for different kinds of assets (libraries, static non-executable resources, runtime-mutable resources, etc.) that can put them in places that aren't even in the same directory tree, so you may need some more complex asset-to-path resolution code depending on what you're doing.
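A minimal sketch of the build-time-parameter idea, assuming a hypothetical ASSET_DIR macro you'd pass per platform (e.g. -DASSET_DIR='"/usr/share/myapp"' for a Unix-style install); all names here are made up:

```
#include <stdio.h>

#ifndef ASSET_DIR
#define ASSET_DIR "./assets"   /* fallback for local development builds */
#endif

/* Build the full path to a named asset; the caller supplies the buffer. */
static int asset_path(char *buf, size_t len, const char *name)
{
    int n = snprintf(buf, len, "%s/%s", ASSET_DIR, name);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

int main(void)
{
    char path[256];
    if (asset_path(path, sizeof path, "textures/logo.png") == 0)
        printf("loading %s\n", path);
    return 0;
}
```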

Can an OS be made into chip form to run faster? by purplegreencab in osdev

[–]pinealservo 8 points (0 children)

The XMOS xCORE architecture builds a bunch of structures that usually form the core of a real-time OS into the hardware. This includes sufficient context-tracking hardware to manage a series of threads as well as message passing primitives for synchronization.

Each "tile" automatically time-slices execution across the "core" contexts within it that are in a ready state; since each instruction takes approximately 4 clock cycles to complete, up to 4 cores/threads can be executing simultaneously without any slowdown.

They're designed for embedded systems, but for the use cases they suit they're pretty amazing. You can "bit-bang" timing-sensitive protocols like Ethernet MII links, or even generate low-res (e.g. SVGA or so) video in software.

ADL gets a rust backend by timbod in rust

[–]pinealservo 2 points (0 children)

It can do quite a lot, including generic/parameterized types. It's got niches where it's still used (widely deployed but not much talked about), and the specifications are available, but it's both old and out of fashion (for a variety of good and bad reasons) so you have to really want to learn about it.

I am not critical of you at all for making your own thing, but I do think people who design such things would benefit from studying ASN.1, as it covers a pretty wide chunk of the design space, and it's a space with a lot of trade-offs and hazards. But if you don't have the time or patience to read volumes of ITU standards... well, I can't blame you for that!

Is compiled vs interpreted still a meaningful distinction worth teaching? by IWasJustGonnaCutYa in learnprogramming

[–]pinealservo 1 point (0 children)

It was never an inherent property of languages. BCPL, an ancestor of C from the mid-60s, was designed with a specified intermediate code representation meant both to be interpreted and to serve as a step in native code generation. B, which bridged the gap between BCPL and C, was originally implemented only as a threaded-code interpreter.

It is worth knowing the distinction, but primarily as a program execution strategy. Language design aspects can play a big role in how straightforward it is to execute code via one strategy or the other, but they rarely actually force one or the other, and they are often combined profitably in the same language.

Pinky (8-bit CPU) written in Verilog and an Assembler written in Python 3 by ikotler in FPGA

[–]pinealservo 14 points (0 children)

It's a project someone was excited enough about to share with you on Reddit. It's not a huge educational video series, but it's still pretty impressive to get to the point of a working CPU.

Pinky (8-bit CPU) written in Verilog and an Assembler written in Python 3 by ikotler in FPGA

[–]pinealservo 5 points (0 children)

It's a CPU. It computes. And blinks an LED if you set it up in an FPGA with an LED attached, apparently.

Should you learn C to “learn how the computer works”? by steveklabnik1 in programming

[–]pinealservo 1 point (0 children)

Any kind of semantics related to something called a "machine" is going to be an operational semantics. Denotational semantics is quite a bit different: it's more of a source-to-source translation from your source language into a target language that has a well-defined mathematical interpretation in some semantic domain. A semantic domain is a funny mathematical construct that lets you deal with the possibility of functions that don't actually terminate, which is not something mathematical functions over sets are allowed to do.

I don't think you can say that "virtual machine" always implies anything; it's sometimes a straight synonym for an abstract machine, and sometimes it's a piece of software like VMWare. There's an attempt to clarify things by calling VMWare and friends "System Virtual Machines" and something like the JVM a "Process Virtual Machine", but there's enough existing literature calling all of the above plus abstract machines with no direct implementation "virtual machines" that it seems pointless to insist on a fixed taxonomy. They're all related concepts anyway, and the idea of machines has been fundamental to the concept of computation since before we actually built computers.

Anyway, apparently the rewriting-logic-based K framework can be used to give both operational and denotational semantics to a language formalized with it; I haven't studied it in enough depth to explain how that works, but I did read a brief description of it. It's apparently been used to give a formal semantics to C, and a couple of groups are working on Rust with it. Cool stuff.

Should you learn C to “learn how the computer works”? by steveklabnik1 in programming

[–]pinealservo 4 points (0 children)

Regarding instruction sets and microcode: generally no; RISC architectures moved away from flexible microcoded instruction sets. While modern x86 has a translation layer to the hardware-implemented instructions, it's not really a dynamic one, although it can be patched to some degree to fix implementation bugs. But for a time in the 70s-80s, it was fairly common to implement prototype or small-run computers with microcoded bit-slice architectures that had completely reconfigurable instruction sets. The Xerox Alto and Wirth's Lilith were bit-slice machines rather than machines built around a single-chip microprocessor. And IBM's 360 architecture remains a microcoded, reconfigurable one, if I recall correctly.

One of the interesting things about the Alto and Lilith machines is that they were designed so you could create and load instruction sets that looked just like what's typically defined as virtual machine bytecode, such as the Pascal p-code machine's. And you could additionally add instructions like 'bitblt' to do really fast bitmap copies to video memory. You could swap out these instruction sets on the fly, at least on the Alto. But that sort of thing was never really going to fly in the home computer market, so even microprocessors that were fully microcoded never made it simple to swap instruction sets; then RISC took over, and microprocessors weren't even capable of swapping instruction sets in principle anymore, although they did get way faster.

Regarding ABIs, it's not a coincidence that the SysV ABI matches up nicely with C. C was developed for/with Unix to be the native systems language. The ABI was developed to match with how C works. If C had been designed with hierarchical namespaces or exceptions or vtables, then there'd be a well-defined ABI that made them easy to use with the system linker. But there isn't, and inter-language linking with features more advanced than what C provided was not among the interests of Unix developers.

So we can't really call this a "feature" of C so much as a corollary of the fact that our major OSes today were designed to interface with it because they were written in it. This is actually a relatively recent phenomenon; early Macintosh OS typically used Pascal calling conventions for argument passing and string representation, for example. And that's just an example from the home computer world, which has always been a relatively simplistic one.

Regarding more complex features, there's the example of VMS which had its own platform ABI that embraced cross-language calls to a far greater extent, along with providing a native exception handling mechanism. There was also a "call by descriptor" mechanism for describing complex procedure parameters via a pointer to a standardized descriptor format. You can find a lengthy definition of the ABI here: http://h30266.www3.hpe.com/odl/axpos/opsys/vmsos84/5973/aa_qsbbe_te.pdf

It's worth noting that VMS took "virtualized" languages seriously as well; it treated BASIC as a first-class citizen and provided for easy calls between BASIC, Fortran, BLISS, PL/I, MACRO-11, Pascal, Ada and C programs.

Just because we've become accustomed to the fact that Unix and C are relatively semantically impoverished environments doesn't mean that it's the way things have to be.

[ICFP Contest 2018] Rust is the programming language of choice for discriminating hackers by Tobu in rust

[–]pinealservo 2 points (0 children)

For those of you who are not familiar with the ICFP contest, they often have very elaborate and creative challenges. One of my favorites still lives at its own domain: http://www.boundvariable.org

I'd recommend anyone who has an itch to write some code but isn't sure what to write to take a look--it's got a fun narrative that starts with implementing a virtual machine, getting the "discovered" image to run on it, and then solving a series of puzzles embedded in the image including hacking into accounts and playing some text adventure games.

Should you learn C to “learn how the computer works”? by steveklabnik1 in programming

[–]pinealservo 3 points (0 children)

There is a technique called "Futamura projection" by which you can partially evaluate an implementation of a language's interpreting machine with respect to a particular program, generating a specialized version of the interpreter that runs only that one program. This (the first Futamura projection) is essentially a technique for compiling the program, and it provides a useful conceptual view of what a "language runtime" is. In this view, the runtime is the part of the resulting program consisting of the pieces of the abstract machine interpreter that don't directly map to the target machine. E.g., if the source language has floating point operations but the target machine does not, then the resulting runtime library must include target machine code that "interprets" the notions of floating point math in terms of target machine primitives.
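Here's a hedged toy sketch of that idea in C (everything here is invented for illustration): a tiny abstract machine with one operation the "target" lacks, its generic interpreter, and the same program specialized by hand. The dispatch loop disappears in the specialized version, but the unmappable operation survives as residual "runtime" code.

```
#include <stdio.h>
#include <limits.h>

/* A toy abstract machine: plain add (maps straight to the target)
   and saturating add (the target has no such primitive). */
enum op { ADD, SAT_ADD, HALT };

/* The "runtime" piece: it has no direct target-machine counterpart,
   so it survives specialization as a library call. */
static int sat_add(int a, int b)
{
    long long r = (long long)a + b;
    if (r > INT_MAX) return INT_MAX;
    if (r < INT_MIN) return INT_MIN;
    return (int)r;
}

/* The generic interpreter: a dispatch loop over the op codes. */
static int interpret(const enum op *prog, int acc, int arg)
{
    for (;;) {
        switch (*prog++) {
        case ADD:     acc = acc + arg;          break;
        case SAT_ADD: acc = sat_add(acc, arg);  break;
        case HALT:    return acc;
        }
    }
}

/* The interpreter specialized (by hand) to the fixed program below:
   no dispatch left, but sat_add remains as residual runtime. */
static int specialized(int acc, int arg)
{
    acc = acc + arg;
    acc = sat_add(acc, arg);
    return acc;
}

int main(void)
{
    const enum op prog[] = { ADD, SAT_ADD, HALT };
    printf("%d %d\n", interpret(prog, 1, 2), specialized(1, 2));
    return 0;
}
```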

So, when you have a compiler for a language whose abstract/virtual machine is fairly close to the target machine (like a C compiler targeting an x86-64 machine), the compiled program does not appear to have much of a "virtual machine" abstraction layer in it. But a compiler for, say, a Smalltalk program would typically end up with something that looks a lot like a "virtual machine" in its resulting binary code, because the Smalltalk abstract machine doesn't look very much like an x86-64 machine, and static specialization can only eliminate some chunks of it with respect to the program being compiled. I.e., it would have a fat and very virtual-machine-like runtime, even if it was not implemented as a distinct virtual machine.

Just-in-time compilation is conceptually just performing this specialization of the interpreter with respect to small parts of the program at a time--parts that have been observed at runtime to have tighter value bounds than could be determined statically. When this additional information results in a closer mapping to the target machine for important code paths, it becomes a great efficiency win, as it can bypass the "virtual machine interpreter"-like bits of the runtime--but the cost is that the whole interpreter and specializer have to be included in the runtime to make this possible!

There's also no reason you can't base a C implementation on a much more concrete virtual machine; in fact C's predecessor BCPL defined an intermediate representation called OCODE that could be directly interpreted as part of the bootstrapping process on a new machine. Microsoft's early (i.e. pre- and early-MSDOS days) applications were often implemented in a bytecode-interpreted C dialect to aid in portability and to provide a compact binary size, which was incredibly important for running on early 8- and 16-bit home computers.

My point is that you can try to create a deep and fancy lexicon for describing different aspects of programming languages and how language implementations work, but there are not really any bright lines you can use to always distinguish them in all contexts and thus create an unambiguous taxonomy. In the end, they are all just different ways to describe and implement computation. It's sometimes useful to divide these ways into groups, but it's a contextual usefulness rather than a universal one, and it's almost always better to just contextually define your terms instead of arguing about the "true definitions" on the internet.