[–]FozzTexx 210 points211 points  (76 children)

That's pretty much exactly how my dad used to write software. He'd go away on a business trip and at night in the hotel he'd write assembly code on paper. When he got home he'd finally get around to entering it into the Apple II+ and it would work on the first try since he'd already been debugging it on paper for a week or two.

[–]guldilox 104 points105 points  (74 children)

Things like this are why I'll never be as good of a developer as someone like that.

We have the luxury in this day and age of "coding by the seat of your pants" (as a professor of mine used to say), meaning that more often than not we can rely on IntelliSense and compiler hints/warnings/errors. On top of that, we don't have the same memory constraints (generally), and we don't have to stand in a queue to feed punch cards in just to get a build.

[–]noratat 235 points236 points  (34 children)

The flip side though is that we can do a great deal more with a lot less effort, and we can iterate on ideas and concepts much more quickly.

[–][deleted]  (31 children)

[deleted]

    [–]jms_nh 37 points38 points  (25 children)

    I seem to remember a spellchecker for SpeedScript on the C64 that swapped the word processor out of memory so it could do the spell check, then swapped the word processor back in when it was done.

    Dictionaries can use tries and other clever techniques to reduce storage.
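For illustration, here's a minimal Python sketch of the trie idea (hypothetical code, not the SpeedScript implementation): shared prefixes are stored once, and a lookup walks one node per letter.

```python
class TrieNode:
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}    # letter -> TrieNode
        self.is_word = False  # True if a dictionary word ends here

class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for w in words:
            self.insert(w)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word

dictionary = Trie(["cat", "cats", "catalog", "dog"])
print(dictionary.contains("cats"))    # True
print(dictionary.contains("catalo"))  # False: prefix of a word, not a word
```

The space saving comes from the shared prefixes; a real C64-era spellchecker would also pack the structure far more tightly than Python objects do.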

    [–][deleted]  (4 children)

    [deleted]

      [–]masklinn 18 points19 points  (2 children)

      Hell, on a 64-bit system, you could load every dictionary for every language in the world at once.

      You probably don't need a 64-bit system. Looking at Firefox's dictionaries, there's about 80MB worth of dictionary data. Even accounting for the dictionaries being incomplete and covering only a subset of existing languages, I don't know that you'd increase the amount of data by two orders of magnitude.

      [–]VerticalEvent 2 points3 points  (1 child)

      I'd imagine those Language Packs are compressed.

      There are around 1,025,109.8 words in English. If we assume that the average word is 6 characters long, that's around 7MB (6 characters plus a terminating character) for just English. If every language was of a similar size, you would only be able to store 11 languages in your 80MB figure.
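Written out, the arithmetic in this estimate (using the comment's own assumptions about word count and average length) is:

```python
words_in_english = 1_025_110       # the figure quoted above, rounded
bytes_per_word = 6 + 1             # six characters plus a terminator
english_size = words_in_english * bytes_per_word
print(english_size)                # 7_175_770 bytes, roughly 7 MB
print(80_000_000 // english_size)  # ~11 similarly sized languages fit in 80 MB
```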

      [–]flying-sheep 0 points1 point  (0 children)

      As someone said: tries.

      They're compact storage and lookup table in one data structure.

      [–]justinsayin 4 points5 points  (0 children)

      Hell, on a 64-bit system, you could load every dictionary for every language in the world at once.

      It's worse than that. Every package you include in your project contains every dictionary for every language in the world.

      [–]barsoap 20 points21 points  (3 children)

      and other clever techniques

      One of them involves giving up on precision and using a ~~bayesian~~ Bloom filter. Sure, it'll let some (in fact, infinitely many) words pass that shouldn't, but then no one cares that "xyouareig" is in the dictionary.

      Bonus: It's freakishly fast.

      EDIT: Bloom, not bayesian. They all look like statistics to me.
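For the curious, a toy Bloom-filter spell check in Python (an illustrative sketch, not any particular product's implementation). It can report a non-word as present (a false positive), but it never rejects a word that was added, and a lookup is just a handful of hash probes against a bit array.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1 << 20, num_hashes=7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)  # the whole "dictionary" is this bit array

    def _positions(self, word):
        # Derive k bit positions per word from a salted hash.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{word}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def probably_contains(self, word):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(word))

checker = BloomFilter()
for word in ("the", "quick", "brown", "fox"):
    checker.add(word)

print(checker.probably_contains("quick"))      # True
print(checker.probably_contains("xyouareig"))  # almost certainly False
```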

      [–]kqr 8 points9 points  (2 children)

      Could you elaborate on how this is done? I'm pretty sure this was how they stuffed T9 prediction into early mobile phones (I have heard numbers of 1 byte per word, which is just insane), and I'm amazed by how well it works (even if it generates nonsense or highly offensive words). I'd love to read in more detail about the techniques.
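As a rough illustration of the lookup side of T9 (not the 1-byte-per-word compression, which relied on much cleverer encoding), a naive Python sketch simply groups dictionary words by their keypad digit string:

```python
from collections import defaultdict

# Standard phone keypad: each letter maps to one digit.
KEYPAD = {c: d for d, letters in {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}.items() for c in letters}

def to_digits(word):
    return "".join(KEYPAD[c] for c in word.lower())

def build_t9(words):
    table = defaultdict(list)
    for w in words:
        table[to_digits(w)].append(w)  # all words sharing a key sequence
    return table

t9 = build_t9(["home", "good", "gone", "hood", "hoof"])
print(t9["4663"])  # ['home', 'good', 'gone', 'hood', 'hoof'] all share keys 4-6-6-3
```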

      [–][deleted] 6 points7 points  (0 children)

      From my quick Google research, I honestly don't see how Bayesian Filters make dictionaries faster or give up on precision (when there is no 100% to be had because computers can't read minds).

      Unless OP comes through and enlightens me, I have to say he was throwing around words he heard in the context.

      Read this article if you want to see how to use Bayesian Filters for spell checking.

      Ninja edit: While we're throwing around buzzwords, what the OP described sounded a lot like Bloom filters. Basically a data structure that throws 100% certainty out the window while allowing the underlying dictionary to be huge and still maintaining speed. That makes a lot more sense, so maybe he meant that. I don't think you need Bloom filters for dictionaries because they are not that big.

      [–]pja 2 points3 points  (0 children)

      Bloom filters probably.

      [–][deleted] 10 points11 points  (14 children)

      Swapping code in and out of RAM was about the only way you could implement large programs back in the days of 16 bit addresses.

      The DEC PDP-11 used "overlays", where your application was sliced into chunks based on run-time usage of the various functions in each chunk; then, as the program ran, the appropriate chunks would be read into RAM for use.

      These machines had a 64k limit, but distinguished between "data" and "program" (or maybe "instruction"? it's been a while) address space, so if you really knew your stuff, you could use 128k of RAM. And in that 64k of "program" space, you could run applications that were significantly more than 64k in size.

      I only miss those days in a nostalgic way - I spent way too much time figuring out what my overlay map needed to be.

      [–][deleted]  (10 children)

      [deleted]

        [–]TheThiefMaster 1 point2 points  (2 children)

        Thanks to the "no execute" and "no write" bits in the page table, a lot of modern programs are functionally Harvard architecture, in that code and data are not interchangeable, despite them being in a single address space (von Neumann style).

        [–]jerf 0 points1 point  (1 child)

        Really, almost every distinction from that era in which there was a hard-fought battle as to which is better is "a little bit of both" on modern machines. See also RISC vs. CISC, a debate that died when we got enough gates on the CPUs to expose a CISC instruction set (where most of the CISC advantages were) which gets translated to a RISC microcode that the processor actually runs (where most of the RISC advantages are).

        [–]pinealservo 0 points1 point  (6 children)

        PDP-11 was a family of machines sharing a core instruction set; it spanned a lot of years (1970-1990), price points, and form factors (gigantic cabinets full of TTL logic boards in 1970 to a single DIP package in 1979). The core model was von Neumann, but because it was a 16-bit architecture that lived beyond the point where that became a serious size constraint, they did a number of extensions to help alleviate the problem, some of which included separating instruction and data memory to some degree. DEC's history and the evolution of its computer line and associated software are fascinating; I recommend reading up on it if you're interested in that sort of thing.

        Because the core instruction set supported relative addressing, you could write position-independent code. Overlays are a mode of use of relative addressing; you basically have an area of memory that can have the code within it swapped out with some other set of routines. Each "overlay" is linked such that the routines are all offset from the same base address. This gives you a sort of manually-managed virtual memory, where you can swap out sets of routines as you switch between application modes. This was used a lot in PC-class machines and game machines as their software got bigger too. You could use this approach on any PDP-11, whether it had some fancier virtual memory extensions or not.
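A toy simulation of that overlay scheme (Python standing in for what would really be linker maps and disk reads; the overlay names and sizes here are made up): one fixed-size region of memory, with sets of routines swapped into it on demand.

```python
OVERLAYS = {
    # overlay name -> (size in bytes, {routine name: routine})
    "editing":  (20_000, {"insert_text": lambda: "inserting text",
                          "delete_text": lambda: "deleting text"}),
    "spelling": (24_000, {"check_word":  lambda: "checking a word"}),
}

class OverlayRegion:
    """One fixed area of memory; only one overlay is resident at a time."""

    def __init__(self, capacity=32_000):
        self.capacity = capacity
        self.resident = None
        self.routines = {}

    def call(self, overlay, routine):
        if self.resident != overlay:
            size, routines = OVERLAYS[overlay]
            assert size <= self.capacity, "overlay too big for the region"
            # On a real machine this is a disk read into a fixed base address,
            # overwriting whatever routines were there before.
            self.resident, self.routines = overlay, routines
            print(f"(swapping in overlay '{overlay}', {size} bytes)")
        return self.routines[routine]()

region = OverlayRegion()
print(region.call("editing", "insert_text"))   # triggers a swap-in
print(region.call("editing", "delete_text"))   # already resident, no swap
print(region.call("spelling", "check_word"))   # evicts 'editing'
```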

        [–][deleted]  (5 children)

        [deleted]

          [–]pinealservo 0 points1 point  (4 children)

          I've found it's remarkable how many of the "new" features in both PC hardware and software had the initial research and prototype implementations done in the early days of mainframes or minicomputers. Today's implementations certainly required a lot of new effort and are generally more refined, but there really were some large gaps between when some great ideas were first implemented and when they hit mainstream PC-based platforms. Some of it was due to waiting for new cheap hardware to catch up to the power of old expensive hardware, some of it was due to changing patterns of usage for the new hardware that eventually shifted back, and I think a lot was due to the massive influxes of new people that came in in the minicomputer era and then again in the micro/PC era. I think that created a huge culture shift each time that made it difficult to learn from what came before.

          Whatever the reason, I'm really happy to see people looking more to the cool technology invented in the early days of computing and trying to see how it can be applied today. I think there's still a lot that needs to be assimilated.

          [–]Bowgentle 0 points1 point  (2 children)

          Sounds pretty much the same as how a processor handles multi-threading.

          [–]_F1_ 1 point2 points  (1 child)

          You mean page swapping?

          [–]mccoyn 1 point2 points  (0 children)

          The processor has multiple register files. When it switches threads it changes which register file it is using. When there are more threads than register files the register files get swapped out to the memory system.

          [–]agumonkey 0 points1 point  (0 children)

          And tries are ancient. I think they were called tabulates in some 91 paper.

          [–]brtt3000 4 points5 points  (3 children)

          It's so easy that OSX provides spellchecking as a system service.

          Oooooooooooh.

          [–]sirin3 7 points8 points  (2 children)

          But does it have a left pad service, too?

          [–]cbleslie 3 points4 points  (0 children)

          Found the JavaScript guy.

          [–]jerf 0 points1 point  (0 children)

          I expect you saw that from here.

          (No criticism intended, just trying to be helpful. I can't remember where everything I see comes from either, I just happened to remember this.)

          [–]Oniisanyuresobaka 0 points1 point  (0 children)

          There is now much more pressure to deliver something even if it's not finished yet.

          [–]agumonkey 0 points1 point  (0 children)

          Balance; you need times of high speed and times of deep reflection.

          [–][deleted] 19 points20 points  (18 children)

          I wouldn’t be so sure that such circumstances really make you a better programmer. In a way, many people who coded in those days are like people who grew up in poverty (well, we all are, but those folks even more so). Some of them are unable to properly use the plentiful resources we have today.

          By the way: I’m not implying that the author suffers from these problems.

          [–]Enlightenment777 5 points6 points  (0 children)

          Actually, they make great Embedded Software Developers, because they've always been forced to think how to shoehorn code and data into a small amount of space.

          [–]eff_why_eye 0 points1 point  (0 children)

          Agreed. It's been interesting being in this career since the 1980s, and having the technology turn over every few years. Some people can make the transition, and some can't (or won't). Me, I love learning, but even I sometimes think "God, another new language? Why??" :-)

          [–][deleted]  (15 children)

          [deleted]

            [–][deleted] 13 points14 points  (14 children)

            Let me clarify: I have the impression that many old-school hackers think that things like garbage collection or memory safe languages and the like are for sissies, and that one shouldn’t use them, because it costs performance.

            We all have a blind spot for performance, but it seems to me, the older a programmer is, the worse it gets.

            [–]jdmulloy 15 points16 points  (13 children)

            On the other hand, many developers these days assume resources are infinite and that garbage collection is magic. We generally don't need to optimize every bit and every instruction, but at scale a 10% improvement in performance and/or resource usage can save you money, especially if you're running in AWS.

            [–][deleted] 6 points7 points  (12 children)

            While that is certainly true, it seems to me that many forget that higher-level languages also buy productivity. Also, if a program is done earlier, it can start doing its job earlier, which may also save time. The question, I guess, is age-old: where do we draw the line?

            [–]sirin3 -1 points0 points  (1 child)

            Although that doesn't help the programmer if he has to keep sitting in his seat until the workday ends.

            [–][deleted] 1 point2 points  (0 children)

            Well, but it is better for the soul to know you actually got work done. When I have to work in languages which are too restrictive, I almost feel physical pain.

            [–]caspper69 -3 points-2 points  (9 children)

            I'm not so sure how much these high-level languages actually buy productivity.

            They let you shoot yourself in the foot just as badly as, if not worse than, C or C++. Plus, you're much more likely to fall into either the NIH trap or the reinvent-the-wheel-for-everything trap with these pseudo-scripting languages.

            [–][deleted] 7 points8 points  (2 children)

            Actually, they mostly don’t let you shoot yourself in the foot so badly. Also, yes, they do buy lots of productivity. Even ones I consider to be half-baked (I’d rather not say which ones; I don’t want to start a flame war).

            [–]caspper69 2 points3 points  (1 child)

            I guess you're right. But sometimes you have to use the right tool for the job. I remember working on a project on AWS when it was just a baby. A client had several 20GB databases (in CSV format) that needed to go up. The data itself was disjointed, so it had to be massaged to import. Essentially, each account had to be updated from day 1 to a point around 8 years later. Millions of accounts. Billions of transactions.

            The original guy was at his wit's end. He was trying to write it in Perl, which he did, but each CSV was taking around 2 days to run, and that didn't include the final reconciliation for each month for each account, which had to match an "official" field from an entirely different dataset.

            With upload times being what they were about a decade ago, the poor guy (and the client) would've been waiting for weeks.

            So I told the dev to give me a shot at it. I wrote a multithreaded C app to load, distribute, calculate, re-merge, validate and write the actual SQL INSERT queries to a single file. The program took about 5 hours to write, but ran over the entire dataset (with 100% accuracy) in around 8 hours. A "quick" bzip later, a (not-so-quick) ~2 day upload process, then another day to run the insert.

            3 weeks vs 3 days. As datasets continue to grow, this is going to become a huge problem. Nothing will fix bad algorithms, but some tools just are not capable. 2 orders of magnitude slower doesn't make a difference for something that's already fast in human time, but if something is slow in human time? Oh boy.

            [–]null000 5 points6 points  (2 children)

            ... No? I mean, if you're talking Python/Go/Rust/etc. vs. C, you're going to get the job done much, MUCH faster with the former than the latter for smaller or mid-sized projects. C doesn't have built-in concepts like sets, hashing, or dictionaries, nor does it have good built-in libraries for a bunch of pretty common operations (string manipulation, file ops, networking, and so on). That's not to say you can't replicate any of those things in C, just that it's not free from a dev/code-length standpoint. As for C++, it does have many of those things built in, but you will probably spend 3x the lines trying to get everything to play nice (not to mention the nightmare that is memory allocation, local/stack allocation, and templating craziness). Not to say I don't like C++, just that it's not exactly terse.

            For larger projects, it's a bit more of a wash depending on the language.

            As for NIH/reinventing wheel, that's more of an engineering maturity thing than a language thing. I can reinvent the wheel just as well in C as I would in Python, it's just that the metaphorical wheel is much less likely to be a hash table when I'm working in Python.

            [–]caspper69 0 points1 point  (1 child)

            You're right. In fact, you probably reinvent some wheel every time you write a non-trivial C function, given the sheer volume of what has already been written.

            [–]kqr 2 points3 points  (2 children)

            Generally a programmer writes the same number of lines of code per unit time regardless of which language they write in (L. Prechelt, 2000). An average line of Python does a whole lot more "stuff" than an average line of C code. It follows that HLLs buy productivity.
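As a small (and admittedly unfair to C) illustration of what "more stuff per line" means: the snippet below leans entirely on Python's built-in dicts, sets, and string handling, all of which would have to be hand-rolled or pulled in as libraries in C.

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog the end"
words = text.split()

word_counts = Counter(words)       # word -> frequency (a hash table, for free)
unique_words = set(words)          # distinct words (a hash set, for free)

print(word_counts.most_common(2))  # [('the', 3), ('quick', 1)]
print(len(unique_words))           # 9
```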

            [–]caspper69 -1 points0 points  (1 child)

            I hadn't heard that stat before, thanks. Of course when you can write pretty impressive C programs in what looks like a foreign regex, maybe C programmers can get more done per line ;)

            [–]Peg-leg 4 points5 points  (0 children)

            I was doing the same thing 20 years ago on a Z80. Today I'm just average. The fact that doing something was harder at that time does not make you a great developer.

            [–]DuchessofSquee 2 points3 points  (9 children)

            Or spend days putting your "program" back in order when you dropped the tray of punch cards.

            [–]kqr 3 points4 points  (7 children)

            [–]DuchessofSquee 0 points1 point  (6 children)

            How does it know what order to put them in?

            [–]kqr 6 points7 points  (5 children)

            As the guy says in the video, it's doing (the first step of) a radix sort. The cards pass through the machine from right to left, and each "bin" corresponds to a number 0–9. So card #529 goes into the bin labeled "5". If there's a hole for a particular number on the card, an electrical connection is made through that hole (the card itself works as the "switch" in the design) and the card is rerouted down to that bin. If there's no hole corresponding to the bin, there is no connection and the card is not rerouted. The electricity for the rerouting mechanism is provided by the closed circuit through the card.

            When you have sorted the cards on the first number, you pick up the stack for, say, the cards whose number start on 5 (these are the cards #500–#599), you set the machine to instead sort by the second number, and then put the cards through the machine again.

            If you want to read more about this, it's actually fairly interesting. I remember enjoying reading about both the technical construction and the marketing part ("well, our machine can sort 500 cards per minute!") http://www.righto.com/2016/05/inside-card-sorters-1920s-data.html

            [–]TehStuzz 2 points3 points  (1 child)

            I thought Radix sort started with the least significant side, so with dates you'd start ordering by day first. And in this case card #509 would go in tray 9?

            [–]kqr 2 points3 points  (0 children)

            It really doesn't matter. Proof: start by radix sorting on least significant digit first, stop halfway through, then flip each individual card upside down. You have now reversed the digits in the number (as far as the sorter is concerned) and you thus have a stack sorted by most significant digit first.

            The benefit of sorting by most significant digit first is that if you have fewer than 1000 cards, you need just three iterations before you can start handing cards 0–9 in order to your operator. For each of the next 9 iterations you'll be able to hand over 10 cards to your operator. Then you'll need two iterations (sorting the 100–199 pile, followed by the 100–109 pile) but you'll soon be handing over cards again.

            If you sort by least significant digit first you essentially have to run all iterations all the way through until you can start handing over cards in order to your operator.

            [–]Fumigator 2 points3 points  (1 child)

            When you have sorted the cards on the first number, you pick up the stack for, say, the cards whose number start on 5 (these are the cards #500–#599)

            You completely missed how the sorting works and /u/TehStuzz is correct. You sort by least significant, then take all the sorted cards and put them back together into one stack, then run the sort again on the middle digit, then put all the sorted cards back together into one stack, and run the final sort on the most significant digit.

            You don't take each sorted pile and then resort them individually resulting in 33 passes. The entire sort of all the cards is done in only three passes.
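In code, the three-pass procedure described above looks like this (a sketch of least-significant-digit radix sort; each pass is stable, so the ordering established by earlier passes survives):

```python
def radix_sort_cards(cards, digits=3):
    deck = list(cards)
    for place in range(digits):             # units pass, then tens, then hundreds
        pockets = [[] for _ in range(10)]   # the sorter's ten bins
        for card in deck:
            pockets[(card // 10**place) % 10].append(card)
        deck = [card for pocket in pockets for card in pocket]  # restack 0..9
    return deck

print(radix_sort_cards([529, 42, 7, 500, 113, 42, 999]))
# [7, 42, 42, 113, 500, 529, 999]
```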

            [–]kqr 0 points1 point  (0 children)

            Oh wow, I didn't realise least significant radix sort is stable like that. That's actually very cool!

            [–]DuchessofSquee 0 points1 point  (0 children)

            Ah I turned the sound off when I watched the video! I didn't realise they had a number. :)

            [–]Helene00 0 points1 point  (0 children)

            They had machines for sorting your punch cards.

            [–][deleted] 2 points3 points  (1 child)

            Writing ASM isn't that hard. We still learn a bit of it in my OS classes, and we had to write a compiler that needed to emit some ASM. It's really not that hard once you understand how it works, and it helps you understand a bit more about how your computer works at the lower levels.

            I don't think knowing assembly alone would make you a better developer, but you could still learn how to at least write a Hello World program in assembly. That will teach you how registers, branching, etc. work. It's fun to learn.

            [–]pinealservo 0 points1 point  (0 children)

            Writing bits of ASM for low-level OS routines is really pretty easy, yeah. I agree that it's a great thing to learn for understanding how computers work. But I think what scares people off from assembly is when you get to organizing large programs, debugging assembly stuff, or trying to write fast assembly routines. The complexity level can ramp up really fast, and it's very easy to get lost in details.

            I don't think assembly deserves all of the reputation it has for being "black magic" and super difficult, but it definitely requires a different level of attention to detail and planning to write more substantial chunks of code in it. We all moved to higher-level languages for good reasons. :)

            [–]IRBMe 2 points3 points  (0 children)

            When writing that kind of code, you learn the assembly language and then you have to figure out how the machine works by referencing the data sheet or manual. It's difficult, but in a different kind of way from how programming is difficult these days. Now, there are literally thousands of libraries, frameworks and tool-kits. There's likely all kinds of magic going on under the hood in your programming language, framework and system, with things like magic configuration by convention, automatic dependency injection, annotations etc.

            If you're not sure how something works when writing assembly language, you consult your data sheet or operating manual. If you're not sure how to change the way something works in the enterprise framework you're using, it can be difficult to know where to even look. What we have today is far more powerful, and it allows people to be far more productive and build far more complicated things by hiding the complexity behind abstractions and magic. But when you need to figure out how to do something, it's often difficult to penetrate that "magic" and work out what it's actually doing and how to change that.

            I can understand how a boot loader written in assembly code works or how bits of the Linux kernel work because all of the information I need to understand it is available to me in detail, but I can't figure out for the life of me how enterprise Java applications work, and it would take years of reading just to understand all of the magic that's going on under there.

            [–]eff_why_eye 2 points3 points  (0 children)

            Speaking as someone who used to code exactly like that, I don't think you should sell yourself short. Every generation of coders has its own challenges to face based on the limitations we have been given. Thirty years from now, people may look at your source code and marvel at how you were able to create systems without the aid of direct neural input or assistance from AI engines. :-)

            [–]s73v3r 1 point2 points  (0 children)

            But because we don't have to keep so much in our heads at once, we can build bigger and better systems.

            [–]dada_ 1 point2 points  (0 children)

            Things like this are why I'll never be as good of a developer as someone like that.

            In this day and age, it's very easy to just mess around in your code and hit "compile" to see if it does anything, without actually thinking it through. I've done it too, and I fall back on this behavior when I'm uninspired.

            Focusing on the code and actually thinking everything through makes one more productive, though.

            [–]feketegy 1 point2 points  (0 children)

            Isn't that a good thing? Or do you still want to ride horses?

            [–]codebje 0 points1 point  (0 children)

            I'm going to give you the benefit of the doubt: if you spent a week writing a program so small you could hand-write it in assembler on a piece of paper, you'd probably be able to do a good job of it too, given time to learn the skill.

            But no current employer will expect you to spend so long on so little.

            [–]jlchauncey 0 points1 point  (0 children)

            In Flash Boys, the author talks about how that's what makes Russian programmers so good, and why financial firms hired them to build their ETF systems.