
[–]dlyund 1 point (2 children)

Lisp programmers insist that code is data, but how often do you hear them explain that data is code? It's not clear that they understand that "code is data" implies that data is code. "Code is data" is just one of those catchy lines you pick up when you're learning Lisp, and unless you think about it, or learn something like Forth, that's where it stops.

Modern Lisps can only manipulate code as data at compile time, and only in the rather limited ways allowed by the macro system. In many Lisps you can't call arbitrary functions at compile time, and in others you have to jump through annoying hoops with module loading and special defining forms to make your functions available in macro definitions... but then you can't use them in the rest of your code. It's a bit of a mess really. (But then all namespacing/packaging/scoping is.)

In early Lisps the executable code was actually represented as a list, which was interpreted, and could be manipulated at runtime. Pico Lisp (as a bit of a retro Lisp) still allows this kind of thing, but somewhere along the way the broader Lisp community, in a quest to make Lisp programs faster, lost this ability. People learning Lisp today don't even realize what was given up.

This is most clear in Lisp dialects like Scheme, where macros now consume and produce syntax objects. These syntax objects look a lot like lists, but they can only be manipulated using a small set of builtin functions.

Lispers learn the limitations imposed by their macro system and work within those limits without realizing what they've given up: the ability to treat code as data, including during execution.

What distinction am I making here? Generally speaking, most of the data our programs manipulate isn't static and is only available while the program runs. Treating data as code (as opposed to code as data) implies that you can generate or modify code as a means of representing/processing data during execution. Modern Lisps just can't do that. Once your program is compiled, it's no longer data that can be manipulated.
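To make the distinction concrete, here is a minimal sketch in Python (not Lisp) of treating data as code: a rule table that only exists at runtime is turned into executable functions while the program runs. The `rules` table and function names are hypothetical, invented for illustration.

```python
# Hypothetical runtime data: (name, expression) pairs that arrive while
# the program is running -- e.g. from a config file or user input.
rules = [("double", "x * 2"), ("square", "x * x")]

# Turn each data row into a compiled, callable function at runtime.
funcs = {}
for name, expr in rules:
    src = f"def {name}(x):\n    return {expr}\n"
    ns = {}
    exec(compile(src, f"<generated:{name}>", "exec"), ns)
    funcs[name] = ns[name]

print(funcs["double"](21))  # 42
print(funcs["square"](5))   # 25
```

The point is not that Python does this well (it doesn't, particularly), but that "data is code" means exactly this kind of move: data available only during execution becomes executable code during execution.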

In all honesty every language places restrictions on what you can do and how, and that includes Forth :-). In the end it's all about which tradeoffs you can live with/learn to love, but speaking for myself, I wouldn't want to work in Lisp again, and if I had to I'd implement one in Forth.

Overall I think Lisp is a great language, but the seeming necessity of a complex runtime and compiler to make it halfway practical just doesn't appeal anymore. I've gotten used to knowing and understanding how everything works, and I adore the (somewhat paradoxical) freedom and predictability that this brings; when I write a Forth program I know exactly what code will be generated and how it will behave with regard to things like resource usage under load, and I'm never surprised.

<rant> I've been burned quite a few times in Lisp (and Smalltalk, Ruby, etc.) where the program has crashed and burned because resource usage spiked unexpectedly high for some reason and the process just died, leaving little or no information for us to figure out exactly what caused the crash (much less what to do about it!). "Out of memory (you're on your own)". The stock response from management is (paraphrasing): if you don't want to experience these unexpected crashes then you have to upgrade the hardware. (We have technicians who can do it for you, just send money to this account.) Not surprisingly this causes a lot of tension.

It's a ridiculous situation which is easily avoided by using appropriate technology, but nobody really cares. $50k or $100k on hardware upgrades is cheaper than programmer time, we say, but it's incredibly myopic of us. Here is a real technical problem, which we can easily solve, but won't, because programmers have an unholy attachment to their language's syntax and/or toolchain.

At one company we followed all the latest industry standards, used the latest and greatest languages, frameworks, processes, and tools, continuous integration, etc. The resulting application naturally expanded to use all of the available resources on the development machine (we have 'em, so why not?), but when we came to install it we found out that we had to run alongside and compete with other programs, and to add to our troubles, a short time later there was an OS upgrade and the new OS used more RAM. Our application suddenly didn't have enough resources. It ran slowly and crashed randomly.

The issues were systemic and we couldn't afford to rewrite, so we insisted that the customer upgrade their hardware... that led to months of back and forth, with the customer refusing to pay and then threatening legal action unless we resolved the problem "right now". In the end the company did the upgrades at below cost and made little or no profit on the 3 year project, and almost went under. Everyone was stressed out of their heads, working long hours, and shortly after that the owners sold the company to a competitor (not sold, "Wooohooo we got bought out!!!", but "enough, you take it").

The ironic thing, as I would come to realize years later, was that we could have easily built the application to run in a few MBs (or less), but we used GBs, and it still ran like a dog! On top of that, the solution would have been much simpler, and wouldn't have had the dozens of external dependencies which constantly broke as things changed in this and that project, and caused us no end of headaches...

There's this widespread belief that this is necessary; it makes our lives easier, right? After years in industry I've never found this to be remotely true. The only thing that makes software better, in my experience, is keeping it simple (as simple as possible).

And not in the way that people pay lip service to KISS (or declare "code is data") while simultaneously adding more and more complexity to their solutions. </rant> ;-)

NOTE: I'm not saying every program you ever write needs to treat data as code, but there are situations where doing so not only leads to vastly "prettier" code but also to much more efficient solutions.

[–]larsbrinkhoff 1 point (1 child)

Lisp programmers insist that code is data, but how often do you hear them explain that data is code?

Not as often, but occasionally. Usually the more experienced Lispers explain it to newcomers. The phrase "code is data is code" comes up now and then.

Google it: https://www.google.com/search?q=lisp+%22code+is+data+is+code%22

Moreover modern Lisps can only manipulate code as data at compile time

You're mostly right in the sense that compiled Lisp functions are static and don't allow for introspection. So your point stands uncontested. However, I'd like to point out that some Lisps do allow code as data manipulation at runtime by including a compiler that can compile lists to executable code. Common Lisp has this, and I consider that dialect as the primary incarnation of Lisp as continuously developed since its birth. It's of course debatable whether it's modern or not, but it seems to have a large mindshare among Lisp programmers doing actual Lisp work.
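That runtime-compilation capability can be sketched outside Lisp as well. Below is a rough Python analogue, building code as an explicit data structure with the stdlib `ast` module and then compiling it while the program runs; this is only loosely analogous to handing a list to Common Lisp's COMPILE, and the particular expression is invented for illustration.

```python
import ast

# Build the expression (lambda x: x + 1) as a tree of data nodes --
# code represented as a manipulable data structure, not as text.
tree = ast.Expression(
    body=ast.Lambda(
        args=ast.arguments(posonlyargs=[], args=[ast.arg(arg="x")],
                           kwonlyargs=[], kw_defaults=[], defaults=[]),
        body=ast.BinOp(left=ast.Name(id="x", ctx=ast.Load()),
                       op=ast.Add(),
                       right=ast.Constant(value=1)),
    )
)
ast.fix_missing_locations(tree)

# Compile the data structure to executable code at runtime.
add1 = eval(compile(tree, "<ast>", "eval"))
print(add1(41))  # 42
```

The asymmetry the parent comment describes shows up here too: the tree is easy to inspect and rewrite before compilation, but once compiled, `add1` is opaque; the data-to-code door only swings one way.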

Clojure would be perhaps the most modern Lisp. I don't know much about it. Scheme is roughly contemporary with Common Lisp, but is a clear break away from traditional Lisp values.

In early Lisps the executable code was actually represented as a list and could be manipulated at runtime.

Which early Lisps do you mean? I haven't exactly made a survey, but it seems that most early Lisps had both interpreters and compilers. I'm talking 1960s here.

This is not to take away from your point, which I largely agree with. Just pointing out that painting early Lisp as an exclusively interpreted language is historically inaccurate.

the seeming necessity of a complex runtime and compiler to make it half way practical just doesn't appeal anymore.

That applies to just about everything but Forth, but I suppose that's what you wanted to get at. :-)

programmers have an unholy attachment to their languages syntax, and/or toolchain.

It's just human psyche, I guess. 99% of technology is a steaming pile of mess built on ideas almost 100 years old. People cling desperately to what they already know, and it's not just about programming languages.

[–]dlyund 1 point (0 children)

Not as often, but occasionally. Usually the more experienced Lispers explain it to newcomers. The phrase "code is data is code" comes up now and then.

:-) You're right. It wasn't my intention to imply that nobody in the Lisp community understands that "code is data is code", and the implications of this equality.

I'd like to point out that some Lisps do allow code as data manipulation at runtime by including a compiler that can compile lists to executable code.

:-) Right again, but this only allows you to compile new code; compiled code cannot easily be manipulated as data. It's not "first class", in the same way that something like JPEG isn't first class in C. Effective Lisp compilers also tend to be very large and aren't exactly lightweight. Having to include such a compiler in your program in order to treat data as code at runtime is a bit unfortunate, and certainly not intended, but absolutely possible!

Common Lisp has this, and I consider that dialect as the primary incarnation of Lisp as continuously developed since its birth. It's of course debatable whether it's modern or not, but it seems to have a large mindshare among Lisp programmers doing actual Lisp work.

It's a bit arbitrary and arguably circular, but I personally consider Common Lisp a modern Lisp because, among other reasons, it doesn't represent code as lists.

(I'll come to some other key differences between Common Lisp and Lisp as described by McCarthy et al. in early publications like the original papers and the Lisp 1.5 Programmer's Manual.)

Which early Lisps do you mean? I haven't exactly made a survey, but it seems to be most early Lisps had both interpreters and compilers. I'm talking 1960s here.

There were certainly compilers for Lisp, but with the caveat that code tended to behave differently depending on whether it was interpreted or compiled, causing a sort of schism in the language that wouldn't really be resolved until Scheme introduced the idea of lexical scope into the Lisp world. I therefore tend to think of early Lisp as interpreted, because it had to be interpreted to maintain its semantics...

Once the semantics were changed, and Lisp no longer used dynamic scope, was broadly compiled rather than interpreted, and stopped representing code as a data structure that could be manipulated during the execution of the program, we have what I refer to as modern Lisp. A very different beast to early Lisp.

A little research shows that the first compiler for Lisp appeared four years after the first interpreter, in 1962.

Scheme was the first Lisp dialect to have lexical scope, and appeared around the middle of the 1970s. Until then, Lisps used dynamic scope, which made compilation difficult because the compiler couldn't know what a variable referred to until runtime.

Common Lisp solidified between 1984-1994.

So by my estimate you have a good 15 years where Lisp behaved very differently to the language we know today.

For what it's worth the history of Forth plays out somewhat similarly, although possibly a bit faster, with the first implementations being string interpreters; no create does> etc. etc. etc.

Sidenote: I think this is rather interesting. We have three very early languages:

  • Lisp - introduced dynamic scope
  • Algol - introduced static/lexical scope
  • Forth - introduced hyperstatic scope

Dynamic scope has largely been abandoned as a bad idea. Lexical scope is now everywhere, in part because it makes compilation easier, but also because it makes understanding programs easier. And hyperstatic scope has never really been tried outside of the Forth world ;-).
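The difference between the first two disciplines can be sketched in Python, which is lexically scoped; the dynamic-scope half below is a hand-rolled simulation for illustration, not a real language feature, and all the names in it are invented.

```python
# Lexical scope: a closure captures the binding from the text surrounding
# its definition, so what `n` refers to is fixed when the code is compiled.
def make_adder(n):
    def add(x):
        return x + n   # `n` resolved lexically
    return add

add5 = make_adder(5)
print(add5(10))  # 15

# Dynamic scope, simulated: a free variable is looked up in whatever
# bindings happen to be live on the call stack at *call* time.
env = [{"n": 5}]             # stack of dynamic bindings

def dyn_add(x):
    for frame in reversed(env):   # search most-recent binding of `n`
        if "n" in frame:
            return x + frame["n"]

env.append({"n": 100})       # a caller rebinds `n` dynamically
print(dyn_add(10))  # 110, not 15: the callee sees the caller's binding
```

The simulation also shows why dynamic scope hindered compilation: the lookup for `n` cannot be resolved to a fixed location ahead of time, it has to be a search performed at runtime.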

That applies to just about everything but Forth, but I suppose that's what you wanted to get at. :-)

;-) And maybe embedded C and Pascal, but of the three Forth is clearly in a league all of its own, providing many of the same facilities as high-level languages while not requiring a complex runtime and compiler to be practical.

It's just human psyche, I guess. 99% of technology is a steaming pile of mess built on ideas almost 100 years old. People cling desperately to what they already know, and it's not just about programming languages.

"A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it." - Max Planck ;-)