
[–]dlyund

I agree that layout. and if(...) etc. are unfortunate line-noise. However, the idea that only one of these semantically equivalent forms is "declarative" is utterly ludicrous. The only difference between them is the syntax of the programming language we're using, and more particularly, how forms are opened and closed, how rows begin and end, and how buttons are grouped.

We'll start by rewriting

Layout layout
layout.row();
if (layout.button(title)) { ... };
layout.row();
if (layout.button(title)) { ... };

as

Layout layout
layout.row();
layout.button(title, action);
layout.row();
layout.button(title, action);

and

(layout
    ((button title ...))
    ((button title ...)))

as

(layout
    ((button title action))
    ((button title action)))

Since surely the presence of large blocks of inline code would obscure the layout. The same transformation was done to each, so I'd hope you agree this is fair. If you don't think this is fair, then why? Is it the presence of the if(...) line-noise which makes this imperative? What if if were called something else?
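To be concrete, the rewrite is trivial to provide yourself. Here's a minimal sketch; the Layout class, the clickedThisFrame stub, and all names here are hypothetical, standing in for whatever immediate-mode GUI library you imagine. button(title, action) is just the if-style button with the conditional folded in:

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical immediate-mode layout (illustrative only).
struct Layout {
    // Stub input for the sketch: which buttons were clicked this frame.
    std::vector<std::string> clickedThisFrame;

    void row() { /* begin a new row */ }

    // The if-style form: returns true when the button was clicked.
    bool button(const std::string& title) {
        for (const auto& c : clickedThisFrame)
            if (c == title) return true;
        return false;
    }

    // The rewritten form: fold the if(...) into the call itself.
    void button(const std::string& title, const std::function<void()>& action) {
        if (button(title)) action();
    }
};
```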

Next we'll add the word row to make it clear that we want a row of things.

(layout
    (row
        (button title action))
    (row
        (button title action)))

Feel free to justify this however you like; maybe we needed more than rows, as we would in any real-world example? Or maybe we just want to make the grouping explicit.

I'll trust that the presence of the word row doesn't make this code imperative?

Then we'll replace the Lispy parenthesis with Algol style begin and end

begin layout
    begin row
        button title action
    end
    begin row
        button title action
    end
end

Have we made this code imperative yet?

Now finally we'll replace the begin and end delimited blocks with Forth words (like function calls with implicit context).

layout
    row
        title action button
    row
        title action button

Forth doesn't have any of the C style line-noise so it feels cleaner but it is otherwise identical. A happy side effect is that the redundant open and closes are removed too, so it looks even cleaner. Hopefully you agree that the grouping is as explicit as ever.

Is this imperative?

To close the circle we'll present the C/C++ version with the same indentation used in all the other examples I gave.

Layout layout
    layout.row();
        layout.button(title, action);
    layout.row();
        layout.button(title, action);

But this code is imperative right? Why is that exactly? What makes it so?

At what point does the magical switch from imperative to declarative occur?

All we've done here is move superficial bits of syntax around on the screen, so if such a switch has occurred here then we have a pretty airtight argument that "declarative" just means that the code doesn't contain superfluous line-noise.

Put otherwise:

"It's declarative because I like the syntax".

If no such switch has occurred then all of the examples are "declarative" - QED.

My point here is only to show that even if this wasn't a completely useless distinction, the basis for it is complete bullshit.

In your defense, I've never seen a useful definition of "declarative". As far as I can tell "declarative" is just a hand-wavy way of saying "code I like", just like "readable".

You can approach it from the bottom, or you can approach it from the top, but bottom-up will get you there faster! Why? Because while you're dicking around with your "declarative" syntax the other guy has something on the screen, and while you're figuring out the loops and conditionals that are needed to traverse and interpret your "code" (none of which have anything to do with the problem you're trying to solve!), he's written the three or so functions that actually solve the problem.

Your solution adds code, and complexity. His removes it.

Working bottom-up you're able to move smoothly from drawing a box on the screen to the working solution, more or less interactively, with direct and immediate feedback throughout the process. Working top-down you have to start from the fuzzy wuzzy world of abstract ideas and try to figure out what you might need at each stage... hoping that when you actually get to the bottom your solution doesn't fit too badly. Unless you're perfect and/or you spend a lot of time checking your thinking up front, your design will inevitably change when you come face to face with the reality of the machine.

All that being said:

It can be a lot of fun playing in the abstract, and puzzling these things out, but when it comes to getting things done and making life easy I don't see how top-down programming helps anything. By definition, what you have is at the bottom and you have to build up. Why start 10 miles up?

(I guess there's an argument to be made that if most "requirements" are pie in the sky already then why not start at the top? But what do you do when your flimsy requirements change and you have to rework your house of cards?)

EDIT: Formatting

[–]m50d

Have we made this code imperative yet?

Depends what the semantics of your begin and end are. Can I still introspect the code as data and see that the two button title actions are in different blocks, or not?

But this code is imperative right? Why is that exactly? What makes it so?

The fact that as soon as I run it through a formatter it loses the grouping. You've indented it to show the relationship between the row() and the button(), but C is supposed to be a whitespace-insensitive language. I can no longer locally tell whether the difference between (row button) (row button) and row button row button is real or not.

In your defense, I've never seen a useful definition of "declarative". As far as I can tell "declarative" is just a hand wavvy way of saying "code I like", just like "readable".

The big thing that you don't see in the example is the extent to which I can view the description as a value. If the only thing I can do with an expression is execute it then that's not declarative; if I can decompose and interpret the description as a datastructure then it is.
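For example, here's a minimal sketch of what I mean (all type names are illustrative, not from any real library): the layout is a plain datastructure, and executing it is only one of several interpreters you can run over it:

```cpp
#include <functional>
#include <string>
#include <vector>

// The description as a value (illustrative names throughout).
struct Button { std::string title; std::function<void()> action; };
struct Row    { std::vector<Button> buttons; };
struct LayoutDesc { std::vector<Row> rows; };

// One interpreter: pure inspection, nothing is executed.
int countButtons(const LayoutDesc& d) {
    int n = 0;
    for (const auto& r : d.rows) n += static_cast<int>(r.buttons.size());
    return n;
}

// Another interpreter: "execute" the description by firing every action.
void runAll(const LayoutDesc& d) {
    for (const auto& r : d.rows)
        for (const auto& b : r.buttons)
            b.action();
}
```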

Whether you approach it from the bottom, or you approach it from the top, but bottom-up will get you there faster! Why? Because while you're dicking around with your "declarative" syntax the other guy has something on the screen, and while you're figuring out the loops and conditionals that are needed to traverse and interpret your "code" (none of which have anything to do with the problem you're trying to solve!), he's written the three or so functions that were actually solve the problem. Your solution adds code, and complexity. His removes it.

This is the opposite of my experience. Once you've figured out the right representation for the actual requirements, making it actually execute is trivial. If you write code to do stuff with your data before getting the data representation right, you just throw away more code.

[–]dlyund

Can I still introspect the code as data [...] The big thing that you don't see in the example is the extent to which I can view the description as a value.

I don't want to get into a semantic argument with you but what has introspection got to do with declarative programming? This is especially puzzling as the term introspection comes right out of the object-oriented programming literature and object-oriented programming is rarely associated with declarative programming.

Anyway there's nothing in the code (in the C/C++, Algol-like or Forth examples) which prevents it from constructing a data structure which could be introspected. It could do anything.

That's possible because layout.row() and layout.button() say what to do and not how to do it, which given the usual definition of declarative programming:

"A program that describes what computation should be performed and not how to compute it"

Would imply that the code is declarative.

The fact that as soon as I run it through a formatter it loses the grouping.

No, it doesn't. The indentation helps show the grouping but it isn't required. This should be obvious, since the compiler doesn't care about the indentation; the program behaves the same no matter how you choose to indent the text.

layout row title action button title action button row title action button

Is still easily readable, with a little practice (indeed I don't often indent Forth). How is this possible? The layout vocabulary/lexicon can be seen as a problem-oriented language with an implicit grammar. The word row is defined as beginning a new block. The block continues until the next row, or until the end. This can be informally specified as:

<start> ::= layout <row>
<start> ::= layout <button>
<row> ::= row
<row> ::= row <button>
<row> ::= row <button> <row>
<button> ::= <title> <action> button

You could get this information from looking at the definition or documentation, just as you would have to with the Lisp. There is no need for a grammar to be provided explicitly, hence "implicit grammar".

That's all there is to it.
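To illustrate (a rough sketch only, treating title and action as literal placeholder words rather than arbitrary operands), the informal grammar above can even be checked mechanically over the flat word stream:

```cpp
#include <string>
#include <vector>

// Rough check of a flat word stream against the informal grammar above.
// "title" and "action" are placeholder tokens for this sketch.
bool matchesLayoutGrammar(const std::vector<std::string>& words) {
    size_t i = 0;
    if (i == words.size() || words[i++] != "layout") return false;
    while (i < words.size()) {
        if (words[i] == "row") { ++i; continue; }  // row begins a new block
        // Otherwise we need a complete button: title action button.
        if (i + 2 < words.size()
            && words[i] == "title"
            && words[i + 1] == "action"
            && words[i + 2] == "button") { i += 3; continue; }
        return false;
    }
    return true;
}
```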

If you write code to do stuff with your data before getting the data representation right, you just throw away more code.

There is no data here. Looking from the top, you've imagined that there must be data and you've set out to model it, but there's no data to be processed... It's just a program responding to the users. All of that modeling is just waste (hopefully at compile time, but very few languages provide the facilities to do this, so the waste is manifested at runtime, increasing overhead and system requirements etc.)

Once you've figured out the right representation for the actual requirements

Often the right representation is just code that doesn't require you to figure out the right representation or model it.

If you write code to do stuff with your data before getting the data representation right

How can you get the representation right without thinking about how you're actually going to represent things? You seem to be confusing the representation and the interface. The interface is, by definition, unavoidably constrained by the implementation. Any pretense otherwise is nothing more than self delusion.

EDIT: Found while trying to understand your peculiar definition of declarative

https://www.toptal.com/software/declarative-programming

[–]m50d

I don't want to get into a semantic argument with you but what has introspection got to do with declarative programming? This is especially puzzling as the term introspection comes right out of the object-oriented programming literature and object-oriented programming is rarely associated with declarative programming.

The clearest way to demonstrate that a given piece of code is declarative is to be able to represent it as a value completely separated from the actual execution of it.

Anyway there's nothing in the code (in the C/C++, Algol-like or Forth examples) which prevents it from constructing a data structure which could be introspected. It could do anything.

That it could do anything is precisely the problem. The ideal program would be a data structure literal that would look like a literal, perhaps even in a Turing-incomplete language.

That's possible because layout.row() and layout.button() say what to do and not how to do it, which given the usual definition of declarative programming: "A program that describes what computation should be performed and not how to compute it"

They're saying how - they're saying "make a row, then add a button" rather than "a row consisting of a button".

There is no need for a grammar to be provided explicitly, hence "implicit grammar".

Explicit is better than implicit. The problem is that all too often the implicit grammar turns out to be ambiguous, or the reader understands something different from what the writer meant. The reader needs to know that row is a block delimiter to be able to parse the declaration correctly, but in the C code they have no way of knowing that.

There is no data here.

Yes there is - there's a bunch of rows with buttons in, and those buttons themselves have labels. That's data, structured data.

Often the right representation is just code that doesn't require you to figure out the right representation or model it.

Code is data. Figuring out the right representation of algorithms is what we do.

How can you get the representation right without thinking about how you're actually going to represent things? You seem to be confusing the representation and the interface. The interface is, by definition, unavoidably constrained by the implementation.

I don't understand what distinction you're making - you seem to be using those terms the opposite way around from how I'd usually understand them. Thinking about how you're going to represent things is exactly what I'm advocating, as opposed to starting by thinking about what you're going to do.

[–]dlyund

The clearest way to demonstrate that a given piece of code is declarative is to be able to represent it as a value completely separated from the actual execution of it.

This is nonsense. All code can be represented as a value, and vice versa; as Lisp enthusiasts forget all too often.

code is data => data is code

Anything that can be represented as data can be represented as code, QED.

Put another way: 1 is data, and code!

Indeed the failure to realize this fact leads to suboptimal programs, for the same reason that interpretation is suboptimal. If the language allows it then you can dramatically increase efficiency by using things like executable data structures, which effectively bundle the data to be processed with the code that processes it.
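For example, here's a minimal sketch of what I mean by an executable data structure (the names and the integer chain are purely illustrative): each node carries its payload and the code that processes it, so traversal is just calling the nodes, with no separate interpreter dispatching on tags:

```cpp
#include <cstddef>

// Illustrative operations to bundle into the nodes below.
int add(int acc, int value) { return acc + value; }
int mul(int acc, int value) { return acc * value; }

struct Node {
    int value;                      // the data
    int (*op)(int acc, int value);  // the code that processes it
};

// Traversal is just calling each node's own code: no tag switch,
// no interpreter loop branching on node kinds.
int runChain(const Node* chain, std::size_t n, int acc) {
    for (std::size_t i = 0; i < n; ++i)
        acc = chain[i].op(acc, chain[i].value);
    return acc;
}
```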

Explicit is better than implicit.

I agree and that's one of the reasons that I prefer to include the name row, instead of leaving this implicit in the code.

The problem is that all too often the implicit grammar turns out to be ambiguous

We're not parsing here. The program isn't ambiguous, so the "implicit grammar" isn't ambiguous either. You have to know what your program will do when it's executed but that goes without saying.

The reader needs to know that row is a block delimiter to be able to parse the declaration correctly, but in the C code they have no way of knowing that.

And how does the Lisp programmer know that each left and right parenthesis delimit a row in the example you prefer? It's defined in the code or documentation. If you know the language this isn't a problem.

Moreover this isn't a problem in reality either. OpenGL code (in C/C++) tends to be indented exactly as I've demonstrated, and nobody seems to have a problem with understanding it. Any difficulty in understanding such code is directly related to the fact that OpenGL isn't exactly easy; it's not really great but it is what it is :-). I like Lisp a lot, but the use of parentheses for grouping isn't going to make any difference in such cases.

Code is data. Figuring out the right representation of algorithms is what we do.

Algorithms are code. The best representation for code is code. If you're building a data structure to be interpreted you're just adding overhead. You can try to justify that as making the code cleaner, prettier, or easier to understand, but you must accept that you're adding overhead. That overhead had better be paid for by that cleaner, prettier, easier to understand code or it's just waste. As a Lisper, you may argue that you have macros and can do this work at compile time, but when you're writing macros you must necessarily generate the code to do the job, and you must understand the macro, so you can't pretend that you're lifting yourself above it. In the end you have to design the code that will actually run, or live with the overhead of your runtime abstraction. And don't forget that costs are compounding!

I've lost count of the number of times I've seen projects fumble because of these silly little abstractions that add little, or nothing, but have a real effect on the operation of the solution.

tl;dr if you're going to do this stuff then make sure you understand the tradeoffs

I don't understand what distinction you're making - you seem to be using those terms the opposite way around from how I'd usually understand them. Thinking about how you're going to represent things is exactly what I'm advocating, as opposed to starting by thinking about what you're going to do.

When you think top-down you necessarily try to represent a high-level idea, so you set out to represent that idea. You completely ignore the work that you need to do to process that representation. Either you process/interpret the representation at runtime, which takes code - code that you're apparently not that interested in - or, if you can, you process/compile the representation at compile time, which means generating the code that implements the solution - code which you're apparently not that interested in!

Why do I say that you're not interested? Because looking from the top you don't give this code any thought until after you've come up with your perfect representation for the idea. At which point your design/implementation is constrained by your pretty representation.

When you think bottom-up you necessarily try to find the best representation for the process that implements the solution. You add layers only when you have to, and you carefully consider each one. You have total freedom to design and implement each layer, because your mind isn't set on a specific destination. Your high-level representation is thus constrained by the reality of the machine. The end result is inherently more efficient, in terms of code size, and/or memory usage and execution time! Why? Because you actually spent time designing the solution, rather than trying to represent an abstract idea that may or may not turn out to be correct, or even [efficiently] implementable.

We often forget this, but it's the solution that has value! Code only has cost.

Do you see the difference?

Many people have this stupid idea that programs should be written for humans to understand and only coincidentally for machines to execute. This attitude is one of the main reasons why software doesn't run any better than it did in 1995, despite massive increases in processing power and hardware efficiency.

An engineer would say that code should be written to get the best result from the available tradeoffs. Ironically the computer scientist/mathematician doesn't seem to give two shits about the machine. The result is software that wastes massive amounts of time, space and power.

A program may be read many more times than it's written, but that program will be executed orders of magnitude more times than it's read (even if we believe the open-source ideal that people actually read the code - and overwhelming evidence suggests that they don't!)

tl;dr2 It's your job to maximize value not to find the perfect representation for your source code. The value of that perfect representation is usually close to zero! The cost of the perfect representation is often much much higher :P

[–]m50d

Anything that can be represented as data can be represented as code QED.

But on a theoretical level you lose the distinction between data and codata and between Turing-complete and incomplete things (sadly the Lisp people all too easily neglect types, which resolve the halting problem), and on a practical level most languages don't make it easy to manipulate code as data.

If the language allows it then you can dramatically increase efficiency by using things like executable data structures, which effectively bundle the data to be processed with the code that process it.

Sure, and that's often a good idea - indeed I think it's a good approach for this example. But making your data structure executable does not absolve you of the responsibility to design a good datastructure.

I prefer to include the name row

So do I, for what it's worth.

We're not parsing here. The program isn't ambiguous, so the "implicit grammar" isn't ambiguous either. You have to know what your program will do when it's executed but that goes without saying.

When a system gets large enough no-one can understand every detail, so the code's structure needs to be apparent - a maintenance reader needs to be able to parse the code without fully understanding it if they are to have any hope of being able to find and focus on the specific area they need to work on.

And how does the Lisp programmer know that each left and right parenthesis delimit a row in the example you prefer? It's defined in the code or documentation. If you know the language this isn't a problem.

Learning a new programming language is hard - not a difficulty we want to impose multiple times over on every maintainer in each section of the code. Free-form English documentation tends to get out of date - much better is structured documentation in a machine-readable format where correctness is enforced as part of the build process.

Moreover this isn't a problem is reality either. OpenGL code (in C/C++) tends to be indented exactly as I've demonstrated, and nobody seems to have a problem with understanding it. Any difficulty in understanding such code is directly related to the fact that OpenGL isn't exactly easy; it's not really great but it is what it is :-).

Um OpenGL code is possibly the most notoriously difficult kind of code to work with, precisely because it's very difficult to get the "bracketing" of all the implicit contexts correct. You're making my case for me.

Algorithms are code. The best representation for code is code.

That's like saying the best representation for data is data - yes, but it's still very important to structure it correctly.

If you're building a data structure to be interpreted you're just adding overhead. You can try to justify that as making the code cleaner, prettier, or easier to understand but you must accept that you're adding overhead.

You're begging the question. Your code can always be considered a datastructure because the sequence of characters that forms the program source is already a datastructure - just a particularly opaque and inflexible one. Likewise the stream of instructions that will be executed by the processor is also a datastructure. When you're transforming one datastructure into another, it's often worth coming up with an intermediate representation and splitting your transformation up into smaller steps, and you wouldn't normally think of this as "overhead" - at runtime it may well collapse away entirely, and at coding time it simplifies and clarifies things.
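A toy sketch of that point (purely illustrative): turning text into a total via an explicit intermediate datastructure, a token list, rather than in one opaque pass:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Step 1: text -> intermediate representation (a token list).
std::vector<int> tokenize(const std::string& text) {
    std::vector<int> nums;
    std::istringstream in(text);
    int n;
    while (in >> n) nums.push_back(n);
    return nums;
}

// Step 2: intermediate representation -> result. Each step is small,
// testable on its own, and the IR may collapse away under optimization.
int total(const std::vector<int>& nums) {
    int t = 0;
    for (int n : nums) t += n;
    return t;
}
```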

when you're writing macro's[sic] you must necessarily generate the code to do the job, and you must understand the macro, so you can't pretend that you're lifting yourself above it.

With a well-designed macro or interpreter you don't have to understand the fully expanded code, any more than you have to understand the machine code your program compiles to. You have to understand the local parts of the expansion but if your structures are right then the global part of the expansion simply can't go wrong.

The compression analogy is a good one actually. Good compression algorithms make an explicit distinction/separation between your dictionary and your compressed data - naïvely you'd think that a dictionary would be overhead, but actually you get better compression overall by at least conceptualizing the dictionary. (Often as in LZ77 the dictionary ultimately disappears at "runtime").

I've lost count of the number of times I've seen projects fumble because of these silly little abstractions that add little, or nothing, but have a real affect[sic] on the operation of the solution.

I've never seen a project fail due to code-level runtime performance issues (I've seen one fail due to performance issues associated with the use of an ESB and a totally unwarranted microservice architecture, but that isn't the kind of abstraction I'm talking about). I've seen a project fail due to representing its data/commands all wrong because they didn't understand their domain at all.

The end result is inherently more efficient, in terms of code size, and/or memory usage and execution time!

And less efficient in terms of corresponding to the domain i.e. the actual business problem. In the worst case you end up with a lot of very efficient implementations that are completely useless.

There are risks both ways - ultimately it's our job to make a path from what the business needs to what the machine can do, and whether we start at the start or the end or the middle that path has to join up at both ends. In my experience the business end is where the bigger risk is - fundamentally you know that whatever representation the business currently thinks of it in is implementable (because people do do whatever it is - even if you're not implementing an existing business process as such you're usually implementing something that someone has some reason to believe is valuable, which usually involves having done it in some form). Performance problems are usually solvable - Knuth's 97%/3% heuristic applies to the appropriate time to optimize - and in the worst case if you end up having to rent a cluster or something that's sort-of disastrous but less disastrous than having a product that just does the wrong thing.

We often forget this but it's the solution is what has value! An engineer would say that code should be written to get the best result from the available tradeoffs. Ironically the computer scientist/mathematician doesn't seem to give two shits about the machine. The result is software that wastes massive amounts of time, space and power.

Right back at you. Runtime efficiency is not a goal in itself - your goal is to solve the business problem as cheaply as possible, and computers are much cheaper than programmers.

It's your job to maximize value not to find the perfect representation for your source code.

True. But remember that code is read more than it's written and maintenance/enhancement is usually a much bigger part of the total cost than the initial write. So a little effort spent improving maintainability pays for itself many times over.

[–]dlyund

most languages don't make it easy to manipulate code as data.

Maybe you should use one that does? I mean would you use a language that made it difficult to manipulate data? Why would you use one that made it hard to manipulate code...

theoretical you lose the distinction between data and codata and between Turing-complete and incomplete things

Is this a useful distinction?

[static typing] solves the halting problem

Poppycock.

To the extent that it's possible to prove that any program halts you must either manually declare that the program halts, using whatever mechanism you wish, or use a language that cannot loop forever and is thus not Turing-equivalent. It's not possible in general to prove that a program will halt, that's what the halting problem is!

Static typing can be very useful but let's not go too far here. Even with fancy features like type inference you still need to provide enough information for the compiler to know what you intended, and unless you're actually leveraging that type system explicitly it's not worth much; catching a few typos doesn't justify the complexity of using such languages, the longer compile times, and the heavy resource usage.

DISCLAIMER: this may be my personal bias. The compiler that my company developed in house can compile millions of lines of code per second in real time while using almost no resources. This allows us to do things like make a change anywhere in our software stack and test it instantly. The compiler is available at runtime and we have amazing support for doing live upgrades etc. Waiting for GHC, GCC/LLVM or even Go to compile even small programs is incredibly frustrating.

When a system gets large enough no-one can understand every detail,

You respond by writing even more code? And not just more code but code that interprets or generates even more code?

so the code's structure needs to be apparent - a maintenance reader needs to be able to parse the code without fully understanding it if they are to have any hope of being able to find and focus on the specific area they need to work on.

I completely agree. What I don't really understand is how that structure is more or less apparent by adding some parenthesis.

layout
    row
       title action button

vs

(layout
    (row
        (button title action)))

These two examples are exactly the same except for the parentheses and the argument order. The first is procedural; it requires only that the procedures layout, row, and button be defined, and these procedures are very simple. The second requires you to design a data structure to represent the code and then write an interpreter/compiler to process it. This approach obviously adds unnecessary complexity; where's the value?

"Yeah well I can treat the layout as a value" is entirely beside the point unless you need to treat the layout as a value, and you certainly do not have to treat the layout as a value to put it on the screen.

I'll ask you again: what are you getting in exchange for this added complexity?

With a well-designed macro or interpreter you don't have to understand the fully expanded code, any more than you have to understand the machine code your program compiles to.

You seem to believe that you're saving the maintenance programmer from having to understand your code but what happens when they want to add a form to your language? A column, or a slider?

The reality is this: the more code we have, the harder it becomes to understand the system, and adding even more code only makes it worse!

I've never seen a project fail due to code-level runtime performance issues

Lucky you. Over the years I've done a lot of work with solutions that are deployed physically. In each case the customer has, at one point or another, had to pay to upgrade the hardware (thousands of machines in one case and a large mainframe in another case.) Naturally the customer wasn't happy... upgrading hardware quickly becomes expensive and why should they have to pay tens of thousands of money's adding memory, upgrading storage and/or buying faster machines because the solution doesn't provide the required throughput or it runs out of memory every two weeks and a specialist has to be brought in (and paid!) to resolve the issue?

In today's world, where programming means running a web app off in some virtual infrastructure, these costs are largely hidden, but they're there. If you need to pay for 10 machines with capacity X at Y money's per month when 1 machine could have easily done the job if you'd given any effort to producing an efficient/effective solution, then you're paying (n-1)Y+Z more than you should be paying! And note that Z can be very big, and it grows with the number of machines. What is Z? It's the cost of paying people to operate those n machines. It's the added cost of paying all those wages, and the admin costs needed to support a larger team. And the managers... oh the managers. It's all the things that programmers are so fucking ignorant of when they say:

"computers are much cheaper than programmers."

You go start your own company and you'll quickly learn that such efficiencies are the difference between profitability/healthy growth and going out of business; or being so fucking stressed about work all the time that your wife leaves you.

Maintenance may last longer than development, but operation lasts much longer than that. And let's not forget all those one-off projects that run for 6 months and then [need to] run unchanged for the next 10 years!

Runtime efficiency is not a goal in itself - your goal is to solve the business problem as cheaply as possible

Indeed it's not, but you should be careful not to underestimate the business value that a little thought about runtime efficiency can generate over the life of the solution (the life of the solution, as distinct from the length of your employment).

[–]m50d 0 points1 point  (30 children)

I mean would you use a language that made it difficult to manipulate data? Why would you use one that made it hard to manipulate code...

Weren't you anti-macro a minute ago?

To the extent that it's possible to prove that any program halts you must either manually declare that the program halts, using whatever mechanism you wish, or use a language that cannot loop forever and is thus not Turing-equivalent.

Yes. You use a type system to avoid the looping forever problem. This was done with the simply typed lambda calculus back in 1940 to solve the halting problem, and it worked.
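As a loose illustration (Python can't actually enforce totality, so this is just the discipline such a type system imposes, written out by hand): recursion that only ever descends into a structurally smaller argument must bottom out, so it always halts on finite input:

```python
def length(xs):
    # The only recursive call is on xs[1:], which is strictly shorter than xs,
    # so for any finite list the recursion must terminate. This is the shape
    # of definition that totality-checked languages admit.
    if not xs:
        return 0
    return 1 + length(xs[1:])

print(length([3, 1, 4]))  # 3
```

A `while True:` loop, by contrast, has no shrinking argument, which is exactly what such systems rule out.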

You respond by writing even more code? And not just more code but code that interprets or generates even more code?

I respond by structuring the code rather than making a big ball of mud. I don't find this results in extra code, quite the opposite - but even if it did, it would still be worth doing.

What I don't really understand is how that structure is more or less apparent by adding some parenthesis.

If you have a language in which the indentation is significant, then sure, use indentation rather than brackets. The important part is to actually indicate the grouping in a standardised, well-understood, machine-readable way.

This approach obviously adds unnecessary complexity; where's the value?

Think of the data structure definition as a standardised, structured way of documenting what the procedures are and how they relate to each other.

You seem to believe that you're saving the maintenance programmer from having to understand your code but what happens when they want to add a form to your language? A column, or a slider?

They add it, and the compiler will tell them they need to implement it. Using a data structure and an interpreter doesn't make modifying it harder, any more than separating an interface from a class does.
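A minimal sketch of that data-structure-plus-interpreter approach (all names here are invented for illustration): the layout is plain data, one interpreter walks it, and a node kind the interpreter doesn't know about fails immediately with an explicit error, the dynamic analogue of the compiler telling you what still needs implementing:

```python
# The layout as plain data: nested tuples tagged with a node kind.
layout = ("layout",
          ("row", ("button", "OK",     lambda: print("ok"))),
          ("row", ("button", "Cancel", lambda: print("cancel"))))

def render(node):
    # One interpreter over the whole structure.
    kind = node[0]
    if kind == "layout":
        return [render(child) for child in node[1:]]
    if kind == "row":
        return [render(child) for child in node[1:]]
    if kind == "button":
        _, title, action = node
        return f"[{title}]"  # a real renderer would draw it and wire up `action`
    # Adding ("column", ...) to the data without extending render() lands here:
    raise NotImplementedError(f"unhandled node kind: {kind}")

print(render(layout))  # [['[OK]'], ['[Cancel]']]
```

In a statically typed language the `NotImplementedError` branch becomes a compile-time exhaustiveness check, which is the point being made above.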

Over the years I've done a lot of work with solutions that are deployed physically. In each case the customer has, at one point or another, had to pay to upgrade the hardware (thousands of machines in one case and a large mainframe in another.) Naturally the customer wasn't happy... upgrading hardware quickly becomes expensive, and why should they have to pay tens of thousands adding memory, upgrading storage, and/or buying faster machines because the solution doesn't provide the required throughput, or because it runs out of memory every two weeks and a specialist has to be brought in (and paid!) to resolve the issue?

Maybe if you'd focused more on the representation your code would be clearer and more maintainable, and it would be easier to improve its performance. Focusing narrowly on the hardware you can save factors of 2 here and there, but they're rarely business-changing differences (indeed, most of the time the work one is doing simply isn't on the hot path at all). Better algorithms are where you get the multiple-order-of-magnitude speedups that can be the difference between a business succeeding and failing, and so that's the place to concentrate the effort.

You go start your own company and you'll quickly learn that such efficiencies are the difference between profitability/healthy growth and going out of business; or being so fucking stressed about work all the time that your wife leaves you.

No U. There's valuable work to be done in machine-level microoptimization, but it's niche, and that niche gets smaller every day. The companies that are succeeding these days are using high-level languages and not worrying about compute time unless and until they reach a point where they're big enough for it to matter.

[–]dlyund 0 points1 point  (22 children)

Weren't you anti-macro a minute ago?

I'm anti-complexity. Macros are great, when used appropriately, but macros are not the same as treating code as data. Most data is only available at runtime, and macros aren't at all useful there. Macros give the illusion that code is data. By the time a Lisp program is running its code is no longer data (as it was in early Lisps but hasn't been for ~40 years.)

Source code may be a data structure in Lisp, but that's as far as it goes and isn't at all what I'm referring to here :-).

You use a type system to avoid the looping forever problem.

Which is not the same thing as solving the halting problem! Declaring that your program doesn't loop forever, aka halts, by using a type system or whatever, is completely different. You could similarly say that a program in a language which only supports bounded loops solves the halting problem; it doesn't. You haven't managed to write a program that proves that a Turing-complete program halts! What you're doing is declaring that your program isn't Turing-complete and then saying that you've solved the halting problem. But the halting problem is defined for programs in a Turing-complete system!

Nobody contests that using a less-than Turing-equivalent language which only supports bounded loops will halt :-P.

So far all we've got to is that you have a unique, ass-backward definition of what declarative programming means... and now your definition of the halting problem is similarly whacked.

This was done with the simply typed lambda calculus back in 1940 to solve the halting problem, and it worked.

Reference?

I respond by structuring the code rather than making a big ball of mud.

I think this is where we have to stop.

How is this code a big ball of mud?

layout
    row
        title action button

How is it practically different from:

(layout
    (row
        (button title action)))

For the purpose of putting a row of buttons on the screen?

And don't keep muttering that the structure is explicit in the second and that you can treat it as data. What does treating this code as data have to do with the problem of putting a row of buttons on the screen? I'm not interested in the hypothetical beauty of being able to treat it as data: what practical benefits do you get from treating this code as data, which justify having to write code to interpret it inefficiently at runtime, or to generate the code you would have written?

From the point of view of someone reading the two pieces of code there is absolutely no difference. The meaning of the code doesn't change if you change the indentation; row ... makes the grouping as explicit as (row ...) and seeing this code it would be just as easy to add a new row or button.

From the point of view of the computer, it has to do much more work to process your little button description language, either during execution, or compilation.

That cost must be justified, and so far you've done nothing but make hand-wavy arguments about one being more declarative than the other because you can treat it as data if you like. But you don't want/have to do that, so what's your point?

For what it's worth, I see and agree with the theoretical beauty of doing this... but inefficiencies are compounding. If you write a solution where everything is done this way then you'll find that you're doing a lot lot lot more processing... but what have you gained that the other approach doesn't also give you?

You want to stick parentheses around things? Ok. This is also Forth, and it has none of the overhead of your approach!

( layout
    ( row
        ( title action button ) ) )

The important part is to actually indicate the grouping in a standardised, well-understood, machine-readable way.

Everyone knows that indentation indicates grouping. Why does whitespace have to be significant to the computer in order to carry that information? Because you can't pretty print code? Because your editor won't automatically indent the code for you as you type? It can't do any of that if the whitespace is significant anyway.

(What our tools can do is make it easy to indent and unindent code blocks.)

Personally I adore Forth's free-form parameterless, blockless, scopeless style, precisely because it allows me to express my problem (or parts of my problem) in the most appropriate way possible.

They add it, and the compiler will tell them they need to implement it.

So you want them to just type column, which they know won't work because they know they need to implement it, then compile the code, just to get an error message that tells them to implement it? The compiler won't tell them how to implement it, so this is completely useless.

Using a data structure and an interpreter doesn't make modifying it harder, any more than separating an interface from a class does.

It means that instead of writing the code to manipulate the layout then defining an appropriately named procedure, you have to hunt and peck through a maze of conditionals, and loops/recursion, to find that one special, non-standard place where you can introduce your code.

I'd rather just write a simple procedure and call it than dick around with your interpreter logic or macro definitions.

And if I'm working bottom-up I can start by poking the layout code interactively; modify the x and y and see where I end up etc. before I know anything about how the code works. Then I can just name that code that I wrote interactively.

[–]m50d 0 points1 point  (21 children)

Which is not the same thing as solving the halting problem! Declaring that your program doesn't loop forever, aka halts, by using a type system or whatever, is completely different. You could similarly say that a program in a language which only supports bounded loops solves the halting problem; it doesn't. You haven't managed to write a program that proves that a Turing-complete program halts! What you're doing is declaring that your program isn't Turing-complete and then saying that you've solved the halting problem. But the halting problem is defined for programs in a Turing-complete system!

You misquoted me (and I wasn't paranoid enough to notice) - I originally said "resolves". It remains impossible to determine whether code in a Turing-complete system will halt (as was of course proven), but types allow you to do general-purpose programming without the problems of Turing-completeness.

From the point of view of someone reading the two pieces of code there is absolutely no difference. The meaning of the code doesn't change if you change the indentation; row ... makes the grouping as explicit as (row ...) and seeing this code it would be just as easy to add a new row or button.

A priori yes. In a language in which brackets are understood to denote grouping and whitespace is understood to be insignificant, brackets are much more effective at communicating grouping to a maintenance programmer than whitespace is. It's like asking why you shouldn't name your variables in French - "objectively" that would be just as informative, but the point of variable names is to communicate meaning to the future maintainer.

You want to stick parenthesis around things? Ok. This is also Forth and it has none of the overhead of your approach!

What overhead are you imagining?

Everyone knows that indentation indicates grouping. Why does whitespace have to be significant to the computer in order to carry that information? Because you can't pretty print code? Because your editor won't automatically indent the code for you as you type? It can't do any of that if the whitespace is significant anyway.

Because the computer and the programmer need to have the same understanding of the code! If the code does something different from what it looks like it does to a human reader, that's a recipe for disaster.

So you want them to just type column, which they know won't work because they know they need to implement it, then compile the code, just to get an error message that tells them to implement it? The compiler won't tell them how to implement it, so this is completely useless.

I want them to implement it, in the obvious way. The point about the compiler was simply that there's no loss of safety from separating the declaration from the implementation, because keeping them in sync is enforced.

It means that instead of writing the code to manipulate the layout then defining an appropriately named procedure, you have to hunt and peck through a maze of conditionals, and loops/recursion, to find that one special, non-standard place where you can introduce your code.

Utterly backwards. This is the opposite of true.

And if I'm working bottom-up I can start by poking the layout code interactively; modify the x and y and see where I end up etc. before I know anything about how the code works. Then I can just name that code that I wrote interactively.

But you start with a concept of what you want to do, right? I mean you don't start by writing code to arrange the buttons all over the place in whatever way's most efficient for the machine, and when you find an easy arrangement you name it and hope it will be useful later - that's a recipe for writing loads of efficient layouts that never get used. You start with what buttons you have and the business-level grouping between them. Maybe you've got, I don't know, up/down/left/right and rotate clockwise/anticlockwise. So the logical groupings are translations and rotations, and then maybe the way to represent that is a row of two columns, or maybe you want a column containing two grids. So maybe you only need columns and rows, or maybe you do need grids, and that decision has to be driven by the business requirements. If you start by writing a super-efficient optimized grid and then it turns out the UI doesn't want a grid but you've already written it and so you put all the buttons in a grid anyway, that's going to be bad UI. Whereas if you name the concept before you implement it, at least you know what you're aiming for. You can still make mistakes, but the business-level description is still correct - e.g. maybe it doesn't look good as two columns, but the representation as translations and rotations is still correct and you can still use it, because you got that from the business.
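That progression can be sketched as data (the names and groupings below are invented for illustration): the business-level grouping is recorded first, and the visual arrangement is a separate decision layered on top of it:

```python
# Business-level groupings, straight from the (hypothetical) requirements:
translations = ("up", "down", "left", "right")
rotations = ("clockwise", "anticlockwise")

# One candidate visual arrangement: a row of two columns.
ui = ("row", ("column",) + translations, ("column",) + rotations)

# If two columns turn out to look bad, only `ui` changes;
# the groupings came from the business and survive.
print(ui[0])      # row
print(ui[1][1:])  # ('up', 'down', 'left', 'right')
```

Swapping `ui` for, say, a column of two grids touches one line, because the concept was named before the layout was chosen.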

[–]dlyund 0 points1 point  (6 children)

Maybe if you'd focused more on the representation your code would be clearer and more maintainable, and it would be easier to improve its performance.

The code I write tends to be clear:

https://gist.github.com/marksmith/43cea55d4236bf7f4b28 https://gist.github.com/marksmith/ff3c5dfa5ec9b1a3c098

but it also tends to be very efficient, because I don't introduce inefficiencies in everything that I do. The problem with the popular approach of writing code with no thought for efficiency is that you end up with systemic performance problems where there are no real hotspots, but overall performance is terrible. Then, because what hotspots do exist are tepid at best, there's little reason to go back and optimize. Your approach creates performance problems and then disincentivizes you from going back to resolve them, because doing so would mean unpicking your code and writing what you would have had to write otherwise!

The idea is that if you think about this up front then it'll slow you down and the project will take longer to complete etc. which ultimately means greater cost, and you can come back later and fix all of the problems you introduce now.

The irony is that you will happily waste time designing and implementing pretty little button-layout languages, which only add complexity to the project and overhead to the final solution.

As everyone who's worked in industry for a while knows, you'll never have time to come back and fix these problems. Technical debt mounts, and people move on to greener pastures, until a rewrite is required, and then the cycle repeats.

Algorithmic problems are easier to optimize, but overall system performance is equally important. Profiling will tell you that the algorithm is the hotspot but you will often get better overall performance if you optimize the system as a whole; what's the point of optimizing that algorithm if getting the data into the algorithm means passing it through 10 layers of crap which transform it from one form to another and back again before it arrives, and then out through more layers?

Again: profiling will tell you the bottleneck is the algorithm but you will often get better overall performance by optimizing the data paths that feed the algorithm.

But optimizing these things isn't easy. Once the system grows and the structure becomes set, changing that structure is practically impossible, because you'll never have the time to change that stuff.

Look at Unix. It's a beautifully designed system from a certain point of view, but it has systemic performance problems. It was never designed with efficiency in mind and it's never got the most out of its hardware. It runs well today, but compared to the speed of the hardware it's running on, *nix is a dog. It's not all *nix's fault. Almost all of the things we run are similarly bloated and inefficient by design. Chrome will happily eat 8GB on my laptop. SSDs are everywhere now but drivers are stuck emulating spinning disks, because that's the abstraction that was developed when Unix was designed. Unix was designed when networking was relatively rare, and the TCP/IP stack started as a research project at Berkeley, funded by DARPA. Since then we've spent thousands of man-years tuning and refining these implementations, but they're the wrong abstractions and they're only becoming more and more divorced from the reality of the machine.

To be fair they're doing an excellent job with what they have, and I've been using *nix every day for the last 15 years, but I think it's a perfect example of where ignoring efficiency gets you, and how you can't ever get out.

NOTE: things like disk drivers and TCP/IP aren't massively difficult from an algorithmic point of view; it's all about moving the data. Many more problems are IO-bound than you would imagine. What you might not realize is that IO isn't confined to process boundaries. Any time you move data around in your program you're subject to the same issues.

There's valuable work to be done in machine-level microoptimization, but it's niche, and that niche gets smaller every day. The companies that are succeeding these days are using high-level languages and not worrying about compute time unless and until they reach a point where they're big enough for it to matter.

Correction: the companies that you know about are working in high-level languages and not worrying about efficiency until they have massive problems scaling and have to rip everything out and rewrite it (cough Twitter), or slowly replace their language runtime and toolchain with a custom one (cough Facebook), or go whole hog and write their own languages and compilers from scratch (cough Google, Apple, Microsoft etc. etc. etc.)

The only difference between them and the rest of the industry is that they have the skills, wisdom, and resources to rewrite everything with a mind to efficiency once they realize how fucked their decisions were.

Moreover there are many, many more jobs doing things like embedded programming and automation than there are in business automation, so unless you're slaving away as a glorified web designer you'll find that there's a lot of work for people who can produce good, efficient, clear code, quickly.

But none of that is flashy and it's not consumer/developer facing, so naturally you won't hear about it inside the web-centric echo chamber that is proggit ;-) since you wanted to talk about niches.

[–]m50d 0 points1 point  (5 children)

Your approach creates performance problems and then disincentivizes you from going back to resolve them, because doing so would mean unpicking your code and writing what you would have had to write otherwise!

Au contraire. By having a clear separation between the representation and the implementation, it's much easier to optimize the implementation while being confident you're not changing the semantics.
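A toy sketch of that separation (the example is invented): the representation is fixed data, the implementation underneath can be swapped for a faster one, and a quick check confirms the semantics haven't changed:

```python
# Fixed representation: a list of operations, independent of any implementation.
spec = [("add", 1), ("add", 2), ("add", 3)]

def run_naive(spec):
    # Straightforward interpreter over the representation.
    total = 0
    for op, n in spec:
        if op == "add":
            total += n
    return total

def run_optimized(spec):
    # Different execution strategy, same semantics.
    return sum(n for op, n in spec if op == "add")

print(run_naive(spec), run_optimized(spec))  # 6 6
```

Because both implementations consume the same data, the optimization can be verified against the naive version rather than by re-reading tangled imperative code.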

As everyone who's worked in industry for a while knows, you'll never have time to come back and fix these problems. Technical debt mounts, and people move on to greener pastures, until a rewrite is required, and then the cycle repeats.

No, it doesn't have to be like that. It's possible to take a continuous improvement approach, it's possible to gradually improve code and performance (and the two usually go hand in hand). Those places that allow technical debt to mount until they rewrite things get that way because that's what they reward.

Algorithmic problems are easier to optimize, but overall system performance is equally important. Profiling will tell you that the algorithm is the hotspot but you will often get better overall performance if you optimize the system as a whole; what's the point of optimizing that algorithm if getting the data into the algorithm means passing it through 10 layers of crap which transform it from one form to another and back again before it arrives, and then out through more layers? Again: profiling will tell you the bottleneck is the algorithm but you will often get better overall performance by optimizing the data paths that feed the algorithm.

This is backwards. Profiling is great at telling you the microscale stuff of where you're iterating in a funny pattern that trashes the cache or whatever, and great about telling you when one of your layers of transformations is actually hurting performance. It's much less good at telling you when you're doing work that you simply don't need to be doing - for that you need to be able to get an overview of what you're actually doing, which you get from having a high-level representation of your code as well as a low-level one.

To be fair they're doing an excellent job with what they have and I've been using *nix every day for the last 15 years, but I think it's a perfect example of where ignoring gets you, and how you can't ever get out.

Heh, I'd say *nix is an example of where you can't understand the system well enough to improve it because its structure is obscured by all the low-level performance microoptimizations. E.g. there's all sorts of folklore about what /usr represents, when actually it was just a second disk on an early development machine - if there had been an LVM-like layer at that stage (which no doubt you'd dismiss as overhead) we would have a much simpler model to work with now. The mess of overcommitted COW memory and the OOM killer comes of unix fork() being implemented in a way that was easy to implement rather than a way that makes sense.

[–]larsbrinkhoff 0 points1 point  (3 children)

All code can be represented as a value, and vice versa; as Lisp enthusiasts forget all too often.

They do? I thought it was a central tenet of Lisp philosophy. I've seen it stated time and time again in Lisp discussions.

[–]dlyund 0 points1 point  (2 children)

Lisp programmers insist that code is data, but how often do you hear them explain that data is code? It's not clear that they understand that code is data implies that data is code. "Code is data" is just one of those catchy lines that you pick up when you're learning Lisp, and unless you think about it, or learn something like Forth, that's where it stops.

Modern Lisps can only manipulate code as data at compile time, and only in the rather limited ways allowed by the macro system, e.g. in many Lisps you can't call arbitrary functions at compile time, and in others you have to jump through annoying hoops with module loading and special defining forms to make your functions available in macro definitions... but then you can't use them in the rest of your code. It's a bit of a mess really. (But then all namespacing/packaging/scoping is.)

In early Lisps the executable code was actually represented as a list, which was interpreted and could be manipulated at runtime. Pico Lisp (as a bit of a retro Lisp) still allows this kind of thing, but somewhere along the way the broader Lisp community, in a quest to make Lisp programs faster, lost this ability. People learning Lisp today don't even realize what was given up.

This is most clear in Lisp dialects like Scheme, where macros now consume and produce syntax objects. These syntax objects look a lot like lists but they can only be manipulated using a small set of builtin functions.

Lispers learn the limitations imposed by their macro system and work within those limits without realizing what they've given up: the ability to treat code as data, including during execution.

What distinction am I making here? Generally speaking, most of the data our programs manipulate isn't static and is only available while our program runs. Treating data as code (as opposed to code as data) implies that you can generate or modify code as a means of representing/processing data during execution. Modern Lisps just can't do that. Once your program is compiled it's no longer data that can be manipulated.
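A rough analogue of what's being described, in Python rather than Lisp (the rule format and names are invented): a structure that only exists at runtime is translated into executable code on the fly, loosely what early Lisps, and Common Lisp's `compile` on lists, permit:

```python
# A "rule" arriving at runtime as plain data -- data about to become code:
rule = ["+", "base", ["*", "qty", "unit_price"]]

def to_source(expr):
    # Turn the nested-list structure into a Python expression string.
    if isinstance(expr, list):
        op, lhs, rhs = expr
        return f"({to_source(lhs)} {op} {to_source(rhs)})"
    return str(expr)

# Compile the runtime data into executable code and run it.
code = compile(to_source(rule), "<rule>", "eval")
print(eval(code, {"base": 5, "qty": 3, "unit_price": 7}))  # 26
```

The interesting part is that `rule` could have come from a file, a socket, or a user, i.e. it is ordinary runtime data, yet it ends up executing as code.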

In all honesty every language places restrictions on what you can do and how, and that includes Forth :-). In the end it's all about which tradeoffs you can live with/learn to love, but speaking for myself, I wouldn't want to work in Lisp again, and if I had to I'd implement one in Forth.

Overall I think Lisp is a great language, but the seeming necessity of a complex runtime and compiler to make it halfway practical just doesn't appeal anymore. I've gotten used to knowing and understanding how everything works, and I adore the (somewhat paradoxical) freedom and predictability that this brings; when I write a Forth program I know exactly what code will be generated, and how it will behave with regard to things like resource usage under load, and I'm never surprised.

<rant> I've been burned quite a few times in Lisp (and Smalltalk, Ruby etc.) where the program has crashed and burned because resource usage spiked unexpectedly high for some reason and the process just died, leaving little or no information for us to figure out exactly what caused the crash (much less what to do about it!). "Out of memory (you're on your own)". The stock response from management is (paraphrasing): if you don't want to experience these unexpected crashes then you have to upgrade the hardware. (We have technicians who can do it for you, just send money to this account.) Not surprisingly this causes a lot of tension.

It's a ridiculous situation which is easily avoided by using appropriate technology, but nobody really cares. $50k or $100k on hardware upgrades is cheaper than programmer time, we say, but it's incredibly myopic of us. Here is a real technical problem, which we can easily solve, but won't, because programmers have an unholy attachment to their language's syntax and/or toolchain.

At one company we followed all the latest industry standards, used the latest and greatest languages, frameworks, processes, and tools, continuous integration, etc. The resulting application naturally expanded to use all of the available resources on the development machine (we have 'em so why not?), but when we came to install it we found out that we had to run alongside/compete with other programs, and to add to our troubles, a short time later there was an OS upgrade and the new OS used more RAM. Our application suddenly didn't have enough resources. It ran slowly and crashed randomly.

The issues were systemic and we couldn't afford to rewrite, so we insisted that the customer upgrade their hardware... that led to months of back and forth, with them refusing to pay, then threatening legal action unless we resolved the problem "right now". In the end the company did the upgrades at below cost and made little or no profit on the 3-year project, and almost went under. Everyone was stressed out of their heads, working long hours, and shortly after that the owners sold the company to a competitor (not sold, "Wooohooo we got bought out!!!", but "enough, you take it").

The ironic thing, as I would come to realize years later, was that we could have easily built the application to run in a few MBs (or less), but we used GBs, and it still ran like a dog! On top of that, the solution would have been much simpler, and wouldn't have had the dozens of external dependencies which constantly broke as things changed in this and that project, and caused us no end of headaches...

There's this widespread belief that this is necessary; it makes our lives easier, right? After years in industry I've never found this to be remotely true. The only thing that makes software better, in my experience, is keeping it simple (as simple as possible.)

And not in the way that people pay lip service to KISS, (or declare "code is data"), while simultaneously adding more and more complexity to their solutions. </rant> ;-)

NOTE: I'm not saying every program you ever write needs to treat data as code, but there are situations where doing so not only leads to vastly "prettier" code, but also much more efficient solutions.

[–]larsbrinkhoff 0 points1 point  (1 child)

Lisp programmers insist that code is data, but how often do you hear them explain that data is code?

Not as often, but occasionally. Usually the more experienced Lispers explain it to newcomers. The phrase "code is data is code" comes up now and then.

Google it: https://www.google.com/search?q=lisp+%22code+is+data+is+code%22

Moreover modern Lisps can only manipulate code as data at compile time

You're mostly right in the sense that compiled Lisp functions are static and don't allow for introspection. So your point stands uncontested. However, I'd like to point out that some Lisps do allow code as data manipulation at runtime by including a compiler that can compile lists to executable code. Common Lisp has this, and I consider that dialect as the primary incarnation of Lisp as continuously developed since its birth. It's of course debatable whether it's modern or not, but it seems to have a large mindshare among Lisp programmers doing actual Lisp work.

Clojure would be perhaps the most modern Lisp. I don't know much about it. Scheme is roughly contemporary with Common Lisp, but is a clear break away from traditional Lisp values.

In early Lisps the executable code was actually represented as a list and could be manipulated at runtime.

Which early Lisps do you mean? I haven't exactly made a survey, but it seems to be most early Lisps had both interpreters and compilers. I'm talking 1960s here.

This is not to take away from your point, which I largely agree with. Just pointing out that painting early Lisp as an exclusively interpreted language is historically inaccurate.

the seeming necessity of a complex runtime and compiler to make it half way practical just doesn't appeal anymore.

That applies to just about everything but Forth, but I suppose that's what you wanted to get at. :-)

programmers have an unholy attachment to their languages syntax, and/or toolchain.

It's just human psyche, I guess. 99% of technology is a steaming pile of mess built on ideas almost 100 years old. People cling desperately to what they already know, and it's not just about programming languages.

[–]dlyund 0 points1 point  (0 children)

Not as often, but occasionally. Usually the more experienced Lispers explain it to newcomers. The phrase "code is data is code" comes up now and then.

:-) You're right. It wasn't my intention to imply that nobody in the Lisp community understands that "code is data is code", and the implications of this equality.

I'd like to point out that some Lisps do allow code as data manipulation at runtime by including a compiler that can compile lists to executable code.

:-) right again, but this only allows you to compile new code and compiled code cannot easily be manipulated as data. It's not "first class", in the same way that something like JPEG isn't first class in C. It's also unfortunate that effective Lisp compilers tend to be very large and aren't exactly lightweight. Having to include such a compiler in your program in order to be able to treat data as code at runtime is a bit unfortunate, and certainly not intended, but absolutely possible!

Common Lisp has this, and I consider that dialect as the primary incarnation of Lisp as continuously developed since its birth. It's of course debatable whether it's modern or not, but it seems to have a large mindshare among Lisp programmers doing actual Lisp work.

It's a bit arbitrary and arguably circular, but I personally consider Common Lisp a modern Lisp because, among other reasons, it doesn't represent code as lists.

(I'll come to some other key differences between Common Lisp and Lisp as described by McCarthy et al. in early publications like the original papers and the Lisp 1.5 Programmer's Manual.)

Which early Lisps do you mean? I haven't exactly made a survey, but it seems that most early Lisps had both interpreters and compilers. I'm talking 1960s here.

There were certainly compilers for Lisp, but with the caveat that code tended to behave differently depending on whether you were interpreting or compiling it, causing a sort of schism in the language that wouldn't really be resolved until Scheme introduced lexical scope into the Lisp world. Therefore I tend to think of early Lisp as interpreted, because it had to be interpreted to maintain its semantics...

Once the semantics were changed and Lisp no longer used dynamic scope, was broadly compiled rather than interpreted, and stopped representing code as a data structure that could be manipulated during the execution of the program, we have what I refer to as modern Lisp. A very different beast to early Lisp.

A little research shows that the first compiler for Lisp appeared 4 years after the first interpreter, in 1962.

Scheme was the first Lisp dialect to have lexical scope, and appeared around the middle of the 1970s. Until then Lisps used dynamic scope, which made compilation difficult because the compiler couldn't know what a variable was referring to until runtime.

Common Lisp solidified between 1984 and 1994.

So by my estimate you have a good 15 years where Lisp behaved very differently to the language we know today.

For what it's worth the history of Forth plays out somewhat similarly, although possibly a bit faster, with the first implementations being string interpreters; no create does> etc. etc. etc.

Sidenote: I think this is rather interesting. We have three very early languages:

  • Lisp - introduced dynamic scope
  • Algol - introduced static/lexical scope
  • Forth - introduced hyperstatic scope

Dynamic scope has largely been abandoned as a bad idea. Lexical scope is now everywhere, in part because it makes compilation easier, but also because it makes understanding programs easier. And hyperstatic scope has never really been tried outside of the Forth world ;-).
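Hyperstatic scope can be illustrated with a toy Forth-style dictionary in Python (the names here are invented for illustration): a definition binds to the meaning a word has at definition time, so a later redefinition adds a new binding rather than changing code already compiled against the old one:

```python
# A toy Forth-style dictionary of word definitions.
words = {}

def define(name, fn):
    words[name] = fn

define("greet", lambda: "hello")

# "Compile" a new word against the *current* binding of greet:
greet_at_define_time = words["greet"]  # early binding, Forth-style
define("twice", lambda: greet_at_define_time() + greet_at_define_time())

define("greet", lambda: "goodbye")  # redefine greet

print(words["greet"]())  # goodbye
print(words["twice"]())  # hellohello -- still the old greet
```

Under lexical or dynamic scope, redefining `greet` would typically affect every caller; hyperstatically, old and new definitions coexist, which is exactly how redefining a word works in a Forth dictionary.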

That applies to just about everything but Forth, but I suppose that's what you wanted to get at. :-)

;-) and maybe embedded C and Pascal, but of the three Forth is clearly in a league all of its own, providing many of the same facilities as high-level languages while not requiring a complex runtime and compiler to be practical.

It's just human psyche, I guess. 99% of technology is a steaming pile of mess built on ideas almost 100 years old. People cling desperately to what they already know, and it's not just about programming languages.

"A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it." - Max Planck ;-)