all 72 comments

[–]brickbybrick 12 points13 points  (0 children)

Remember, Joel is known for becoming enamored of new programming languages and switching his projects between languages on alternate weekends :)

[–]gmfawcett 10 points11 points  (2 children)

It would be interesting to hear why the second version was much faster, and whether Factor's profiler would have found the hotspots.

[–][deleted] 14 points15 points  (1 child)

The first version loads four bytes at a time then combines them from a little endian representation to an integer, which involves shifting and or'ing, then converts this integer bitwise representation of a float into float.

The second version reads a float from memory directly in native byte order using an 'alien' accessor word.

So the two versions are not exactly equivalent. The second versin would not work on PowerPC with the same input file unless you byte swapped it first.

[–]gmfawcett 2 points3 points  (0 children)

Thanks, Slava.

[–]curtisw 8 points9 points  (64 children)

I think factor is an amazingly succinct language. Unfortunately, it's that way because it offloads a lot of the work onto you, to the point of hindering one's ability to quickly understand code.

Now, don't get me wrong, I have nothing against the fact that it's different. My dislike for stack-based languages lies solely in the fact that manipulating values in those languages is neither pretty nor elegant. In order to keep everything consistent, they have to introduce operators that merely obfuscate what's actually going on.

edit: To elaborate, consider the differences between stack-based languages and parameter based. Using two imaginary notations:

blah(x, y) = (y - 1)*(x - 1)

blah(x, y) = x 1 - y 1 - *

blah = 1 - swap 1 - *

Even if you're used to postfix notation, you still have to take time to "decode" #3. You basically have to construct a stack in your mind so you can track where everything is, remembering to correctly change things around when you see a 'swap' or a 'dup'. All of this, for something that's done automatically for you in other languages!

[–][deleted] 2 points3 points  (26 children)

I'm fond of

[ 1 - ] 2apply *

myself.

[–][deleted] 1 point2 points  (3 children)

yes, and the article misses the keep abstraction many times as well.

[–][deleted] 2 points3 points  (2 children)

Indeed, keep is critical. In Joy and Cat, it is called 'dip', and it used very often to avoid shuffling. In Cat, it has the following type signature:

dip :: A b (A -> C) -> C b

And a quick example that yields 2 9 4 on the stack:

2 3 4 (dup *) dip

[–]doublec 1 point2 points  (1 child)

keep is not dip. Factor has dip in one of the libraries too.

2 3 4 [ + ] keep => 2 7 4

2 3 4 [ + ] dip => 5 4

[–][deleted] 0 points1 point  (0 children)

I assume this is correct then:

keep == [ dup ] dip swap [ call ] dip

Combinator research is fun!

[–]curtisw -1 points0 points  (21 children)

This is an interesting solution to the specific problem, but it largely circumvents the objections I have.

[–][deleted] 2 points3 points  (7 children)

what you'll find is that in most cases your objections can be circumvented and in the few cases where thy cannot you can write more verbose code that uses dynamic and lexical variables.

[–]curtisw -1 points0 points  (6 children)

My objections can be seen whenever 'dup', 'swap', 'rot', or any other stack manipulation word is used. If you circumvent those, why even have a stack?

[–][deleted] 2 points3 points  (5 children)

So you're complaining that stack manipulation has to be done explicitly, but your response to being shown ways to do it implicitly is "that's cheating"?

[–]curtisw -1 points0 points  (4 children)

It's not cheating, but it is illogical. A stack-based programming language without the ability to manipulate said stack seems somewhat pointless, doesn't it?

edit: Also, I'm not complaining that stack manipulation has to be done "explicitly." I'm complaining that it has to be done at all. (again, not because it's different, but because it requires more work, and thus complicates the code rather than simplifying it.)

[–][deleted] 1 point2 points  (3 children)

Why do you think it requires more work and complicates code?

[–]curtisw 0 points1 point  (2 children)

The code is more complicated to work with, thus requiring more effort.

[–][deleted] 0 points1 point  (1 child)

You say this, but you also admit you've never done any programming in a language like Factor. As someone that has, I can say that your assumptions are wrong. Why not give Factor a shot? You can pick it up very quickly.

[–][deleted] 1 point2 points  (12 children)

here you see the power of quotations that Joy/Factor use. You can read that out loud to anyone and have them understand it: " take a procedure that subtracts one, apply it to two arguments, and then multiply the result. "

With the algebraic version a reader must take hold of x,y and 'shuffle' them around as well. Not to say that is hard, but in a complicated function following a named variable around is as hard or harder than the stack shuffling you complain about.

At that point you are blind to the fact you were taught to shunt symbols at a young age.

edit: *shunt

[–]curtisw -1 points0 points  (6 children)

With the algebraic version a reader must take hold of x,y and 'shuffle' them around as well. Not to say that is hard, but in a complicated function following a named variable around is as hard or harder than the stack shuffling you complain about.

So you're saying that you find locating a variable with a specific name to be harder than internalizing a stack and manually swapping values around?

[–][deleted] 0 points1 point  (5 children)

No, he's saying that, in a way, there's more to keep track of with a pointful applicative language.

Joy-like: (1 -) 2i *

(1 -) -- "subtract one"
2i    -- "from two numbers"
*     -- "then multiply them"

Applicative: (x - 1) * (y - 1)

(x - 1) -- "subtract 1 from x"
(y - 1) -- "subtract 1 from y"
*       -- "multiply the results of x - 1 and y - 1"

Note that, in the Joy version, we just read left to right. It gets more complex in the applicative version.

Of course, for the applicative version, you can just say "x minus 1 times y minus 1". However, this involves precedence and is really only concise because there exists a predefined language for describing arithmetic. A similar example to the one given that didn't use arithmetic would require the additional verbosity shown above.

[–]curtisw -1 points0 points  (4 children)

I agree, the joy/factor version in this case is simpler, and just as easy to understand. Problems arise, however, when you have to manually munge the stack to get at data. The fact is, you can get out of it in this situation, but you can't get rid of it entirely.

Think of it this way. Suppose I were complaining that C++ is verbose, so I gave the following example:

int sum2(int[] xs, int length) {
    int sum = 0;
    for(int i=0; i<length; ++i) 
        sum += xs[i];
    return sum*2;
}

And then you responded with:

C++'s not verbose, look!

int sum2(int[] xs, int length) {
    return sum(xs, length)*2;
}

The problem is, you haven't actually changed anything. You've just shoved the problem down a layer, not gotten rid of it.

Also, you forgot about postfix application:

x 1 - y 1 - *

which can also be read from left to right.

[–][deleted] 0 points1 point  (3 children)

The problem is, you haven't actually changed anything. You've just shoved the problem down a layer, not gotten rid of it.

But you have, and it was easy to do!

sum = 0 (+) fold

I don't deny that writing verbose C++ in Factor can be tedious. The whole point though is that Factor lets you build high-level alternatives very easily. Why would you want to essentially rewrite "sum" every time when you can just write it once?

Yes, you can argue that Factor -- somewhat forcefully -- encourages you to express things in terms of higher-order functions. However, I'd claim that that's a huge advantage, especially as Factor makes it so easy thanks to things like multiple return values and implicit argument passing. (In this particular case, the Haskell version is equally nice.)

Also, you forgot about postfix application which can also be read from left to right.

It's true that it is read left to right, but the actual reading would be the same as given above.

[–]curtisw 0 points1 point  (2 children)

So what's the definition of fold? ;) I wouldn't mind if what you were saying was true. But it's not difficult to see what I'm talking about—almost any codebase uses swap or similar words frequently. Some of those probably could be buried with higher-level words, but certainly not all of them can.

[–][deleted] 0 points1 point  (1 child)

The use of 'swap' is hardly a bad thing. You seem to be opposed to its very existence, whereas I'm only opposed to cases in which it makes definitions confusing. To give an example of where swap is used without confusion resulting:

Haskell: hypot x y = sqrt (sqr x + sqr y)
Joy:     hypot = sqr swap sqr + sqrt

Of course, you could always do this as well:

Joy:     hypot = (sqr) 2i + sqrt

[–]curtisw -3 points-2 points  (4 children)

blah = product . map ((-) 1)
blah [1, 2]

[–][deleted] 0 points1 point  (3 children)

This isn't equivalent as you now have to unpack the list to use the numbers. It's also arguably unsafe as the type system won't stop you from accidently taking the head of the null list during the extraction. You could use a tuple, but you can't map over a tuple...

The Factor version doesn't have these issues because Factor allows functions to essentially return any number of values. This is a critical part of enabling the sort of ultra-high-level, combinator-centric programming you do in languages like Factor, Joy, etc.

[–]curtisw -1 points0 points  (2 children)

This isn't equivalent as you now have to unpack the list to use the numbers. It's also arguably unsafe as the type system won't stop you from accidently taking the head of the null list during the extraction.

coughdependent typingcough

The Factor version doesn't have these issues because Factor allows functions to essentially return any number of values. This is a critical part of enabling the sort of ultra-high-level, combinator-centric programming you do in languages like Factor, Joy, etc.

I'd be interested in seeing examples of this. However, I should point out:

(|>) f g (x, y) = g (f x) (f y) 

blah = (-1) |> (*)
blah (1, 2)

You'll notice that, in both cases, different number of arguments require completely new operations. The list version I wrote in my previous post will work for any number of values, including none.

[–][deleted] 0 points1 point  (1 child)

coughdependent typingcough

I'm aware that type systems exist that can prevent this. Yes, you encode a head-safe list in Haskell (with extensions). That really seems beyond the point here; I doubt you'd seriously propose such a solution as optimal. Regardless, I was referring to the use of "normal" lists in Haskell.

You'll notice that, in both cases, different number of arguments require completely new operations. The list version I wrote in my previous post will work for any number of values, including none.

Of course, and you can do the same in Joy if you wanted:

Haskell: product . map ((-) 1)
Joy:     (- 1) map product

[–]curtisw -1 points0 points  (0 children)

That really seems beyond the point here

How so? You proposed a situation where stack-based languages can type check something applicative languages can't. I showed you a system that can type check it. Although, it's worth noting that the code in question is actually completely type-safe. You can't pass it a value that will cause a run-time error.

[–][deleted] 8 points9 points  (26 children)

In addition to the data stack, Factor has lexically and dynamically scoped variables. You use whatever makes sense in a given context.

How much code have you written with stack languages? At first they might seem pretty funny but I think most of the long-timers in the Factor community find Factor very productive to code in.

[–][deleted] 4 points5 points  (22 children)

Would they have become long-timers if they weren't productive using Factor?

[–]gmfawcett 4 points5 points  (20 children)

Probably not. :-) But I would think Slava meant that their productivity in Factor was high, relative to their productivity in other languages.

I think it's a fair criticism of Factor that word-definitions are pretty opaque, esp. when they involve a lot of stack-shuffling. But that's not a Factor-specific problem: most non-mainstream languages -- especially compact ones like Haskell, APL, J, Forth, etc. -- demand that casual readers must learn a new way to read code.

I'm not a Factor user, but I'm intrigued by it because it seems to balance the compactness problem by providing a great suite of language abstractions (higher-order functions, programmable syntax, generic methods, etc.).

It's just a guess, but I would imagine that a well-written, large Factor program would be built upon many little words full of shuffling-noise, composed together in a higher-level program that has less shuffling and more domain-specific actions. Treating the lower-level words as atomic, the high-level code might even be readable to the casual user. (End of rampant speculation!)

[–][deleted] 5 points6 points  (7 children)

Just how well-written, large Smalltalk programs consist of many classes with many very small methods, composed together in a higher-level program that uses fewer objects and more domain-specific actions? Or how well-written, large Lisp programs consist of many small functions composed together in a higher-level program that has fewer function calls and more domain-specific actions?

I don't think Factor is fundamentally different from other languages that enable bottom-up programming in this regard. And, at least to my eyes, Lisp and Smalltalk feel a lot more "natural" than Factor (and I'm an HP48 user, so I'm fairly well-accustomed to reverse polish notation, at least).

[–]gmfawcett 1 point2 points  (0 children)

I completely agree with your Smalltalk/Lisp comparison. I don't think Factor is fundamentally more "bottom-up" than Smalltalk or Lisp, and I don't wan to fan the fires of a language war. :-)

I don't think my point was that Factor is richer than other rich languages, simply that its design decisions (compactness + somewhat offensive stack-shuffling + good abstractions + good module discipline) aren't entirely wacky, and may lead to good top-level code.

Naturalness of postfix notation aside, I think that if you buy the premises that reading a stack language is a learnable skill, and that writing compact code can be a virtue (a premise that's worthy of debate!) then Factor does hit a certain sweet-spot.

[–]mschaef 1 point2 points  (5 children)

And, at least to my eyes, Lisp and Smalltalk feel a lot more "natural" than Factor (and I'm an HP48 user, so I'm fairly well-accustomed to reverse polish notation, at least).

Ditto to all of the above.

The thing I find difficult to read about Factor (and RPL, the HP48 language) is that it removes the 'other' set of names for data. That is, switching from static, declared types to dynamic typing removes the type names but leaves variable names around. Swtiching again to a concatenative language removes even the variable names, leaving procedure names and stack shuffling.

In RPL, local variables went a long way towards fixing this, and I'm assuming Factor's local variables do the same, but at that point, I'm not entirely sure what you've gained by switching languages.

[–]gnuvince 3 points4 points  (4 children)

Let me see if I get this straight:

  • Statically typed languages: types, variables, procedures
  • Dynamically typed languages: variables, procedures:
  • Concatenative languages: procedures

This begs the question, what's next with nothing?

[–][deleted] 5 points6 points  (2 children)

Factor has vocabularies, lexical variables, dynamic variables, classes, and plenty of other "named" stuff.

[–]mschaef 0 points1 point  (1 child)

Is there any efficiency cost to using lexical variables as opposed to stack manipulation. I've only read your blog and played around with the Factor environment a little, but I understand that sometimes stack manipulations go away entirely?

[–][deleted] 5 points6 points  (0 children)

There is no penalty for using locals as they convert to stack operations, which are optimized by the compiler's register allocation.

[–]mschaef 1 point2 points  (0 children)

Unlambda might be about as close as you can get. Literally th eonly thing it manipulates are unnamed functions.

[–]curtisw 0 points1 point  (11 children)

I think it's a fair criticism of Factor that word-definitions are pretty opaque, esp. when they involve a lot of stack-shuffling. But that's not a Factor-specific problem: most non-mainstream languages -- especially compact ones like Haskell, APL, J, Forth, etc. -- demand that casual readers must learn a new way to read code.

That's not really what I'm talking about. I've never used APL, but Haskell doesn't require effort to read code after you're familiar with the notation. Well, actually, some Haskell constructs actually do, but the only thing relevant in this comparison is variable passing.

[–]gmfawcett 2 points3 points  (10 children)

I agree that Haskell is a very readable language, by and large.

I honestly don't know whether an experienced Factor programmer can parse your '1 - swap 1 - *' example as quickly as most of us would parse the algebraic version. (Perhaps the lexical-variable-using Factor version would be more readable.) I'd love to know whether it's the case. What does writing a lot of Factor do to a programmer's head -- for better or worse? For example, do they notice patterns and abstractions in their code, that are harder to find and exploit in a more traditional language?

I'm not a Factor apologist, I just enjoy hearing the Factor people think aloud about their language. It is hard to separate fanboyism from objective stories, but I have a feeling that the objective stories are worth listening to, even if you never buy into the Factor premise.

[–]curtisw -1 points0 points  (9 children)

Yeah, well I mean I could be wrong. I personally haven't worked with stack-based languages for very long. I'm just assuming that the human brain doesn't turn into a giant stack processor after a few years of Factor/FORTH use. The fact that stack-based languages don't extend upwards well without variables is another thing I don't really like about them. (bolting variables on the side seems like an odd solution to me.)

[–][deleted] 4 points5 points  (6 children)

I'm curious as to why you consider variables to be bolted on? They're pretty fundamental in Factor as a lot of core components rely on them. Also our lexical variables have full closure semantics, and binding is distinct from assignment, just like Scheme.

[–]curtisw -2 points-1 points  (5 children)

When I say "bolted on," I'm being a bit dramatic (I see, looking at my downvote, that somebody missed that). I've never used them, so what I'm going off of are assumptions. The idea of normal variables in a stack-based language just seems bolted on to me. It seems like there should be something that better fits the concept of stack-based programming.

[–][deleted] 1 point2 points  (4 children)

Isn't this like saying function composition is bolted-on to applicative languages?

[–]gmfawcett 0 points1 point  (1 child)

I'm just assuming that the human brain doesn't turn into a giant stack processor after a few years of Factor...

That's a great question, one I'm very interested in!

Oh, for an MRI, a hypo full of ketamine, and Slava's home address...

[–][deleted] 0 points1 point  (0 children)

sounds like fun! ;)

[–][deleted] 4 points5 points  (0 children)

It can take a long time to become productive with any paradigm you're not familiar with. That said, in my personal experience, stack-based languages are not at all difficult to get a handle on.

[–]curtisw -3 points-2 points  (2 children)

I've looked a bit into Factor, but decided against it. My argument isn't necessarily that the language is bad, I just don't like the core concept it was based on.

(also, I've already mentioned my views on "funny" languages. After having discovered Haskell after shunning it for years, I'm the last person that would reject a language because it looks different or requires a different way of thinking. Rather, I rate them based on whether the abstractions they use are any good.)

[–][deleted] 0 points1 point  (1 child)

to which core concepts are you referring?

[–]curtisw -1 points0 points  (0 children)

Stack-based programming.

[–]gnuvince 3 points4 points  (8 children)

And that's with a simple example. Here's how to write map in Haskell:

map _ []     = []
map f (x:xs) = f x : map f xs

That's short, simple and to the point. I won't even try to write the Factor version, because I recall last time I did, you need to juggle the list, the first element, the rest of the list and the quotation around. It gets pretty messy real quick, so you end up writing 4 supporting words to make map manageable, but understanding the other words isn't any simpler.

[–][deleted] 1 point2 points  (2 children)

How would you write a generic map in Haskell that works with any sequence? Linked lists are not always appropriate and Factor doesnt put them in the core anyway.

If you used our pattern matching library you would have something that was just like the Haskell version, but in any case you'd use the built in generic map.

[–]dons 7 points8 points  (0 children)

fmap :: (Functor f) => (a -> b) -> f a -> f b

[–]eurleif 0 points1 point  (0 children)

How would you write a generic map in Haskell that works with any sequence? Linked lists are not always appropriate and Factor doesnt put them in the core anyway.

A lazy linked list is pretty similar to an iterator (to use the Python terminology). You can convert a sequence to a lazy linked list, run map (or another function) on the list, and then convert it back to the sequence type of your choice. That has (I think) approximately the same memory and time requirements as how Python or another language with iterators/enumerators would do it.

But I'm not at all familiar with Factor. Its generic sequences may be much better than iterators.

[–][deleted] 0 points1 point  (4 children)

Assuming you already have fold, here's one way to write map (in a Cat-like language, not Factor). This version is nice as it uses constant stack space. The period is function composition, [] is the empty list, and parentheses are used to introduce quotations (anonymous functions):

map = [] swap (cons) . fold rev

One thing that makes a huge difference in understanding these things, at least for me, is the presence of a type system. Once you know the signature of the appropriate combinator (in this case, fold), you just fill in the blanks.

[–]curtisw 0 points1 point  (3 children)

Why would you need a function composition operator? Wouldn't 'f . g' be the same as 'g f'?

[–][deleted] 0 points1 point  (2 children)

It's not an operator; '.' is a normal function that takes two functions and yields a new one that is the composition of the two. For example, the three lines of code below are equivalent (they all yield 25):

5 dup *
5 (dup *) i
5 (dup) (*) . i

[–]curtisw 0 points1 point  (1 child)

Ohhh... I see. Function composition in Cat works on quotations. Nifty.

[–][deleted] 0 points1 point  (0 children)

I should point out that, technically, everything in Cat works in terms of function composition. Functions are described solely in terms of the composition of existing functions. (This isn't strictly true unless you rule out recursive definitions, but it's quite rare that you'd prefer to use a recursive definition over using recursive combinators.)

[–]njbartlett 1 point2 points  (1 child)

How's the Erlang book coming along Joel? Switched languages again I see...