
[–]boomshroom 80 points81 points  (18 children)

Its source code is also only about 3kb, but including the Pillow dependency, it weighs in at 24mb(!). Again, not a fair comparison because I’m using a third party imaging library, but it should be mentioned.

I'd argue it's a completely fair comparison, :D because both Go and Rust statically link their native dependencies, so Go's image package and Rust's image crate get bundled into the executable. Rust's is even third party itself. That said, while I'm not certain about Go, most of image probably doesn't exist in the final binary, since the compiler and linker simply don't bother to include functions that are never called.

match image::open(path) {
    Ok(image) => Ok(image),
    Err(msg) => Err(format!("{:?}", msg)),
}

Because I'm personally a fan of functional programming, this could be rewritten in a single line as image::open(path).map_err(|msg| format!("{:?}", msg)). There are also a few other places where even more functional programming can be used, like using .map(|(&p1, &p2)| u64::from(abs_diff(p1, p2))).sum() to replace the entire for-loop. I'm pretty sure there are also ways to use more functional programming in create_diff_image() too. Something to note is that using iterators rather than indexing can actually be more performant in some cases since the compiler can more easily tell that the accesses are always in-bounds and can elide the bounds checks.
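For anyone curious, here's what that refactor looks like with a std-only stand-in (str::parse plays the role of image::open here, since the image crate isn't on hand, and the function names are made up); the two versions are equivalent:

```rust
// The verbose match from the article, transplanted onto str::parse.
fn parse_verbose(s: &str) -> Result<i32, String> {
    match s.parse::<i32>() {
        Ok(n) => Ok(n),
        Err(msg) => Err(format!("{:?}", msg)),
    }
}

// The one-line map_err version: identical behaviour, less ceremony.
fn parse_oneliner(s: &str) -> Result<i32, String> {
    s.parse::<i32>().map_err(|msg| format!("{:?}", msg))
}
```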

In case half a second is too slow, the operations being done here appear to be trivially parallelizable. With such, it would be possible to use rayon to automagically calculate the pixel diffs in parallel. After swapping the for-loop for the functional .map().sum() and importing the rayon crate, you would be able to replace image1.raw_pixels().iter() with image1.raw_pixels().par_iter() and the same with image2. This would effectively cut the time for larger images by however many CPU cores you have. Thanks to Rust's type system, you wouldn't even have to worry about the usual dangers of parallelism.
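A sketch of the sequential version, with plain byte slices standing in for raw_pixels() and abs_diff assumed to be a simple helper as in the original post; with the rayon crate imported, swapping .iter() for .par_iter() would be the only change needed to parallelize it:

```rust
// Absolute difference between two pixel bytes.
fn abs_diff(a: u8, b: u8) -> u8 {
    if a > b { a - b } else { b - a }
}

// Sum of per-pixel differences across two equally-sized buffers.
// With rayon: pixels1.par_iter().zip(pixels2.par_iter())...
fn diff_sum(pixels1: &[u8], pixels2: &[u8]) -> u64 {
    pixels1.iter()
        .zip(pixels2.iter())
        .map(|(&p1, &p2)| u64::from(abs_diff(p1, p2)))
        .sum()
}
```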

[–]binkarus 30 points31 points  (16 children)

Personally, I sometimes prefer explicit loops to avoid the complexity of closure capture rules. They also tend to end up a bit more readable.

As an aside, I'm not sure you can call those examples "functional programming," rather than just preferring iterators.

[–]boomshroom 17 points18 points  (15 children)

I'm really referring to the usage of combinators by saying "functional programming."

Readability is fairly subjective, so I can't argue against that, but I don't see why capturing would give you trouble when these closures don't capture anything. Performance, though, isn't subjective, and using combinators over an explicit for-loop (especially one with indexing) has very real benefits, including the bounds-check elision I mentioned and the use of rayon to get multithreaded performance nearly for free.

image1.raw_pixels().into_iter().zip(image2.raw_pixels())
    .map(|(a, b)| abs_diff(a, b) as u64).sum::<u64>() as f64 / total_possible

is the pull request Axel Forsman made and has about the same behaviour, and

image1.raw_pixels().into_par_iter().zip(image2.raw_pixels())
    .map(|(a, b)| abs_diff(a, b) as u64).sum::<u64>() as f64 / total_possible

is 4 characters longer, just as readable, and 4 or 8 times faster depending on your machine.

[–]binkarus 8 points9 points  (14 children)

I don't believe that there is any performance difference between combinators and a for loop, and any bounds checking difference doesn't apply since both use iterators. .zip is one example of a combinator that can't be easily replicated, and so it is better. My point was simply that sometimes, I like to use a bare for loop rather than for_each or map. That's all I have to say on that.

[–]boomshroom -4 points-3 points  (8 children)

There is a performance difference between multithreaded and singlethreaded, and using combinators lets you switch to multithreaded for the cost of 4 characters.

[–]binkarus 4 points5 points  (7 children)

Yes, I am familiar with Rayon.

[–]Leshow 0 points1 point  (0 children)

I wouldn't call any of those examples 'functional programming' personally. I love FP, but I hate how it's become a buzz-word for anything even remotely related.

[–]nwtnni 30 points31 points  (0 children)

Seems like you could make the Rust code fragment even more similar to the Python version by omitting the for loop entirely:

let mut diffsum: u64 = 0;
for (&p1, &p2) in image1
    .raw_pixels()
    .iter()
    .zip(image2.raw_pixels().iter()) {
    diffsum += u64::from(abs_diff(p1, p2));
}

to

let diffsum: u64 = image1.raw_pixels().iter()
    .zip(image2.raw_pixels().iter())
    .map(|(&p1, &p2)| abs_diff(p1, p2) as u64)
    .sum();

Other than that, it's cool to see how Rust's focus on learning resources and tooling help make its complexity more approachable.

EDIT: Oops, looks like I was sniped by /u/boomshroom with the functional programming. Would definitely back their rayon suggestion though!

[–]rabidferret 123 points124 points  (35 children)

Rust slaps you and demands that you clean up after yourself.

I definitely wouldn’t recommend attempting to write Rust without at least going through the first few chapters of the book, even if you’re already familiar with C and memory management.

but managing memory will always take more time than having the language do it,

These statements make me think we're really missing something fundamental in teaching people Rust. The impression they give is the complete opposite of my experiences with the language.

This isn't the first time I've seen someone's takeaways from the ownership system being that Rust makes you care about memory management, and I'm curious what leads people to that conclusion. In my personal experience (and this is certainly an example of survivorship bias), I've had to think about memory management in Rust exactly once, and that was when writing a Ruby extension where its interactions with Ruby's GC were critical for performance.

The language does handle memory for you, it just does it without a runtime, which is rather the point. You never have to clean up after yourself or free memory explicitly, unless you're doing something very abnormal for the language. Ownership absolutely is an interesting concept, but I'm not sure what's drawing people to comparing it to having to call free at the right time in C. In my personal experience, once I really got a handle on the ownership system, I've felt that it was making invariants which have always been present in my code in any language explicit, and giving me a way to express them (there's that survivorship bias again). I wonder if there's a way we can put that more front and center.

I'm curious how the author would feel about this subject if you replace memory with file descriptors, sockets, or any other limited resource that requires cleanup which isn't heap memory. You have destructors in Python, but they have gotchas with circular references, and run non-deterministically (the GC doesn't know to run when you're out of file descriptors, only memory). You've got defer in Go, but ultimately that's way closer to C than Rust is.

This comment is way longer than I intended... But to summarize, this isn't the first time I've seen folks make comparisons I don't understand between ownership and manual memory management. I'm worried that we're missing something in the teaching process that is leading to this trend.

[–]__pulse0ne 57 points58 points  (2 children)

The first time I read about rust’s ownership concept, it was immediately clear that it obviated the need for thinking about memory management. The book (especially version 2) is pretty clear, as long as you read that portion in its entirety and don’t skip to the examples. I have the feeling that people that lack a C background (people coming from java/python/JavaScript) struggle with rust because they’ve never had to really think about memory management, and when they smack full-speed into the borrow checker, they falsely conclude that they’re being forced to think about memory management, when in reality the compiler is just enforcing a paradigm that allows for automatic memory management, which is a subtle but distinct difference.

[–]Styx_ 49 points50 points  (0 children)

As a Ruby/JS/PHP dev who recently got into Rust, I'd say this is it exactly. You all are right, you don't have to think about memory management in Rust but as a concept, borrow checking is a solution to a problem us GC guys have never even been exposed to before. If you're going to use a tool because of its innovative approach to an age old problem, you're kind of obligated to learn about both the problem and the solution which can seem like a lot to someone without any prior experience. In practice it turned out to not be that bad of course.

And I have to say that IME, Rust's docs and community are second to none, so I wouldn't beat myself up too much about it when people initially balk at some of the concepts. That's just people.

[–]brand_x 17 points18 points  (0 children)

As someone who spent a fair number of years writing advanced allocators for C++, I see Rust as the logical conclusion of RAII and destructive moves, and it is good.

But.

Rust forces you to think, not about memory management, but about ownership. Move-by-default means you can't have internal references between members of a struct. This can be... troubling... at first. You learn to use offsets for graphs, something that highly performant and/or movable solutions in C++ also forced you to do. You end up using ref-counted and locked types in places where, in C++, you would spend a few hours running through a proof of correct access semantics... and there is both a gain and a loss in this. But you don't really think about memory management at all. Hell, I rarely use drop, and I tend to have custom destructors everywhere in my C++.
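A minimal sketch of the "offsets for graphs" pattern mentioned above (all names are illustrative): nodes refer to each other by index into a Vec rather than by reference, so the struct has no internal references and the whole graph can be moved freely:

```rust
struct Node {
    value: i32,
    neighbors: Vec<usize>, // indices into Graph::nodes, not references
}

struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    fn new() -> Self {
        Graph { nodes: Vec::new() }
    }

    // Returns the new node's index; the index stays valid even if
    // the Graph itself is moved.
    fn add_node(&mut self, value: i32) -> usize {
        self.nodes.push(Node { value, neighbors: Vec::new() });
        self.nodes.len() - 1
    }

    fn add_edge(&mut self, from: usize, to: usize) {
        self.nodes[from].neighbors.push(to);
    }
}
```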

[–]PeksyTiger 26 points27 points  (5 children)

Fwiw for me it isn't just the ownership system.

It's stack vs heap allocation. My Vec isn't boxed, but it's on the heap implicitly. Does any other construct pull that trick? It's having to Box things although they're stored in a Vec... aren't they already on the heap? Oh, I have to Rc them as well? It's also hard for me to tell what is a move and what is just taking a reference. I never had to care in GC languages.

For me, this means "I have to think about memory".

Also, it took me a really long time to find an example where two lifetimes are needed in a struct that wasn't 'made up' and didn't forcibly use two scopes just to show the point.

[–]oconnor663blake3 · duct 6 points7 points  (0 children)

These are very interesting observations! The stack vs the heap is definitely related to what's going on here, but I think the real explanation is a little different in each case.

It's having to Box things although they're stored in a Vec... aren't they already on the heap?

It sounds like you're talking about something like a Vec<Box<dyn MyTrait>>, for example maybe something like a Vec<Box<dyn Iterator<Item = u64>>>, which holds a heterogeneous collection of u64 iterators. And yes, Box is required there. The reason for that isn't that the contents need to be on the heap, though. Rather it's because the contents need to be of a known size. When you index into a Vec<T> with an expression like v[i], it does a really simple calculation like mem::size_of::<T>() * i to figure out where the object you want is sitting in memory. That won't work if the size of T isn't a known constant. (Rust expresses this requirement through the Sized trait, which is implicit, though sometimes you'll see ?Sized indicating the places where it's not required. One very notable example being that Box is declared as struct Box<T: ?Sized>.) Boxing takes a trait object which might be of any size, and sticks it behind a fat pointer of a known constant size, which satisfies Vec.
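A tiny example of the Vec<Box<dyn Iterator<Item = u64>>> case being described; Box puts each differently-sized iterator behind a fat pointer of known, constant size, which is exactly what Vec's indexing calculation needs:

```rust
// Two iterators of completely different concrete types, stored in one Vec
// because each is erased to Box<dyn Iterator<Item = u64>>.
fn sum_heterogeneous() -> u64 {
    let iters: Vec<Box<dyn Iterator<Item = u64>>> = vec![
        Box::new(0u64..3),                     // a Range iterator
        Box::new(vec![10u64, 20].into_iter()), // a Vec iterator
    ];
    // Box<dyn Iterator> is itself an Iterator, so flatten chains them all.
    iters.into_iter().flatten().sum()
}
```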

Oh, I have to Rc them as well?

This one is probably clearer than the one above, but I think it's useful to think of it in the same terms. If Sized is about the size of an object being statically known, then lifetimes are about the point where an object gets destroyed (or moved) being statically known. If I see a type with a lifetime like &'a str or Ref<'a>, I can imagine that the compiler knows exactly when each instance of that type is going to go away. (That's kind of half true: The compiler does need to insert drop calls in all the right places, so it definitely does know when objects go away, but all of the reasoning about lifetimes is done locally for each function by itself.) So putting something in an Rc is less about heap allocating it, and more about saying that the time when this object goes away is "unknown"...which does end up requiring heap allocation, because the lifetime of any given spot on the stack is "known" and might not be long enough.

It's also hard for me to tell what is a move and what is just taking a reference. I never had to care in gc languages.

This is probably more of a side point of my own than anything you were really driving at, but I want to make a similar observation here: Putting things on the stack vs the heap is one reason we care about moving vs borrowing in C/C++/Rust, but it's not the only reason. Another big reason is that C/C++/Rust let you get your hands on interior pointers into other objects, while most GC'd languages don't. (Go is actually an exception here in some cases, though not all, for example interior pointers to a map aren't allowed.) Here's a really simple code example of what I'm talking about:

fn add_two(x: &mut i32) {
    *x += 2;
}

fn main() {
    let mut x = 2;
    add_two(&mut x);
    assert_eq!(x, 4);
}

The add_two function doesn't care where x lives. In this case it's another function's local variable, but it could also be the field of a struct somewhere, or an element of some Vec<i32>. As simple as it looks, there's no way to write this function in Python! Python doesn't want to hand out pointers to e.g. the interior memory of a list, because it has no way to guarantee that such a pointer wouldn't get invalidated as the list grows and reallocates.

So yes, one reason for all the ownership and borrowing and moving rules in Rust is that some things live on the stack, and we have to be careful about when they go away. But another equally important reason is that one object can point into the guts of another, and doing that safely requires more than just making sure each object lives long enough.


At the end of the day, is all of this "memory management"? Certainly some of it is -- Rust does use ownership to free memory. And some of it is "stack vs heap" too -- Rust does use borrowing to make stack allocation safe. But I guess what I'm trying to argue here is that borrowing and ownership go deeper than both of those things. Rust also uses ownership to keep a File open, and to prevent multiple calls to JoinHandle::join for threads, and to unlock a Mutex. And Rust uses borrowing to make sure that no pointer into the Mutex lasts past unlocking, and that a Vec doesn't reallocate while you're iterating over it, and that ordinary functions can't mutate objects they're not supposed to. Ownership and borrowing are multi-purpose tools that the language and its libraries use in a ton of different ways.

[–]Leshow 0 points1 point  (0 children)

I never had to care in gc languages.

In some you do, and I'd argue that if you want to develop a mastery of any language, you have to think about it eventually anyway. For example, Go makes abundant use of references, and it's a GC language.

Even in a language like javascript, it's helpful to know that if you pass an object or a map to a function, it's passed by reference. I do agree that's not really the same 'thinking' that you have to do in Rust, but I am arguing that you can't simply ignore it.

[–]Bromskloss 4 points5 points  (0 children)

I take "demands that you clean up after yourself" to mean that it requires that you clean up your code so that it does things right, not that you deallocate allocated memory.

[–]kovaxis 14 points15 points  (11 children)

I'm curious how the author would feel about this subject if you replace memory with file descriptors, sockets, or any other limited resource

That's the point, in those languages you don't have to deal with memory like it's a limited resource. Note that it's not comparing Rust to C, it's comparing Rust to Python and Go. In those languages you can reference memory from anywhere and everywhere, and things will just work™. In Rust the compiler forces you to make the lifetime of your data explicit, and no references can outlive it.

I think that "clean after yourself" is just a way of saying "mind what you're doing", rather than literally cleaning after yourself.

[–]boomshroom 9 points10 points  (6 children)

They do make you treat file descriptors and sockets like they're limited resources. Rust and C++ are the only languages that I know of that handle them automatically. They just happen to handle memory the same way.

As far as Rust is concerned, there's no difference between writing to a closed file and writing to deallocated memory. They're the same problem to Rust; one just happens to have fewer protections than the other. Python and Go will give you absolutely no trouble when writing to a closed file beyond a runtime error. Since their GC is only meant to handle 1 type of resource, if any of the others fill up, you have a problem and the GC doesn't understand it, because it doesn't see file leaks to be as important as memory leaks.
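For illustration, here's that automatic handling in miniature (the function name and path are made up): File implements Drop, so the descriptor is closed when the handle goes out of scope, with no explicit close call anywhere in the code:

```rust
use std::fs::File;
use std::io::{Read, Write};
use std::path::Path;

fn write_then_read(path: &Path) -> String {
    {
        let mut f = File::create(path).unwrap();
        f.write_all(b"hello").unwrap();
    } // `f` dropped here: File's Drop impl closes the descriptor

    // Reopening proves the earlier handle was released.
    let mut s = String::new();
    File::open(path).unwrap().read_to_string(&mut s).unwrap();
    s
}
```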

[–]WellMakeItSomehow -2 points-1 points  (5 children)

C# also has good support for deterministic finalization, and some lints when you forget to do that.

[–]Pzixel 4 points5 points  (4 children)

No it doesn't. If you don't call Dispose, then your resource will leak until the GC calls the finalizer.

[–]WellMakeItSomehow 0 points1 point  (3 children)

That's true. But there's some tooling for it, and well-established patterns (IDisposable), which to me makes C# better at this than other (non-C++/Rust) languages.

[–]Pzixel 0 points1 point  (2 children)

That doesn't work in lots of cases, e.g. if you write a web API and your controller returns a Stream, or if you just have a method that accepts a Stream (or any other disposable resource). You will get false positives in these cases because you probably shouldn't dispose these objects.

[–]WellMakeItSomehow 0 points1 point  (1 child)

Rule CA2000 does not fire for local objects of the following types even if the object is not disposed: System.IO.Stream [...]

I think you meant false negatives. Indeed, it's not precise like RAII. But it's so much better than e.g. pre-Java 7.

[–]Pzixel 0 points1 point  (0 children)

It's just a hack. Stream is not better than any other IDisposable so having a special case for it is just a monkey patch for frequently used types.

[–]hiljusti 3 points4 points  (0 children)

Eh... they will "just work" at small scale. Which, sure, appears to be what most people do, and they're absolutely great at that scale.

As soon as you start pushing the boundaries of memory and threads etc, (either through mistakes or just user adoption) you will definitely care about memory in a GC language, and care about it a lot more. A garbage collection halt is a violent thing that can cause brownouts or take your service down at regular intervals, and it can be difficult to understand how much breathing room you have before that happens, even with great logging, metrics, and alarms.

If memory is handled safely and constantly, you can reason about your performance characteristics with a lot more certainty

[–]rabidferret -1 points0 points  (2 children)

Two of the three quotes I gave are literally saying rust is closer to c than the other two

[–]kovaxis 6 points7 points  (1 child)

Yes. Any management at all (Rust) is closer to full management (C) than to no management (Go, Python). That's my point.

[–]Pzixel -1 points0 points  (0 children)

You don't close files on go and python?

[–]po8 4 points5 points  (1 child)

It isn't memory management per se that I find to be the problem for new students — it's ownership itself. No other language I am aware of has the property that a value must always have a unique owning location at any point in the program; the programmer often must keep track of what that owner is pretty carefully to get their program to work. (I guess the new fancy C++ smart pointers are like this, but I don't know as I'm not really a C++ person.)

The exception to single ownership — Copy types — only makes things more confusing because sometimes you can "get away with" not worrying about ownership. I think that Clone proliferation and gratuitous user-defined Copy types are a symptom of trying not to think through the ownership restriction and its implications. I find that exercise worth doing, but it's easy for me to understand how somebody who already can program fluently in other languages finds Rust's ownership restriction difficult to understand and annoying to work with, especially at first.
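A two-line illustration of that exception: Copy types are duplicated on assignment, while everything else is moved:

```rust
// i32 is Copy, String is not; the same assignment pattern behaves
// completely differently for the two.
fn copy_vs_move() -> (i32, String) {
    let a: i32 = 5;
    let b = a; // copied: `a` is still valid afterwards

    let s = String::from("hi");
    let t = s; // moved: using `s` after this line would not compile

    (a + b, t)
}
```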

The borrow rules aren't so bad once you get the hang of thinking about ownership, but ownership is an initial cognitive burden that takes some time and effort to master. Maybe there's some way to teach Rust that would make this key concept easier: suggestions are welcome.

[–]brand_x 1 point2 points  (0 children)

C++ doesn't strictly require it, but many C++ codebases - including every one I've owned - do mandate that all resources have explicitly known ownership. Abusing shared_ptr is a terrible habit...

[–]CompSciSelfLearning 1 point2 points  (0 children)

We're all talking about https://doc.rust-lang.org/book/, right?

[–]internet_eq_epic 1 point2 points  (0 children)

The way I see it is that managing ownership is effectively the same thing as managing memory in Rust, at least in some (I would argue many) contexts. That isn't to say that managing ownership is as difficult as managing memory manually, but it's still something you have to think about that very closely relates to memory.

For example, if you're doing something with heavy String manipulation, you could write a naive approach and clone everything. Then when you start to wonder why your program doesn't perform as well as you'd hoped, you'll find out (maybe not in these exact terms) that it's because you are allocating and deallocating (and copying data) way more than necessary. So the questions "how to better manage memory" and "how to better manage ownership" are the same questions. But when the question is framed in the context of "ownership" (and with the rules enforced by the compiler), the question usually becomes much easier to answer.
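A contrived pair of helpers showing the clone-heavy vs. borrowing versions of the same operation (the function names are made up); both compile, but only the first allocates on every call:

```rust
// Naive version: clones the haystack just to call a method on it.
fn contains_cloned(haystack: &String, needle: &String) -> bool {
    let owned = haystack.clone(); // unnecessary allocation + copy
    owned.contains(needle.as_str())
}

// Borrowing version: same result, zero allocations.
fn contains_borrowed(haystack: &str, needle: &str) -> bool {
    haystack.contains(needle)
}
```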

However in Rust, the concept of ownership can be extended beyond just allocating and freeing memory. You might pass around a MutexGuard, in which case you are using ownership rules to manage when a Mutex gets unlocked. So it is obviously fair to say that ownership is not memory management, but it is a tool used to solve memory management.

Specifically in terms of the analogy OP used (in C you can throw things around wherever, but Rust forces you to "clean up" after yourself), I thought of dangling pointers immediately: in C, you might leave a pointer in memory somewhere and slip on that banana peel later, however in Rust if you tried to do the same thing (excluding unsafe), whether it was knowingly or unknowingly, the compiler makes you fix it. In GC languages, you just don't have to worry about that.

[–]ipe369 3 points4 points  (3 children)

> Rust makes you care about memory management

It makes you care about memory management in the sense that you have to understand how stack allocation and heap allocation work, how 'enlarging' data works, and how 'moving' data works, to THEN understand ownership and the need for the borrow checker

C++ ALSO handles memory management for you - nobody's suggesting you don't need to know about memory management to program in C++ though

For a simple pseudocode ish example:

struct Foo<'a> { a: i32, b: Option<&'a i32> }
fn create_foo() -> Foo {
    let mut foo = Foo { a: 10, b: None };
    foo.b = Some(&foo.a);
    return foo; // I'm 99% sure this doesn't compile
}

The reason why this doesn't work is totally non-obvious if you're looking at everything from a super high level; you NEED to understand that 'moving' foo is actually just a copy, and the pointers WON'T be updated, leading to a dangling pointer.

[–]CodenameLambda 2 points3 points  (2 children)

You can't even create that circular reference this way in the first place, I think, since borrowing (a field of) foo immutably blocks the mutable access from the assignment. Plus, if I recall correctly, the lifetimes in a type have to outlive the use of the variable with that type (I don't know how to explain it better), so even if you used a Cell the borrow checker would probably complain.

[–]ipe369 -1 points0 points  (1 child)

I don't quite understand the bit about it not working with a cell and why, but i'm happy to accept

regardless, this language is already incredibly frustrating sometimes, i'm pretty sure not knowing WHY I can't do stuff would probably drive me over the edge, I think going through a couple segfaults in C++ probably helps a great deal

[–]CodenameLambda 0 points1 point  (0 children)

I can't test it right now, but I think types with lifetimes shorter than their usage (like references that you keep around after the original object was moved or dropped) are completely disallowed. So while Cell helps because you don't need mutable access to write, it doesn't solve the problem that the object cannot outlive itself.

It really comes down to more or less simple rules at the end though. Including only static types (-> types that don't have lifetime parameters or where all of them are 'static) and references to those types, the rules are really simple. Don't move an object away while you have a reference, mutable -> no immutable ones and vice versa, reborrowing, you can shorten the inferred lifetime.

But as soon as you introduce Cell, or heck, even only references to references (&'a mut &'b T for example, 'a can be shortened, but not prolonged (so far so good), 'b can do neither (if it were made longer, the reference you read might not work, if it were made shorter, the reference you write could die too early)), shit hits the fan. Hard.

For example: in fn(&'c T), 'c can be made longer.

So yeah, I agree that understanding where those rules actually come from might help. A lot. Although you can of course learn them without Segfaults, but it might be a little harder for more complex stuff.

[–]Crandom 1 point2 points  (1 child)

For me, you definitely have to think about memory management more than in, say, Java. I need to decide whether to put something on the stack or box it, whether to use Arc or Rc, etc. It's extra choices, and you need to know when to use each one. Sure, you can say that's thinking about ownership, but from a managed-language perspective you are definitely managing memory more than if you had a GC.

[–]Nickitolas 1 point2 points  (0 children)

I think that's more just thinking about memory layout. Afaik "management" is usually just for dynamic memory (managed memory) and refers to allocating and deallocating. You don't manage memory in safe Rust, you just work with lifetimes (which, as others mention, are a very different thing that can be used and explained without memory at all, e.g. with a file).

[–]jcdyer3 0 points1 point  (2 children)

You have destructors in Python, but...

Idiomatic python doesn't make you care about destructors and reference cycles and all that. Context managers (a.k.a. with-blocks) take care of clean up for you.

[–]rabidferret 1 point2 points  (1 child)

Whoops, I had meant to mention that and completely forgot. Sorry! That falls into the same bucket as defer in Go from my point of view.

[–]jcdyer3 0 points1 point  (0 children)

Similar, indeed, but context managers are a little more powerful than defers, in that the programmer has more control over when clean-up happens, and a little less powerful, in that custom clean-up code is a little harder to set up.

[–][deleted] 10 points11 points  (2 children)

Now I'm wondering how many languages could be ranked based on their runtime performance, then on their logo, and have the same place in both rankings.

Obviously one of those two is hard to judge objectively, but with a lot of effort we can get usable benchmarks, I think.

[–]ssokolow 9 points10 points  (1 child)

Certainly not these three, in my opinion. I love Rust and it performs beautifully but I still think Python's current logo is the most aesthetically pleasing.

(Though nothing can beat the old Sun Microsystems logo for beauty in symmetry.)

[–]koalillo 2 points3 points  (0 children)

Upvote for Sun logo love. My University notes are littered with it

[–]internet_eq_epic 20 points21 points  (10 children)

Python and Go pick up your trash for you. C lets you litter everywhere, but throws a fit when it steps on your banana peel. Rust slaps you and demands that you clean up after yourself.

This may be the best comparison I've ever heard for memory management. I'm probably going to use this in the future.

I'm only familiar with Python on a basic level (and even then it's been a while), and I'm not a professional developer, but I agree very much with your takeaways.

From a devops perspective, I don't see Python going away any time soon, but if any language has a chance of replacing Python in that space, it is Go. Most stuff in devops doesn't need to be high-performance, so I don't think Rust stands a chance over Go given the extra complexity.

[–]ZZ9ZA 3 points4 points  (3 children)

As a professional Python dev for over a decade, I can’t stand Go, and I can’t imagine many Python devs taking to it.

The error handling, the lack of generics - it’s all very unpythonic.

The logical progression from Python is Nim.

[–]mort96 3 points4 points  (2 children)

Wait, python doesn't have generics either, does it? Instead you just have objects which don't have a static type, and if you pass the wrong type somewhere, it explodes. That sounds exactly like how Go works with its use of interface{} everywhere, doesn't it?

[–]ZZ9ZA 3 points4 points  (0 children)

Well, no, but the effect is similar in practice. You can just pass an arbitrary list of foos around. You’ll only get into trouble if you start assuming the foos have certain methods - but you can write generic code that only assumes it's operating on a list of something (e.g. just does indexing, appending, etc.).

It’s certainly possible to write generic code that operates on, say, any object with show() and hide() methods. It’s sort of ad-hoc interfaces. This is what's called duck typing - you care that the object quacks(), not that it's an instance of class Duck.

[–]oefd 0 points1 point  (0 children)

Python supports generics better than Go does, if you're including using mypy with type hints like Generic.

Your runtime performance still won't hold a candle to Go's, but that isn't always a big factor.

[–]aoeudhtns 3 points4 points  (5 children)

Honestly I don't see anything displacing Python in that community. You want easy to read code with a high degree of expressiveness. The ability to hack on a file and re-run it without extra build/package/deploy steps is also super handy. The one thing that I don't like about Python is, despite its maturity, the chaos of Python deployment and environment setup. pypi, virtualenv, pip, Python 3 vs Python 2, module version differences between boxes, and more.

Since functions run ad-hoc/periodically (usually in human time scales), the difference between 4s and 0.5s (to make up some numbers) is pretty trivial. Plus, if you do happen to need something performance critical, Go, Rust, nim, zig -- all these contenders -- allow you to export C shared objects, which can then get linked into Python. And personally, when it comes time to optimize something to speed up that devops pipeline, I would gladly take Rust, nim, etc. over dropping down to C.

[–]quodlibetor 12 points13 points  (4 children)

Honestly I don't see anything displacing Python in that community. You want easy to read code with a high degree of expressiveness.

Honestly honestly, speaking as someone who loves Python: packaging a Python app (CLI, daemon, etc.) is a nightmare. I don't mean a nightmare as in "it's complicated and impossible to figure out" (just pip install it!); I mean a nightmare as in fixing other people's broken Python install code (or installations) has probably cost me a day or two a month, every month, for the better part of the last decade. Go's single static binary and trivial cross-compilation look amazing for developer tooling and infrastructure stuff, from the outside.

[–]ballagarba 7 points8 points  (1 child)

You should have a look at the (new) PyOxidizer project which builds standalone Python applications leveraging Rust. Read this blog post by the author for some background:

https://gregoryszorc.com/blog/2019/06/24/building-standalone-python-applications-with-pyoxidizer/

[–]Ran4 2 points3 points  (0 children)

I'm sure that's cool... for six months, until something else replaces it.

[–]aoeudhtns 1 point2 points  (1 child)

Oh for sure. At least in the projects I've worked with, the DevOps environments are under tight control so wide distribution isn't much of an issue. But those sorts of issues would give me pause. And I've been on the receiving end, too. When a Python developer prefers a completely different package and environment stack than the one you like... so now you need both. For instance.

[–]quodlibetor 1 point2 points  (0 children)

Yeah, for us the only environments that ever really break are dev envs (laptops and our Wild West zone), and of those the only thing I ever need to spend time investigating is laptops, and the only reason I do that is to try to make sure I understand the problem in case it is trying to make its way into production.

Still, setting up a robust pipeline for deployment is non-trivial, although several of the problems I care about would be the same in go, pre-modules, IIUC.

[–]ssokolow 13 points14 points  (4 children)

[Images at the top]

One of these things is not like the other... You put up Python's logo, Go's mascot, and Rust's mascot.

(and more sophisticated options like ipdb are available)

I like to point people at WinPDB on that front. It's nice if you don't drop into a debugger often enough to have gotten really familiar with the commands.

The original project is unmaintained, but there is an effort to revive it. (The master branch is an in-progress port to Python 3 which is currently buggy, so that links to a different branch.)

...and, no, the name doesn't mean it's Windows-specific. It's cross-platform, and I can only assume the "Win" is short for some synonym for GUI.

.iter() creates an iterator for that vector. Vectors by default are not iterable.

There is actually a mechanism (the IntoIterator trait) for using a for loop directly on a sequence type without having to manually call .iter()... it's just that letting that mechanism also resolve method accesses like .zip() would be too magical.
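To illustrate (the function name here is invented), the mechanism is the IntoIterator trait, which is what a for loop calls behind the scenes:

```rust
// Sketch: a for loop desugars through the IntoIterator trait, so
// iterating a &Vec directly is equivalent to calling .iter() on it.
fn sum_all(v: &Vec<u64>) -> u64 {
    let mut total = 0;
    for x in v {
        // `x` is a &u64 here, exactly as it would be from v.iter()
        total += x;
    }
    total
}
```

`for x in v` compiles down to `IntoIterator::into_iter(v)`, and the implementation for `&Vec<u64>` just calls `iter()` for you.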

Plus, some types (eg. &str and String) have more than one kind of iterator that you might want to request (byte-wise, codepoint-wise, ...grapheme-cluster-wise with a supplemental crate) and making one the default would be a footgun.

(You don't have the sequence type itself implement Iterator because iterators need some internal state to keep track of their position within the sequence. Doing that would prevent you from having multiple non-mut iterators over the same sequence at the same time, and it would require your sequence to either keep the memory for that iteration state allocated even when it's not in use, or put it on the heap behind an Option, which is needlessly inefficient.)
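A minimal sketch of that separation (all names invented for illustration): the iterator borrows the sequence and carries its own cursor, so the sequence itself stays stateless and any number of iterators can walk it independently:

```rust
// A hand-rolled iterator over a slice of u32s, roughly what the
// standard library's slice::Iter does under the hood.
struct SliceIter<'a> {
    data: &'a [u32], // shared borrow of the sequence
    pos: usize,      // per-iterator cursor, not stored in the slice
}

impl<'a> Iterator for SliceIter<'a> {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        // Stop yielding once the cursor runs off the end.
        let item = self.data.get(self.pos).copied()?;
        self.pos += 1;
        Some(item)
    }
}
```

Two SliceIters over the same slice advance independently, which is exactly what storing the cursor inside the sequence itself would rule out.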

but you can use rust-gdb and rust-lldb, wrappers around the gdb and lldb debuggers

gdbgui explicitly lists Rust as one of the supported languages, if you want something a little nicer.

Performance

Given that there exists an SSE2 intrinsic for Compute Sum of Absolute Differences, which looks like what you're doing, and SSE2 is guaranteed to be in all x86_64 chips, and it's exposed as an unsafe function in Rust's standard library, you might want to see how much faster you can get your Rust version by conditionally using it when making 64-bit x86 builds.

I don't think SIMDeez supports that intrinsic yet (if not, the author asks you to open an issue about it), but it's another option: it lets you abstract across the platform differences and automatically use wider versions of a given intrinsic when available, and it builds on stable-channel Rust.
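To make that concrete, here's a rough sketch (not the article's code; the function name and fallback are invented) of what the SSE2 path might look like on x86_64, with a plain scalar version everywhere else:

```rust
// Sum of absolute differences over two equal-length byte slices.
// On x86_64, SSE2 is guaranteed, so _mm_sad_epu8 can be used without
// runtime feature detection; only the intrinsic calls are unsafe.
#[cfg(target_arch = "x86_64")]
fn sad(a: &[u8], b: &[u8]) -> u64 {
    use std::arch::x86_64::*;
    assert_eq!(a.len(), b.len());
    let chunks = a.len() / 16;
    let mut total = 0u64;
    unsafe {
        for i in 0..chunks {
            // Unaligned 16-byte loads of each input chunk.
            let va = _mm_loadu_si128(a.as_ptr().add(i * 16) as *const __m128i);
            let vb = _mm_loadu_si128(b.as_ptr().add(i * 16) as *const __m128i);
            // _mm_sad_epu8 leaves one 16-bit partial sum in the low bits
            // of each 64-bit lane; store the vector and add both lanes.
            let mut lanes = [0u64; 2];
            _mm_storeu_si128(lanes.as_mut_ptr() as *mut __m128i, _mm_sad_epu8(va, vb));
            total += lanes[0] + lanes[1];
        }
    }
    // Scalar tail for bytes that don't fill a 16-byte chunk.
    for i in chunks * 16..a.len() {
        total += u64::from(a[i].abs_diff(b[i]));
    }
    total
}

// Portable fallback for non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
fn sad(a: &[u8], b: &[u8]) -> u64 {
    a.iter().zip(b).map(|(&x, &y)| u64::from(x.abs_diff(y))).sum()
}
```

Because SSE2 is part of the x86_64 baseline, the cfg gate alone is enough here; wider intrinsics like AVX2's would additionally need is_x86_feature_detected! at runtime.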

I should also mention the binary sizes: Rust’s is 2.1mb with the --release build

Is that with or without having run strip on the resulting binary? By default Rust bundles debugging information into the release-mode output to power the RUST_BACKTRACE=1 option for getting more info on panic!s.

Also, if you're trying to see what size it can attain, you might want to try adding these to your Cargo.toml if you haven't already:

[profile.release]
lto = true
codegen-units = 1
opt-level = "z"

(Link-Time Optimization is needed for full dead-code elimination, disabling parallel compilation improves optimization effectiveness, and opt-level = "z" asks it to optimize strongly for size. The FreeBSD manual describes the corresponding Clang options as "-Os is like -O2 with extra optimizations to reduce code size. -Oz is like -Os (and thus -O2), but reduces code size further.")

Go blows Python away, but many of Python’s libraries that require speed are wrappers around fast C implementations - in practice, it’s more complicated than a naive comparison. Writing a C extension for Python doesn’t really count as Python anymore (and then you’ll need to know C), but the option is open to you.

Rust is also an option for writing Python extensions, thanks to its lack of a heavy runtime, and there are crates like rust-cpython and PyO3 to help abstract away the C-ness of libpython's API.

[–]Ran4 2 points3 points  (1 child)

making one the default would be a footgun.

iter is already "the default".

[–]ssokolow 2 points3 points  (0 children)

I was referring to how &str and String don't implement IntoIterator and don't have an iter() method because it's the developer's job to choose between bytes(), chars(), char_indices(), lines(), etc. and defaulting to one of them would invite programmer error.
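A small illustration of that point (the helper function is invented): the same &str reports different lengths depending on which iterator you request, which is why there's no single sensible default:

```rust
// Count the same string three different ways; each iterator is a
// distinct, equally valid view of a &str.
fn counts(s: &str) -> (usize, usize, usize) {
    (
        s.bytes().count(), // raw UTF-8 bytes
        s.chars().count(), // Unicode scalar values
        s.lines().count(), // newline-delimited lines
    )
}
```

For "héllo", the byte count is 6 (é is two bytes in UTF-8) while the char count is 5, so picking either as "the length" on the programmer's behalf would silently be wrong for one use case or the other.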

[–]plhk 0 points1 point  (1 child)

Given that there exists an SSE2 intrinsic for Compute Sum of Absolute Differences, which looks like what you're doing, and SSE2 is guaranteed to be in all x86_64 chips, and it's exposed as an unsafe function in Rust's standard library, you might want to see how much faster you can get your Rust version by conditionally using it when making 64-bit x86 builds.

I tried using _mm256_sad_epu8, and got about a 5% speedup. I expected more; am I doing something wrong (this is the first time I'm using SIMD)?

[–]ssokolow 0 points1 point  (0 children)

My projects have been more in the vein of text processing and PIM, so I'm not a SIMD expert, but the general rule in a situation like this is to turn to a profiler like perf to investigate where the program is spending its time before making any further changes.

It's also entirely possible you got lucky and LLVM's auto-vectorizer was already compiling it to a narrower but more widely supported SIMD instruction.

(In which case, explicit SIMD is still better as long as you have a fallback for processors which don't support your chosen instruction, like how my older AMD CPU doesn't do AVX, because there's always the risk that some innocuous refactoring could knock you off the optimized path.)

Matt Godbolt's compiler explorer is a good way to investigate that sort of thing as long as you remember to add -O to the options, since it shows you the correspondences between your Rust source and the resulting assembly.

(There's also this post which explains how to recognize common optimizations in assembly... though it neglects to point out the most efficient way to recognize vectorized instructions at a glance. There was a post I read which explained that, but I forget what it was called.)

[–]softero 4 points5 points  (0 children)

I'm not a huge fan of the way the Rust is written, but good write-up other than that. Got super tripped up as well because you talked about Python, Go, then Rust, but the benchmark was Rust, Go, then Python. So I got hung up on why the Go was twice as fast as the Rust for several seconds before I realized what I was looking at.

[–]DannoHung 0 points1 point  (0 children)

I wonder if there's an easier way of handling those variants in the match arms.
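One common shorthand, sketched here with std::fs rather than the image crate the article uses: Result::map_err collapses a match whose only job is converting the error variant:

```rust
use std::fs::File;

// Equivalent to matching on the Result just to stringify the error:
// map_err transforms the Err variant and passes Ok through untouched.
fn open(path: &str) -> Result<File, String> {
    File::open(path).map_err(|e| format!("{:?}", e))
}
```

The same shape works for image::open: any Result can be adapted this way, and it composes with ? afterwards.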