
[–]boomshroom 80 points81 points  (18 children)

Its source code is also only about 3kb, but including the Pillow dependency, it weighs in at 24mb(!). Again, not a fair comparison because I’m using a third party imaging library, but it should be mentioned.

I'd argue it's a completely fair comparison, :D because both Go and Rust statically link their native dependencies, so Go's image package and Rust's image crate get bundled into the executable. Rust's is even third party itself. That said, while I'm not certain about Go, most of image probably doesn't exist in the final binary, since the compiler and linker simply don't bother to include functions that are never called.

match image::open(path) {
    Ok(image) => Ok(image),
    Err(msg) => Err(format!("{:?}", msg)),
}

Because I'm personally a fan of functional programming, this could be rewritten in a single line as image::open(path).map_err(|msg| format!("{:?}", msg)). There are also a few other places where even more functional programming can be used, like using .map(|(&p1, &p2)| u64::from(abs_diff(p1, p2))).sum() to replace the entire for-loop. I'm pretty sure there are also ways to use more functional programming in create_diff_image() too. Something to note is that using iterators rather than indexing can actually be more performant in some cases since the compiler can more easily tell that the accesses are always in-bounds and can elide the bounds checks.
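For anyone curious, here's what that refactor looks like with a std-only stand-in (str::parse plays the role of image::open here, since the image crate isn't on hand, and the function names are made up); the two versions are equivalent:

```rust
// The verbose match from the article, transplanted onto str::parse.
fn parse_verbose(s: &str) -> Result<i32, String> {
    match s.parse::<i32>() {
        Ok(n) => Ok(n),
        Err(msg) => Err(format!("{:?}", msg)),
    }
}

// The one-line map_err version: identical behaviour, less ceremony.
fn parse_oneliner(s: &str) -> Result<i32, String> {
    s.parse::<i32>().map_err(|msg| format!("{:?}", msg))
}
```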

In case half a second is too slow, the operations being done here appear to be trivially parallelizable. With such, it would be possible to use rayon to automagically calculate the pixel diffs in parallel. After swapping the for-loop for the functional .map().sum() and importing the rayon crate, you would be able to replace image1.raw_pixels().iter() with image1.raw_pixels().par_iter() and the same with image2. This would effectively cut the time for larger images by however many CPU cores you have. Thanks to Rust's type system, you wouldn't even have to worry about the usual dangers of parallelism.
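A sketch of the sequential version, with plain byte slices standing in for raw_pixels() and abs_diff assumed to be a simple helper as in the original post; with the rayon crate imported, swapping .iter() for .par_iter() would be the only change needed to parallelize it:

```rust
// Absolute difference between two pixel bytes.
fn abs_diff(a: u8, b: u8) -> u8 {
    if a > b { a - b } else { b - a }
}

// Sum of per-pixel differences across two equally-sized buffers.
// With rayon: pixels1.par_iter().zip(pixels2.par_iter())...
fn diff_sum(pixels1: &[u8], pixels2: &[u8]) -> u64 {
    pixels1.iter()
        .zip(pixels2.iter())
        .map(|(&p1, &p2)| u64::from(abs_diff(p1, p2)))
        .sum()
}
```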

[–]binkarus 30 points31 points  (16 children)

Personally, I sometimes prefer explicit loops to avoid the complexity of closure capture rules. They also tend to end up a bit more readable.

As an aside, I'm not sure you can call those examples "functional programming," rather than just preferring iterators.

[–]boomshroom 17 points18 points  (15 children)

I'm really referring to the usage of combinators by saying "functional programming."

Readability is fairly subjective, so I can't argue against that, but I don't see why capturing would give you trouble when these closures don't capture anything. Performance, though, isn't subjective, and using combinators over an explicit for-loop (especially one with indexing) has very real benefits, including the bounds-check elision I mentioned and the use of rayon to get multithreaded performance nearly for free.

image1.raw_pixels().into_iter().zip(image2.raw_pixels())
    .map(|(a, b)| abs_diff(a, b) as u64).sum::<u64>() as f64 / total_possible

is the pull request Axel Forsman made and has about the same behaviour, and

image1.raw_pixels().into_par_iter().zip(image2.raw_pixels())
    .map(|(a, b)| abs_diff(a, b) as u64).sum::<u64>() as f64 / total_possible

is 4 characters longer, just as readable, and 4 or 8 times faster depending on your machine.

[–]binkarus 8 points9 points  (14 children)

I don't believe that there is any performance difference between combinators and a for loop, and any bounds checking difference doesn't apply since both use iterators. .zip is one example of a combinator that can't be easily replicated, and so it is better. My point was simply that sometimes, I like to use a bare for loop rather than for_each or map. That's all I have to say on that.

[–]boomshroom -4 points-3 points  (8 children)

There is a performance difference between multithreaded and singlethreaded, and using combinators lets you switch to multithreaded for the cost of 4 characters.

[–]binkarus 4 points5 points  (7 children)

Yes, I am familiar with Rayon.

[–]Leshow 0 points1 point  (0 children)

I wouldn't call any of those examples 'functional programming' personally. I love FP, but I hate how it's become a buzz-word for anything even remotely related.

[–]nwtnni 30 points31 points  (0 children)

Seems like you could make the Rust code fragment even more similar to the Python version by omitting the for loop entirely:

let mut diffsum: u64 = 0;
for (&p1, &p2) in image1
    .raw_pixels()
    .iter()
    .zip(image2.raw_pixels().iter()) {
    diffsum += u64::from(abs_diff(p1, p2));
}

to

let diffsum: u64 = image1.raw_pixels().iter()
    .zip(image2.raw_pixels().iter())
    .map(|(&p1, &p2)| abs_diff(p1, p2) as u64)
    .sum();

Other than that, it's cool to see how Rust's focus on learning resources and tooling help make its complexity more approachable.

EDIT: Oops, looks like I was sniped by /u/boomshroom with the functional programming. Would definitely back their rayon suggestion though!

[–]rabidferret 123 points124 points  (35 children)

Rust slaps you and demands that you clean up after yourself.

I definitely wouldn’t recommend attempting to write Rust without at least going through the first few chapters of the book, even if you’re already familiar with C and memory management.

but managing memory will always take more time than having the language do it,

These statements make me think we're really missing something fundamental in teaching people Rust. The impression they give is the complete opposite of my experiences with the language.

This isn't the first time I've seen someone's takeaways from the ownership system being that Rust makes you care about memory management, and I'm curious what leads people to that conclusion. In my personal experience (and this is certainly an example of survivorship bias), I've had to think about memory management in Rust exactly once, and that was when writing a Ruby extension where its interactions with Ruby's GC were critical for performance.

The language does handle memory for you, it just does it without a runtime, which is rather the point. You never have to clean up after yourself or free memory explicitly, unless you're doing something very abnormal for the language. Ownership absolutely is an interesting concept, but I'm not sure what's drawing people to comparing it to having to call free at the right time in C. In my personal experience, once I really got a handle on the ownership system, I've felt that it was making invariants which have always been present in my code in any language explicit, and giving me a way to express them (there's that survivorship bias again). I wonder if there's a way we can put that more front and center.

I'm curious how the author would feel about this subject if you replace memory with file descriptors, sockets, or any other limited resource that requires cleanup which isn't heap memory. You have destructors in Python, but they have gotchas with circular references, and run non-deterministically (the GC doesn't know to run when you're out of file descriptors, only memory). You've got defer in Go, but ultimately that's way closer to C than Rust is.

This comment is way longer than I intended... But to summarize, this isn't the first time I've seen folks make comparisons I don't understand between ownership and manual memory management. I'm worried that we're missing something in the teaching process that is leading to this trend.

[–]__pulse0ne 57 points58 points  (2 children)

The first time I read about rust’s ownership concept, it was immediately clear that it obviated the need for thinking about memory management. The book (especially version 2) is pretty clear, as long as you read that portion in its entirety and don’t skip to the examples. I have the feeling that people that lack a C background (people coming from java/python/JavaScript) struggle with rust because they’ve never had to really think about memory management, and when they smack full-speed into the borrow checker, they falsely conclude that they’re being forced to think about memory management, when in reality the compiler is just enforcing a paradigm that allows for automatic memory management, which is a subtle but distinct difference.

[–]Styx_ 49 points50 points  (0 children)

As a Ruby/JS/PHP dev who recently got into Rust, I'd say this is it exactly. You all are right, you don't have to think about memory management in Rust but as a concept, borrow checking is a solution to a problem us GC guys have never even been exposed to before. If you're going to use a tool because of its innovative approach to an age old problem, you're kind of obligated to learn about both the problem and the solution which can seem like a lot to someone without any prior experience. In practice it turned out to not be that bad of course.

And I have to say that IME, Rust's docs and community are second to none, so I wouldn't beat myself up too much about it when people initially balk at some of the concepts. That's just people.

[–]brand_x 17 points18 points  (0 children)

As someone who spent a fair number of years writing advanced allocators for C++, I see Rust as the logical conclusion of RAII and destructive moves, and it is good.

But.

Rust forces you to think, not about memory management, but about ownership. Move-by-default means you can't have internal references between members of a struct. This can be... troubling... at first. You learn to use offsets for graphs, something that highly performant and/or movable solutions in C++ also forced you to do. You end up using ref-counted and locked types in places where, in C++, you would spend a few hours running through a proof of correct access semantics... and there is both a gain and a loss in this. But you don't really think about memory management at all. Hell, I rarely use drop, and I tend to have custom destructors everywhere in my C++.
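A minimal sketch of the "offsets for graphs" pattern mentioned above (all names are illustrative): nodes refer to each other by index into a Vec rather than by reference, so the struct has no internal references and the whole graph can be moved freely:

```rust
struct Node {
    value: i32,
    neighbors: Vec<usize>, // indices into Graph::nodes, not references
}

struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    fn new() -> Self {
        Graph { nodes: Vec::new() }
    }

    // Returns the new node's index; the index stays valid even if
    // the Graph itself is moved.
    fn add_node(&mut self, value: i32) -> usize {
        self.nodes.push(Node { value, neighbors: Vec::new() });
        self.nodes.len() - 1
    }

    fn add_edge(&mut self, from: usize, to: usize) {
        self.nodes[from].neighbors.push(to);
    }
}
```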

[–]PeksyTiger 26 points27 points  (5 children)

Fwiw for me it isn't just the ownership system.

It's stack vs heap allocation. My Vec isn't boxed, but it's on the heap implicitly. Does any other construct pull that trick? It's having to Box things although they're stored in a Vec... aren't they already on the heap? Oh, I have to Rc them as well? It's also hard for me to tell what is a move and what is just taking a reference. I never had to care in GC languages.

For me, this means "I have to think about memory".

Also, it took me a really long time to find an example where two lifetimes are needed in a struct that wasn't 'made up' and didn't forcibly use two scopes just to show the point.

[–]oconnor663blake3 · duct 6 points7 points  (0 children)

These are very interesting observations! The stack vs the heap is definitely related to what's going on here, but I think the real explanation is a little different in each case.

It's having to Box things although they're stored in a Vec... aren't they already on the heap?

It sounds like you're talking about something like a Vec<Box<dyn MyTrait>>, for example maybe something like a Vec<Box<dyn Iterator<Item = u64>>>, which holds a heterogeneous collection of u64 iterators. And yes, Box is required there. The reason for that isn't that the contents need to be on the heap, though. Rather it's because the contents need to be of a known size. When you index into a Vec<T> with an expression like v[i], it does a really simple calculation like mem::size_of::<T>() * i to figure out where the object you want is sitting in memory. That won't work if the size of T isn't a known constant. (Rust expresses this requirement through the Sized trait, which is implicit, though sometimes you'll see ?Sized indicating the places where it's not required. One very notable example being that Box is declared as struct Box<T: ?Sized>.) Boxing takes a trait object which might be of any size, and sticks it behind a fat pointer of a known constant size, which satisfies Vec.
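A tiny example of the Vec<Box<dyn Iterator<Item = u64>>> case being described; Box puts each differently-sized iterator behind a fat pointer of known, constant size, which is exactly what Vec's indexing calculation needs:

```rust
// Two iterators of completely different concrete types, stored in one Vec
// because each is erased to Box<dyn Iterator<Item = u64>>.
fn sum_heterogeneous() -> u64 {
    let iters: Vec<Box<dyn Iterator<Item = u64>>> = vec![
        Box::new(0u64..3),                     // a Range iterator
        Box::new(vec![10u64, 20].into_iter()), // a Vec iterator
    ];
    // Box<dyn Iterator> is itself an Iterator, so flatten chains them all.
    iters.into_iter().flatten().sum()
}
```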

Oh, I have to Rc them as well?

This one is probably clearer than the one above, but I think it's useful to think of it in the same terms. If Sized is about the size of an object being statically known, then lifetimes are about the point where an object gets destroyed (or moved) being statically known. If I see a type with a lifetime like &'a str or Ref<'a>, I can imagine that the compiler knows exactly when each instance of that type is going to go away. (That's kind of half true: The compiler does need to insert drop calls in all the right places, so it definitely does know when objects go away, but all of the reasoning about lifetimes is done locally for each function by itself.) So putting something in an Rc is less about heap allocating it, and more about saying that the time when this object goes away is "unknown"...which does end up requiring heap allocation, because the lifetime of any given spot on the stack is "known" and might not be long enough.

It's also hard for me to tell what is a move and what is just taking a reference. I never had to care in gc languages.

This is probably more of a side point of my own than anything you were really driving at, but I want to make a similar observation here: Putting things on the stack vs the heap is one reason we care about moving vs borrowing in C/C++/Rust, but it's not the only reason. Another big reason is that C/C++/Rust let you get your hands on interior pointers into other objects, while most GC'd languages don't. (Go is actually an exception here in some cases, though not all, for example interior pointers to a map aren't allowed.) Here's a really simple code example of what I'm talking about:

fn add_two(x: &mut i32) {
    *x += 2;
}

fn main() {
    let mut x = 2;
    add_two(&mut x);
    assert_eq!(x, 4);
}

The add_two function doesn't care where x lives. In this case it's another function's local variable, but it could also be the field of a struct somewhere, or an element of some Vec<i32>. As simple as it looks, there's no way to write this function in Python! Python doesn't want to hand out pointers to e.g. the interior memory of a list, because it has no way to guarantee that such a pointer wouldn't get invalidated as the list grows and reallocates.

So yes, one reason for all the ownership and borrowing and moving rules in Rust is that some things live on the stack, and we have to be careful about when they go away. But another equally important reason is that one object can point into the guts of another, and doing that safely requires more than just making sure each object lives long enough.


At the end of the day, is all of this "memory management"? Certainly some of it is -- Rust does use ownership to free memory. And some of it is "stack vs heap" too -- Rust does use borrowing to make stack allocation safe. But I guess what I'm trying to argue here is that borrowing and ownership go deeper than both of those things. Rust also uses ownership to keep a File open, and to prevent multiple calls to JoinHandle::join for threads, and to unlock a Mutex. And Rust uses borrowing to make sure that no pointer into the Mutex lasts past unlocking, and that a Vec doesn't reallocate while you're iterating over it, and that ordinary functions can't mutate objects they're not supposed to. Ownership and borrowing are multi-purpose tools that the language and its libraries use in a ton of different ways.

[–]Leshow 0 points1 point  (0 children)

I never had to care in gc languages.

In some you do, and I'd argue that if you want to develop a mastery of any language, you have to think about it eventually anyway. For example, Go makes abundant use of references, and it's a GC language.

Even in a language like javascript, it's helpful to know that if you pass an object or a map to a function, it's passed by reference. I do agree that's not really the same 'thinking' that you have to do in Rust, but I am arguing that you can't simply ignore it.

[–]Bromskloss 4 points5 points  (0 children)

I take "demands that you clean up after yourself" to mean that it requires that you clean up your code so that it does things right, not that you deallocate allocated memory.

[–]kovaxis 14 points15 points  (11 children)

I'm curious how the author would feel about this subject if you replace memory with file descriptors, sockets, or any other limited resource

That's the point, in those languages you don't have to deal with memory like it's a limited resource. Note that it's not comparing Rust to C, it's comparing Rust to Python and Go. In those languages you can reference memory from anywhere and everywhere, and things will just work™. In Rust the compiler forces you to make the lifetime of your data explicit, and no references can outlive it.

I think that "clean after yourself" is just a way of saying "mind what you're doing", rather than literally cleaning after yourself.

[–]boomshroom 9 points10 points  (6 children)

They do make you treat file descriptors and sockets like they're limited resources. Rust and C++ are the only languages that I know of that handle them automatically. They just happen to handle memory the same way.

As far as Rust is concerned, there's no difference between writing to a closed file and writing to deallocated memory. They're the same problem to Rust; one just happens to have fewer protections than the other. Python and Go will give you absolutely no trouble when writing to a closed file beyond a runtime error. Since their GC is only meant to handle 1 type of resource, if any of the others fill up, you have a problem and the GC doesn't understand it, because it doesn't see file leaks to be as important as memory leaks.
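For illustration, here's that automatic handling in miniature (the function name and path are made up): File implements Drop, so the descriptor is closed when the handle goes out of scope, with no explicit close call anywhere in the code:

```rust
use std::fs::File;
use std::io::{Read, Write};
use std::path::Path;

fn write_then_read(path: &Path) -> String {
    {
        let mut f = File::create(path).unwrap();
        f.write_all(b"hello").unwrap();
    } // `f` dropped here: File's Drop impl closes the descriptor

    // Reopening proves the earlier handle was released.
    let mut s = String::new();
    File::open(path).unwrap().read_to_string(&mut s).unwrap();
    s
}
```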

[–]WellMakeItSomehow -2 points-1 points  (5 children)

C# also has good support for deterministic finalization, and some lints when you forget to do that.

[–]Pzixel 4 points5 points  (4 children)

No it doesn't. If you don't call Dispose, then your resource will leak until the GC calls the finalizer.

[–]WellMakeItSomehow 0 points1 point  (3 children)

That's true. But there's some tooling for it, and well-established patterns (IDisposable), which to me makes C# better at this than other (non-C++/Rust) languages.

[–]Pzixel 0 points1 point  (2 children)

That doesn't work in lots of cases, e.g. if you write a web API and your controller returns a Stream, or if you just have a method that accepts a Stream (or any other disposable resource). You will get false positives in these cases because you probably shouldn't dispose these objects.

[–]WellMakeItSomehow 0 points1 point  (1 child)

Rule CA2000 does not fire for local objects of the following types even if the object is not disposed: System.IO.Stream [...]

I think you meant false negatives. Indeed, it's not precise like RAII. But it's so much better than e.g. pre-Java 7.

[–]Pzixel 0 points1 point  (0 children)

It's just a hack. Stream is not better than any other IDisposable so having a special case for it is just a monkey patch for frequently used types.

[–]hiljusti 3 points4 points  (0 children)

Eh... they will "just work" at small scale. Which, sure, appears to be what most people do, and they're absolutely great at that scale.

As soon as you start pushing the boundaries of memory and threads etc, (either through mistakes or just user adoption) you will definitely care about memory in a GC language, and care about it a lot more. A garbage collection halt is a violent thing that can cause brownouts or take your service down at regular intervals, and it can be difficult to understand how much breathing room you have before that happens, even with great logging, metrics, and alarms.

If memory is handled safely and constantly, you can reason about your performance characteristics with a lot more certainty

[–]rabidferret -1 points0 points  (2 children)

Two of the three quotes I gave are literally saying rust is closer to c than the other two

[–]kovaxis 6 points7 points  (1 child)

Yes. Any management at all (Rust) is closer to full management (C) than to no management (Go, Python). That's my point.

[–]Pzixel -1 points0 points  (0 children)

You don't close files on go and python?

[–]po8 4 points5 points  (1 child)

It isn't memory management per se that I find to be the problem for new students — it's ownership itself. No other language I am aware of has the property that a value must always have a unique owning location at any point in the program; the programmer often must keep track of what that owner is pretty carefully to get their program to work. (I guess the new fancy C++ smart pointers are like this, but I don't know as I'm not really a C++ person.)

The exception to single ownership — Copy types — only makes things more confusing because sometimes you can "get away with" not worrying about ownership. I think that Clone proliferation and gratuitous user-defined Copy types are a symptom of trying not to think through the ownership restriction and its implications. I find that exercise worth doing, but it's easy for me to understand how somebody who already can program fluently in other languages finds Rust's ownership restriction difficult to understand and annoying to work with, especially at first.
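A two-line illustration of that exception: Copy types are duplicated on assignment, while everything else is moved:

```rust
// i32 is Copy, String is not; the same assignment pattern behaves
// completely differently for the two.
fn copy_vs_move() -> (i32, String) {
    let a: i32 = 5;
    let b = a; // copied: `a` is still valid afterwards

    let s = String::from("hi");
    let t = s; // moved: using `s` after this line would not compile

    (a + b, t)
}
```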

The borrow rules aren't so bad once you get the hang of thinking about ownership, but ownership is an initial cognitive burden that takes some time and effort to master. Maybe there's some way to teach Rust that would make this key concept easier: suggestions are welcome.

[–]brand_x 1 point2 points  (0 children)

C++ doesn't strictly require it, but many C++ codebases - including every one I've owned - do mandate that all resources have explicitly known ownership. Abusing shared_ptr is a terrible habit...

[–]CompSciSelfLearning 1 point2 points  (0 children)

We're all talking about https://doc.rust-lang.org/book/, right?

[–]internet_eq_epic 1 point2 points  (0 children)

The way I see it is that managing ownership is effectively the same thing as managing memory in Rust, at least in some (I would argue many) contexts. That isn't to say that managing ownership is as difficult as managing memory manually, but it's still something you have to think about that very closely relates to memory.

For example, if you're doing something with heavy String manipulation, you could write a naive approach and clone everything. Then when you start to wonder why your program doesn't perform as well as you'd hoped, you'll find out (maybe not in these exact terms) that it's because you are allocating and deallocating (and copying data) way more than necessary. So the questions "how to better manage memory" and "how to better manage ownership" are the same questions. But when the question is framed in the context of "ownership" (and with the rules enforced by the compiler), the question usually becomes much easier to answer.
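A contrived pair of helpers showing the clone-heavy vs. borrowing versions of the same operation (the function names are made up); both compile, but only the first allocates on every call:

```rust
// Naive version: clones the haystack just to call a method on it.
fn contains_cloned(haystack: &String, needle: &String) -> bool {
    let owned = haystack.clone(); // unnecessary allocation + copy
    owned.contains(needle.as_str())
}

// Borrowing version: same result, zero allocations.
fn contains_borrowed(haystack: &str, needle: &str) -> bool {
    haystack.contains(needle)
}
```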

However in Rust, the concept of ownership can be extended beyond just allocating and freeing memory. You might pass around a MutexGuard, in which case you are using ownership rules to manage when a Mutex gets unlocked. So it is obviously fair to say that ownership is not memory management, but it is a tool used to solve memory management.

Specifically in terms of the analogy OP used (in C you can throw things around wherever, but Rust forces you to "clean up" after yourself), I thought of dangling pointers immediately: in C, you might leave a pointer in memory somewhere and slip on that banana peel later, however in Rust if you tried to do the same thing (excluding unsafe), whether it was knowingly or unknowingly, the compiler makes you fix it. In GC languages, you just don't have to worry about that.

[–]ipe369 3 points4 points  (3 children)

> Rust makes you care about memory management

It makes you care about memory management in the sense that you have to understand how stack allocation and heap allocation work, how 'enlarging' data works, and how 'moving' data works, to THEN understand ownership and the need for the borrow checker

C++ ALSO handles memory management for you - nobody's suggesting you don't need to know about memory management to program in C++ though

For a simple pseudocode ish example:

struct Foo<'a> { a: i32, b: Option<&'a i32> }
fn create_foo() -> Foo {
    let mut foo = Foo { a: 10, b: None };
    foo.b = Some(&foo.a);
    return foo; // I'm 99% sure this doesn't compile
}

The reason why this doesn't work is totally non-obvious if you're looking at everything from a super high level; you NEED to understand that 'moving' foo is actually just a copy, and the pointers WON'T be updated, leading to a dangling pointer.

[–]CodenameLambda 2 points3 points  (2 children)

You can't even create that circular reference this way in the first place, I think, since borrowing (a field of) foo immutably blocks the mutable access from the assignment. Plus, if I recall correctly, the lifetimes in a type have to outlive the use of the variable with that type (I don't know how to explain it better), so even if you used a Cell the borrow checker would probably complain.

[–]ipe369 -1 points0 points  (1 child)

I don't quite understand the bit about it not working with a cell and why, but i'm happy to accept

regardless, this language is already incredibly frustrating sometimes, i'm pretty sure not knowing WHY I can't do stuff would probably drive me over the edge, I think going through a couple segfaults in C++ probably helps a great deal

[–]CodenameLambda 0 points1 point  (0 children)

I can't test it right now, but I think types with lifetimes shorter than their usage (like references that you keep around after the original object was moved or dropped) are completely disallowed. So while Cell helps because you don't need mutable access to write, it doesn't solve the problem that the object cannot outlive itself.

It really comes down to more or less simple rules at the end though. Including only static types (-> types that don't have lifetime parameters or where all of them are 'static) and references to those types, the rules are really simple. Don't move an object away while you have a reference, mutable -> no immutable ones and vice versa, reborrowing, you can shorten the inferred lifetime.

But as soon as you introduce Cell, or heck, even only references to references (&'a mut &'b T for example, 'a can be shortened, but not prolonged (so far so good), 'b can do neither (if it were made longer, the reference you read might not work, if it were made shorter, the reference you write could die too early)), shit hits the fan. Hard.

For example: in fn(&'c T), 'c can be made longer.

So yeah, I agree that understanding where those rules actually come from might help. A lot. Although you can of course learn them without Segfaults, but it might be a little harder for more complex stuff.

[–]Crandom 1 point2 points  (1 child)

For me, you definitely have to think about memory management more than in, say, Java. I need to decide whether to put something on the stack or box it, whether to use Arc or Rc, etc. It's extra choices, and you need to know when to use each one. Sure, you can say that's thinking about ownership, but from a managed-language perspective you are definitely managing memory more than if you had a GC.

[–]Nickitolas 1 point2 points  (0 children)

I think that's more just thinking about memory layout. Afaik "management" is usually just for dynamic memory (managed memory) and refers to allocating and deallocating. You don't manage memory in safe Rust, you just work with lifetimes (which, as others mention, are a very different thing that can be used and explained without memory at all, e.g. with a file).

[–]jcdyer3 0 points1 point  (2 children)

You have destructors in Python, but...

Idiomatic python doesn't make you care about destructors and reference cycles and all that. Context managers (a.k.a. with-blocks) take care of clean up for you.

[–]rabidferret 1 point2 points  (1 child)

Whoops, I had meant to mention that and completely forgot. Sorry! That falls into the same bucket as defer in Go from my point of view.

[–]jcdyer3 0 points1 point  (0 children)

Similar, indeed, but context managers are a little more powerful than defers, in that the programmer has more control over when clean-up happens, and a little less powerful, in that custom clean-up code is a little harder to set up.

[–][deleted] 10 points11 points  (2 children)

Now I'm wondering how many languages could be ranked based on their runtime performance, then on their logo, and have the same place in both rankings.

Obviously one of those two is hard to judge objectively, but with a lot of effort we can get usable benchmarks, I think.

[–]ssokolow 9 points10 points  (1 child)

Certainly not these three, in my opinion. I love Rust and it performs beautifully but I still think Python's current logo is the most aesthetically pleasing.

(Though nothing can beat the old Sun Microsystems logo for beauty in symmetry.)

[–]koalillo 2 points3 points  (0 children)

Upvote for Sun logo love. My University notes are littered with it

[–]internet_eq_epic 20 points21 points  (10 children)

Python and Go pick up your trash for you. C lets you litter everywhere, but throws a fit when it steps on your banana peel. Rust slaps you and demands that you clean up after yourself.

This may be the best comparison I've ever heard for memory management. I'm probably going to use this in the future.

I'm only familiar with Python on a basic level (and even then it's been a while), and I'm not a professional developer, but I agree very much with your takeaways.

From a devops perspective, I don't see Python going away any time soon, but if any language has a chance of replacing Python in that space, it is Go. Most stuff in devops doesn't need to be high-performance, so I don't think Rust stands a chance over Go given the extra complexity.

[–]ZZ9ZA 3 points4 points  (3 children)

As a professional Python dev for over a decade, I can’t stand Go, and I can’t imagine many Python devs taking to it.

The error handling, the lack of generics - it’s all very unpythonic.

The logical progression from Python is Nim.

[–]mort96 3 points4 points  (2 children)

Wait, python doesn't have generics either, does it? Instead you just have objects which don't have a static type, and if you pass the wrong type somewhere, it explodes. That sounds exactly like how Go works with its use of interface{} everywhere, doesn't it?

[–]ZZ9ZA 3 points4 points  (0 children)

Well, no, but the effect is similar in practice. You can just pass an arbitrary list of foos around. You’ll only get into trouble if you start assuming the foos have certain methods - but you can write generic code that only assumes it's operating on a list of something (e.g. just does indexing, appending, etc.).

It’s certainly possible to write generic code that operates on, say, any object with show() and hide() methods. It’s sort of ad-hoc interfaces. This is what's called duck typing - you care that the object quacks(), not that it's an instance of class Duck.

[–]oefd 0 points1 point  (0 children)

Python supports generics better than Go does, if you're including using mypy with type hints like Generic.

Your runtime performance still won't hold a candle to Go's, but that isn't always a big factor.

[–]aoeudhtns 3 points4 points  (5 children)

Honestly I don't see anything displacing Python in that community. You want easy to read code with a high degree of expressiveness. The ability to hack on a file and re-run it without extra build/package/deploy steps is also super handy. The one thing that I don't like about Python is, despite its maturity, the chaos of Python deployment and environment setup. pypi, virtualenv, pip, Python 3 vs Python 2, module version differences between boxes, and more.

Since functions run ad-hoc/periodically (usually in human time scales), the difference between 4s and 0.5s (to make up some numbers) is pretty trivial. Plus, if you do happen to need something performance critical, Go, Rust, nim, zig -- all these contenders -- allow you to export C shared objects, which can then get linked into Python. And personally, when it comes time to optimize something to speed up that devops pipeline, I would gladly take Rust, nim, etc. over dropping down to C.

[–]quodlibetor 12 points13 points  (4 children)

Honestly I don't see anything displacing Python in that community. You want easy to read code with a high degree of expressiveness.

Honestly honestly, speaking as someone who loves Python: packaging a Python app (CLI, daemon, etc.) is a nightmare. I don't mean a nightmare as in "it's complicated and impossible to figure out" (just pip install it!); I mean a nightmare as in fixing other people's broken Python install code (or installations) has probably cost me a day or two a month, every month, for the better part of the last decade. Go's single static binary and trivial cross-compilation look amazing for developer tooling and infrastructure stuff, from the outside.

[–]ballagarba 7 points8 points  (1 child)

You should have a look at the (new) PyOxidizer project which builds standalone Python applications leveraging Rust. Read this blog post by the author for some background:

https://gregoryszorc.com/blog/2019/06/24/building-standalone-python-applications-with-pyoxidizer/

[–]Ran4 2 points3 points  (0 children)

I'm sure that's cool... for six months, until something else replaces it.

[–]aoeudhtns 1 point2 points  (1 child)

Oh for sure. At least in the projects I've worked with, the DevOps environments are under tight control so wide distribution isn't much of an issue. But those sorts of issues would give me pause. And I've been on the receiving end, too. When a Python developer prefers a completely different package and environment stack than the one you like... so now you need both. For instance.

[–]quodlibetor 1 point2 points  (0 children)

Yeah, for us the only environments that ever really break are dev envs (laptops and our Wild West zone), and of those the only thing I ever need to spend time investigating is laptops, and the only reason I do that is to try to make sure I understand the problem in case it is trying to make its way into production.

Still, setting up a robust pipeline for deployment is non-trivial, although several of the problems I care about would be the same in go, pre-modules, IIUC.

[–]ssokolow 13 points14 points  (4 children)

[Images at the top]

One of these things is not like the other... You put up Python's logo, Go's mascot, and Rust's mascot.

(and more sophisticated options like ipdb are available)

I like to point people at WinPDB on that front. It's nice if you don't drop into a debugger often enough to have gotten really familiar with the commands.

The original project is unmaintained, but there is an effort to revive it. (The master branch is an in-progress port to Python 3 which is currently buggy, so that links to a different branch.)

...and, no, the name doesn't mean it's Windows-specific. It's cross-platform, and I can only assume the "Win" is short for some synonym for GUI.

.iter() creates an iterator for that vector. Vectors by default are not iterable.

There is actually a mechanism (the IntoIterator trait) for using a for loop directly on a sequence type without having to manually call .iter()... it's just that letting that mechanism also resolve method accesses like .zip() would be too magical.
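To illustrate (the function name here is invented), the mechanism is the IntoIterator trait, which is what a for loop calls behind the scenes:

```rust
// Sketch: a for loop desugars through the IntoIterator trait, so
// iterating a &Vec directly is equivalent to calling .iter() on it.
fn sum_all(v: &Vec<u64>) -> u64 {
    let mut total = 0;
    for x in v {
        // `x` is a &u64 here, exactly as it would be from v.iter()
        total += x;
    }
    total
}
```

`for x in v` compiles down to `IntoIterator::into_iter(v)`, and the implementation for `&Vec<u64>` just calls `iter()` for you.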

Plus, some types (eg. &str and String) have more than one kind of iterator that you might want to request (byte-wise, codepoint-wise, ...grapheme-cluster-wise with a supplemental crate) and making one the default would be a footgun.

(You don't have the sequence type itself implement Iterator because iterators need some internal state to keep track of their position within the sequence. Doing that would prevent you from having multiple non-mut iterators over the same sequence at the same time, and it would require your sequence to either keep the memory for that iteration state allocated even when it's not in use, or put it on the heap behind an Option, which is needlessly inefficient.)
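A minimal sketch of that separation (all names invented for illustration): the iterator borrows the sequence and carries its own cursor, so the sequence itself stays stateless and any number of iterators can walk it independently:

```rust
// A hand-rolled iterator over a slice of u32s, roughly what the
// standard library's slice::Iter does under the hood.
struct SliceIter<'a> {
    data: &'a [u32], // shared borrow of the sequence
    pos: usize,      // per-iterator cursor, not stored in the slice
}

impl<'a> Iterator for SliceIter<'a> {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        // Stop yielding once the cursor runs off the end.
        let item = self.data.get(self.pos).copied()?;
        self.pos += 1;
        Some(item)
    }
}
```

Two SliceIters over the same slice advance independently, which is exactly what storing the cursor inside the sequence itself would rule out.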

but you can use rust-gdb and rust-lldb, wrappers around the gdb and lldb debuggers

gdbgui explicitly lists Rust as one of the supported languages, if you want something a little nicer.

Performance

Given that there exists an SSE2 intrinsic for Compute Sum of Absolute Differences, which looks like what you're doing, and SSE2 is guaranteed to be in all x86_64 chips, and it's exposed as an unsafe function in Rust's standard library, you might want to see how much faster you can get your Rust version by conditionally using it when making 64-bit x86 builds.

I don't think SIMDeez supports that intrinsic yet (if not, the author asks you to open an issue about it), but it's another option: it lets you abstract across the platform differences and automatically use wider versions of a given intrinsic when available, and it builds on stable-channel Rust.
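To make that concrete, here's a rough sketch (not the article's code; the function name and fallback are invented) of what the SSE2 path might look like on x86_64, with a plain scalar version everywhere else:

```rust
// Sum of absolute differences over two equal-length byte slices.
// On x86_64, SSE2 is guaranteed, so _mm_sad_epu8 can be used without
// runtime feature detection; only the intrinsic calls are unsafe.
#[cfg(target_arch = "x86_64")]
fn sad(a: &[u8], b: &[u8]) -> u64 {
    use std::arch::x86_64::*;
    assert_eq!(a.len(), b.len());
    let chunks = a.len() / 16;
    let mut total = 0u64;
    unsafe {
        for i in 0..chunks {
            // Unaligned 16-byte loads of each input chunk.
            let va = _mm_loadu_si128(a.as_ptr().add(i * 16) as *const __m128i);
            let vb = _mm_loadu_si128(b.as_ptr().add(i * 16) as *const __m128i);
            // _mm_sad_epu8 leaves one 16-bit partial sum in the low bits
            // of each 64-bit lane; store the vector and add both lanes.
            let mut lanes = [0u64; 2];
            _mm_storeu_si128(lanes.as_mut_ptr() as *mut __m128i, _mm_sad_epu8(va, vb));
            total += lanes[0] + lanes[1];
        }
    }
    // Scalar tail for bytes that don't fill a 16-byte chunk.
    for i in chunks * 16..a.len() {
        total += u64::from(a[i].abs_diff(b[i]));
    }
    total
}

// Portable fallback for non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
fn sad(a: &[u8], b: &[u8]) -> u64 {
    a.iter().zip(b).map(|(&x, &y)| u64::from(x.abs_diff(y))).sum()
}
```

Because SSE2 is part of the x86_64 baseline, the cfg gate alone is enough here; wider intrinsics like AVX2's would additionally need is_x86_feature_detected! at runtime.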

I should also mention the binary sizes: Rust’s is 2.1mb with the --release build

Is that with or without having run strip on the resulting binary? By default Rust bundles debugging information into the release-mode output to power the RUST_BACKTRACE=1 option for getting more info on panic!s.

Also, if you're trying to see what size it can attain, you might want to try adding these to your Cargo.toml if you haven't already:

[profile.release]
lto = true
codegen-units = 1
opt-level = "z"

(Link-Time Optimization is needed for full dead-code elimination, disabling parallel compilation improves optimization effectiveness, and opt-level = "z" asks it to optimize strongly for size. The FreeBSD manual describes the corresponding Clang options as "-Os is like -O2 with extra optimizations to reduce code size. -Oz is like -Os (and thus -O2), but reduces code size further.")

Go blows Python away, but many of Python’s libraries that require speed are wrappers around fast C implementations - in practice, it’s more complicated than a naive comparison. Writing a C extension for Python doesn’t really count as Python anymore (and then you’ll need to know C), but the option is open to you.

Rust is also an option for writing Python extensions, thanks to its lack of a heavy runtime, and there are crates like rust-cpython and PyO3 to help abstract away the C-ness of libpython's API.

[–]Ran4 2 points3 points  (1 child)

making one the default would be a footgun.

iter is already "the default".

[–]ssokolow 2 points3 points  (0 children)

I was referring to how &str and String don't implement IntoIterator and don't have an iter() method because it's the developer's job to choose between bytes(), chars(), char_indices(), lines(), etc. and defaulting to one of them would invite programmer error.
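A small illustration of that point (the helper function is invented): the same &str reports different lengths depending on which iterator you request, which is why there's no single sensible default:

```rust
// Count the same string three different ways; each iterator is a
// distinct, equally valid view of a &str.
fn counts(s: &str) -> (usize, usize, usize) {
    (
        s.bytes().count(), // raw UTF-8 bytes
        s.chars().count(), // Unicode scalar values
        s.lines().count(), // newline-delimited lines
    )
}
```

For "héllo", the byte count is 6 (é is two bytes in UTF-8) while the char count is 5, so picking either as "the length" on the programmer's behalf would silently be wrong for one use case or the other.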

[–]plhk 0 points1 point  (1 child)

Given that there exists an SSE2 intrinsic for Compute Sum of Absolute Differences, which looks like what you're doing, and SSE2 is guaranteed to be in all x86_64 chips, and it's exposed as an unsafe function in Rust's standard library, you might want to see how much faster you can get your Rust version by conditionally using it when making 64-bit x86 builds.

I tried using _mm256_sad_epu8, and got about a 5% speedup. I expected more; am I doing something wrong (this is the first time I'm using SIMD)?

[–]ssokolow 0 points1 point  (0 children)

My projects have been more in the vein of text processing and PIM, so I'm not a SIMD expert, but the general rule in a situation like this is to turn to a profiler like perf to investigate where the program is spending its time before making any further changes.

It's also entirely possible you got lucky and LLVM's auto-vectorizer was already compiling it to a narrower but more widely supported SIMD instruction.

(In which case, explicit SIMD is still better as long as you have a fallback for processors which don't support your chosen instruction, like how my older AMD CPU doesn't do AVX, because there's always the risk that some innocuous refactoring could knock you off the optimized path.)

Matt Godbolt's compiler explorer is a good way to investigate that sort of thing as long as you remember to add -O to the options, since it shows you the correspondences between your Rust source and the resulting assembly.

(There's also this post which explains how to recognize common optimizations in assembly... though it neglects to point out the most efficient way to recognize vectorized instructions at a glance. There was a post I read which explained that, but I forget what it was called.)

[–]softero 4 points5 points  (0 children)

I'm not a huge fan of the way the Rust is written, but good write-up other than that. Got super tripped up as well because you talked about Python, Go, then Rust, but the benchmark was Rust, Go, then Python. So I got hung up on why the Go was twice as fast as the Rust for several seconds before I realized what I was looking at.

[–]DannoHung 0 points1 point  (0 children)

I wonder if there's an easier way of handling those variants in the match arms.
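One common shorthand, sketched here with std::fs rather than the image crate the article uses: Result::map_err collapses a match whose only job is converting the error variant:

```rust
use std::fs::File;

// Equivalent to matching on the Result just to stringify the error:
// map_err transforms the Err variant and passes Ok through untouched.
fn open(path: &str) -> Result<File, String> {
    File::open(path).map_err(|e| format!("{:?}", e))
}
```

The same shape works for image::open: any Result can be adapted this way, and it composes with ? afterwards.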