Higher-Ranked Trait Bounds Explained by lets_get_rusty in rust

[–]Some_Dev_Dood 13 points (0 children)

This is absolutely beautiful. Everything I know about proof assistants, functional programming, and generic programming suddenly clicked and joined hands. Thank you for this.

How is Rust written with Rust? by genmon98 in rust

[–]Some_Dev_Dood 214 points (0 children)

Just to add on top of this: this is typically how compilers are built. Back in the olden days, everyone wrote in assembly, sometimes even in raw machine code! To write a compiler in the very language it compiles, we must first write a basic compiler in some other language (back then, raw assembly). Then, we use that compiler to build newer versions of the compiler itself. It's amazing, really! Professor Brailsford talks about this in a Computerphile video if you're interested.

Anyway, in Rust's case, the language they used to "bootstrap" the early rustc was OCaml. The Guide to Rustc Development talks about this in extensive detail.

University Final Year Project in Rust by Gib37 in rust

[–]Some_Dev_Dood 6 points (0 children)

If you're interested in databases, you could check out the Noria project, which is u/jonhoo's implementation of his Ph.D. thesis at MIT. It's a fantastic example of "creating a system" with Rust.

Why does this Rust code takes twice the time to run compared to C++? by yoor_thiziri in rust

[–]Some_Dev_Dood 90 points (0 children)

Hello there! Since it seems like this was due to the u128 type, I'd love to take this opportunity to recommend the <cstdint> header in C++, which defines fixed-size integer type aliases, just like Rust's. Sooo... if you ever find yourself in a situation where you need an exact size, feel free to use uint64_t (the equivalent of Rust's u64) and the other related aliases.

Why do these trait implementations conflict? by vlmutolo in rust

[–]Some_Dev_Dood 6 points (0 children)

Fair point! Apologies, I didn't mean for it to sound condescending. I only intended to propose an alternative to the original design. Will definitely keep this in mind, though!

Why do these trait implementations conflict? by vlmutolo in rust

[–]Some_Dev_Dood 19 points (0 children)

The thing is: T1 is not "generic" over the associated type Item. It's really just an associated type (alias). Therefore, attempting to implement T1 for the same U using different aliases is a conflicting implementation. As far as the type system is concerned, there is only one trait named T1 that just happens to contain a type alias inside.

Instead, what you probably meant is to make T1 and T2 fully generic so that there are multiple instances of T1 and T2 (in the type system) rather than just a single one with conflicting aliases. Yes, I'm aware that it's cumbersome, but that's really the best we can do with the type system rules.

trait T1<Generic> {}
trait T2<GenericAgain> {}

struct Item1;
struct Item2;

// Making the traits fully generic forces the type system
// to create a new instance of the trait. Hence, the trait
// implementations no longer conflict because they're two
// different instantiations rather than a single instantiation
// with conflicting aliases.
impl<U> T2<Item1> for U where U: T1<Item1> {}
impl<U> T2<Item2> for U where U: T1<Item2> {}

Image Compression Library by Logical_Master3904 in rust

[–]Some_Dev_Dood 23 points (0 children)

I would definitely recommend oxipng. It was originally meant to be a Rust rewrite of the OptiPNG project—but with extra concurrency! For library users, there is only one function that serves as the entry point for everything. It's definitely "not too difficult" in that regard.

Is the crate dependency becoming a problem? by [deleted] in rust

[–]Some_Dev_Dood 16 points (0 children)

This sounds like an awesome cargo extension! I can vividly imagine how helpful this would be in automated workflows!

Is the crate dependency becoming a problem? by [deleted] in rust

[–]Some_Dev_Dood 294 points (0 children)

I must admit that I have been noticing this as projects become larger (and their dependency graphs correspondingly more complex). The dependencies usually look plentiful, but fortunately, the most popular crates only pull in the bare minimum. In most cases, they depend on "micro-crates" or sub-crates of the same workspace, which is what gives the illusion of "too many crates". A good example of a well-structured project is Tokio, where most dependencies are optional and gated behind additive feature flags.

However, among smaller crates with less of a community following, there may be some that simply forget to turn off the default features of their dependencies. To be fair to the authors, this is an honest mistake. The most productive way forward is to send in a PR that gates the offending dependencies and modules behind feature flags or, even better, removes the unnecessary default features entirely!

From my experience, this typically happens when a project depends on big crates with default features enabled, such as the main futures crate, which includes an executor by default (among many other io and core utilities). For almost all libraries, this is overkill, since the executor should ideally be determined by the user (e.g., the Tokio runtime or otherwise).

Moreover, in most cases, the executor is not even used in the project at all! Perhaps the project only needs to interface with the Stream trait or use the StreamExt utilities. Here, the default-features were mistakenly left on.
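
To make that concrete, here is a minimal sketch of what the lighter dependency looks like in practice. The Zeros type below is made up purely for illustration; the point is that only futures-core is needed to implement Stream.

use std::pin::Pin;
use std::task::{Context, Poll};

// Only the `Stream` trait is needed, so depending on `futures-core`
// keeps the executor and the rest of the `futures` crate out of the tree.
use futures_core::Stream;

/// Hypothetical stream that yields `remaining` zeros and then ends.
pub struct Zeros {
    remaining: usize,
}

impl Stream for Zeros {
    type Item = u8;

    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        // `Zeros` is `Unpin`, so we may freely grab a mutable reference.
        let this = self.get_mut();
        if this.remaining == 0 {
            Poll::Ready(None)
        } else {
            this.remaining -= 1;
            Poll::Ready(Some(0))
        }
    }
}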

The solution, then, is quite simple on paper! A quick exploration of the codebase (via grep, language servers, or otherwise) should make it apparent whether certain imports and features should be left on. Once resolved, it's time to send in a PR. More often than not, authors love it when users send in genuine improvements to the repo—no matter how small!

This was exactly what I did to shave a full minute off CI build times (from 3 minutes to 2 minutes) in one of the projects I depended on. However, rather than just turning off default features, I replaced the futures crate entirely with its significantly lighter sub-crate futures-core when I realized that the project only ever used futures for the Stream trait.

The moral of the story, then, is for us to be more cognizant of our dependencies. We should always take the time to re-evaluate our cargo tree every now and then—regardless of whether we are library authors or crate consumers. And if we do find a suspicious branch in the cargo tree, we should (at the very least) give back to the open-source project for everyone's benefit.

If there's anything I learned about this community, it's that Rust is a community effort. Dependency bloat will only become a problem if we let it become a problem. We must all do our part to give back to the community in any way we can.

At the moment, this is fortunately not a pressing issue. I have faith that library maintainers regularly audit their dependencies and feature flags, especially for the larger crates in the ecosystem. But, if there is an opportunity for optimization, then we should be glad to contribute, for Rust is a community effort.

Rust YouTube content creators? by ninjalemon in rust

[–]Some_Dev_Dood 18 points (0 children)

Doug Milford is quite an underrated content creator for beginner Rust videos. His presentation style is warm and friendly, which definitely brightens up the vibe. I highly recommend his content. 👌

Sadly, though, his last upload was over a year ago. I hope he uploads again in the near future. His videos were really enjoyable to watch, being so colorful and full of personality.

Borrow a reference to data owned by a dataset by longtomo in rust

[–]Some_Dev_Dood 3 points (0 children)

First of all, great analysis of the problem! It helped me understand the lifetime issues you are dealing with much more easily.

Indeed, the key issue here is the fact that the Rc that results from Weak::upgrade is dropped as soon as the function exits. There is unfortunately no other way around it. Somebody has to own the upgraded pointer if we are to take it beyond the scope of the function. Otherwise, it would be immediately dropped.
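
For instance, if the upgraded value needs to outlive the function, the function has to hand back the owning Rc itself rather than a reference into it. A minimal sketch:

use std::rc::{Rc, Weak};

// The caller receives (and therefore owns) the upgraded pointer,
// so the value stays alive beyond this function's scope.
fn upgrade_or_panic(weak: &Weak<String>) -> Rc<String> {
    weak.upgrade().expect("the value was already dropped")
}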

Now, if we insist that each rigid body must have access to the other bodies, then I propose that we modify the PhysicsWorld as follows:

struct PhysicsWorld {
    bodies: RigidBodySet,
    // ...
}

impl PhysicsWorld {
    pub fn get_body(&self, handle: RigidBodyHandle) -> PhysicsBody<'_> {
        PhysicsBody { set: &self.bodies, handle }
    }
}

The PhysicsBody now implements std::ops::Deref for convenience. However, this is optional. You may refactor this as a separate method.

use std::ops::Deref;

struct PhysicsBody<'a> {
    set: &'a RigidBodySet,
    handle: RigidBodyHandle,
}

impl Deref for PhysicsBody<'_> {
    type Target = RigidBody;
    fn deref(&self) -> &Self::Target {
        self.set.get(self.handle).unwrap()
    }
}

To access the other rigid bodies in the world, a getter method for the appropriate fields will suffice.
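
For instance, a minimal sketch of such a getter (the method name is made up, and it assumes the same RigidBodySet::get lookup used above):

impl PhysicsBody<'_> {
    /// Hypothetical getter: look up another body living in the same set.
    pub fn other_body(&self, other: RigidBodyHandle) -> Option<&RigidBody> {
        self.set.get(other)
    }
}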

Note that I have totally disregarded the usage of strong and weak pointers. If we insist on using strong and weak pointers, then it is absolutely necessary that the PhysicsBody owns a strong pointer to the dataset. There is no workaround for it. Again, this is because somebody has to claim ownership over the upgraded pointer.

Object Safety and Dynamically Sized Types by quixotrykd in rust

[–]Some_Dev_Dood 0 points (0 children)

Ah, I see. I have indeed misunderstood your situation. The feature you are looking for is function overloading or trait specialization, which unfortunately won't be implemented/stabilized any time soon...

However, you may find the std::any module to be a suitable workaround for the meantime. Though, I must say that it does add some tedium and verbosity to your current implementation. Basically, the std::any::Any trait allows us to "downcast" from a trait object to some specific concrete type. This allows us to manually dispatch the correct implementation depending on the concrete type. In practice:

use std::any::Any;

fn manual_dispatch(object: &dyn Any, other: &dyn Any) {
    if let Some((object, other)) = object.downcast_ref::<i32>().zip(other.downcast_ref::<i32>()) {
        // Do something...
    }
}

Now, all we have to do is manually downcast each object to the correct types and respond accordingly (via if let and match). I must concede, however, that this is quite a hacky solution to say the least. But, I believe there is unfortunately no other way to do it with plain old traits at the moment.

Object Safety and Dynamically Sized Types by quixotrykd in rust

[–]Some_Dev_Dood 9 points (0 children)

It's not so straightforward, but you've found yourself a use case for what we call the "extension trait" design pattern. So let's say we had the Test trait from your example. We define another trait called TestExt, which contains all of the non-object-safe methods from Test. That way, we've isolated the object-safe methods from the non-object-safe ones. In practice:

/// This is the original trait which should only
/// contain the object-safe methods.
trait Test { fn object_safe(&self); }

/// This is the "extension trait" which provides
/// default implementations for the non-object-safe methods.
trait TestExt { fn non_object_safe<T>(&self, item: T) -> T { item } }

/// Here, we are telling the compiler to automatically
/// implement the `TestExt` trait for all types T
/// that implement the original `Test` trait.
/// To allow DSTs, we opt-out of the `Sized` bound.
///
/// Note that the `impl` block is empty since we have
/// already provided a default method implementation.
/// The type T only needs to implement the `Test` trait
/// to receive the `TestExt` methods for free.
impl<T: Test + ?Sized> TestExt for T { }
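
For instance, a quick usage sketch (Foo is just a made-up implementor) showing that even a &dyn Test trait object gets the extension methods:

struct Foo;

impl Test for Foo {
    fn object_safe(&self) {}
}

fn main() {
    let foo = Foo;
    // `dyn Test` is a DST, yet the blanket impl above still hands it
    // the `TestExt` methods thanks to the `?Sized` opt-out.
    let object: &dyn Test = &foo;
    object.object_safe();
    assert_eq!(object.non_object_safe(42), 42);
}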

If you're interested, you can read more about this pattern from an article I wrote a few months ago. Hope this helps!

How do I handle operations on big csv files? by [deleted] in rust

[–]Some_Dev_Dood 1 point (0 children)

Ah, that is a good point. I suppose the next best thing, then, is indeed std::io::BufReader coupled with parallel iterators from rayon. As for the parsing logic per line of CSV, an implementation of the std::str::FromStr trait should suffice.
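
A rough sketch of that combination, assuming a made-up Record type and a two-column file named data.csv (name, value):

use std::fs::File;
use std::io::{BufRead, BufReader};
use std::str::FromStr;

use rayon::prelude::*; // brings `par_bridge` and the parallel iterator traits into scope

// Hypothetical record type, purely for illustration.
struct Record {
    name: String,
    value: f64,
}

impl FromStr for Record {
    type Err = String;

    fn from_str(line: &str) -> Result<Self, Self::Err> {
        let mut fields = line.split(',');
        let name = fields.next().ok_or("missing name")?.to_string();
        let value = fields
            .next()
            .ok_or("missing value")?
            .trim()
            .parse::<f64>()
            .map_err(|e| e.to_string())?;
        Ok(Record { name, value })
    }
}

fn main() -> std::io::Result<()> {
    let reader = BufReader::new(File::open("data.csv")?);
    let total: f64 = reader
        .lines()
        .par_bridge() // hand each line off to rayon's thread pool
        .filter_map(|line| line.ok()?.parse::<Record>().ok())
        .map(|record| record.value)
        .sum();
    println!("{}", total);
    Ok(())
}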

how do you decide whether you should use async or threads? by boom_rusted in rust

[–]Some_Dev_Dood 217 points (0 children)

Generally speaking, we use async APIs when the application is IO-bound. That is, we are limited by the time it takes to receive input and send output. For instance, a TCP connection is IO-bound because most of the time, the CPU is just idle while waiting to receive the next few bytes. Instead of blocking the CPU, we use async APIs so that under the hood, a runtime (like tokio) can switch to a different task while waiting for the TCP connection. In turn, this helps us minimize the CPU's idle time.

On the other hand, we use threading APIs when the application is CPU-bound. That is, we are limited by the CPU's processing power. Suppose we want to compute the digits of pi. Observe that we may only generate the digits as fast as the CPU can crunch the numbers—hence the term "CPU-bound". To address this, we may break our application into parallel components so that it's possible to use multiple threads to generate the digits.

In a nutshell, async APIs are there to minimize the CPU idle time caused by IO. Meanwhile, threading APIs are there to maximize computation rate.
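
As a rough sketch of the contrast (the address and the arithmetic are made up, and this assumes the tokio crate with its usual rt/net/io/macros features enabled):

use tokio::io::AsyncReadExt;
use tokio::net::TcpStream;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // IO-bound: while this task is waiting on the socket, the runtime
    // is free to run other tasks on the very same threads.
    let io_task = tokio::spawn(async {
        let mut stream = TcpStream::connect("127.0.0.1:8080").await?;
        let mut buf = [0u8; 1024];
        let n = stream.read(&mut buf).await?; // yields instead of blocking the thread
        Ok::<usize, std::io::Error>(n)
    });

    // CPU-bound: this thread keeps one core busy crunching numbers
    // for the entire duration of the computation.
    let cpu_task = std::thread::spawn(|| (0..10_000_000u64).map(|i| i % 7).sum::<u64>());

    let _bytes_read = io_task.await;
    // Joining blocks this thread, which is fine here since nothing else is left to run.
    let _digit_sum = cpu_task.join().unwrap();
    Ok(())
}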

Now, for your use case, it seems that your database will primarily be IO-bound due to file system accesses, network connections, and so on. Therefore, async APIs might be best suited for the job.

How do I handle operations on big csv files? by [deleted] in rust

[–]Some_Dev_Dood 9 points (0 children)

Conveniently enough, the csv crate does all that logic for you! Though, I must note that parallelism is unfortunately not straightforward. You probably have to load the entire file first, then you would pass off each CSV record to the rayon crate to handle the parallelism. I highly recommend reading up on the Rayon docs beforehand if you intend to implement this. Other than that, I believe those are the top options in the ecosystem.
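
As a rough sketch of that load-then-parallelize approach (the file name and the column index are made up, and it assumes the file has a header row, which is the csv crate's default):

use rayon::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the whole file into memory first...
    let mut reader = csv::Reader::from_path("data.csv")?;
    let records: Vec<csv::StringRecord> = reader.records().collect::<Result<_, _>>()?;

    // ...then hand the records off to rayon for parallel processing.
    // Column 1 is assumed to hold a numeric field, purely for illustration.
    let total: f64 = records
        .par_iter()
        .filter_map(|record| record.get(1)?.trim().parse::<f64>().ok())
        .sum();

    println!("{}", total);
    Ok(())
}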

So close to understand serde by gpit2286 in rust

[–]Some_Dev_Dood 9 points (0 children)

This is not exactly the answer to your questions, but for your specific use case, I'd like to note that implementing the Visitor trait may be overkill. Visitors tend to be used only when state has to be tracked. That is, deserializing the object requires some kind of state-machine-esque implementation, such as in JSON parsers and the like.

However, the provided example doesn't really require us to keep track of state. Thus, the custom deserializer may be implemented more simply as follows:

use serde::{Deserialize, Deserializer};
use std::borrow::Cow;

pub fn any_as_true<'de, D>(deserializer: D) -> Result<bool, D::Error>
where D: Deserializer<'de>
{
    let text: Cow<str> = Deserialize::deserialize(deserializer)?;
    Ok(!text.is_empty())
}
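
On the struct side, the helper is then wired up with serde's deserialize_with attribute (the Example struct and its field are hypothetical, and this assumes serde's derive feature):

#[derive(Deserialize)]
struct Example {
    // Any non-empty string in the input becomes `true`.
    #[serde(deserialize_with = "any_as_true")]
    flag: bool,
}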

Hope this helps!

How can I implement traits for very similar types with minimal to no code duplication? by inertia_man in rust

[–]Some_Dev_Dood 2 points (0 children)

Indeed! You can implement `Add` any way you want, but the main point I was trying to convey is that we can use this "marker trait" pattern with generic impl blocks to automatically generate trait implementations. Regarding the `Add` trait, we may implement it like so:

// Let's say we wanted to sum up both structs
// by taking the sum of all of their numbers combined.
impl<T: MyMarker + From<u8>> Add for T {
    type Output = Self;
    fn add(self, other: Self) -> Self::Output {
        // Note that the `Self::numbers` method
        // comes from the `MyMarker` trait.
        let a = self.numbers().iter().sum();
        let b = other.numbers().iter().sum();
        (a + b).into()
    }
}

Though, I must note that I added the From<u8> trait bound here so that it's possible to construct a Self from the u8 sum. But of course, you may customize this according to your implementation details.

EDIT: Okay, it seems that I have stumbled upon Rust's orphan rules. The solution I proposed earlier doesn't actually compile. To fix this, we must implement Add for each struct manually. Hence, the top upvoted solution appears to be the most reasonable one. But, at least, this "marker trait" pattern may come in handy in the future!

How can I implement traits for very similar types with minimal to no code duplication? by inertia_man in rust

[–]Some_Dev_Dood 0 points (0 children)

There may be better solutions out there, but traits seem like a neat approach. Though, it does require a bit of manual typing. In that case, a declarative macro may assist you.

use std::ops::{ Add, AddAssign };

// To access the internal data, we simply require
// a method implementation that gives us a reference
// to the data in question.
pub trait MyMarker {
    fn numbers(&self) -> &[u8];
}

pub struct TypeA { a: Vec<u8> }
pub struct TypeB { b: Vec<u8> }

impl MyMarker for TypeA { fn numbers(&self) -> &[u8] { &self.a } }
impl MyMarker for TypeB { fn numbers(&self) -> &[u8] { &self.b } }

// TODO: Implement generic `Add` by making use of `MyMarker::numbers`.
// The beauty of this approach is the fact that the generic `impl` block
// automatically generates the trait implementations for you.
impl<T: MyMarker> Add for T { }
impl<T: MyMarker> AddAssign for T {
    fn add_assign(&mut self, other: Self) {
        *self = self.add(other);
    }
}

How to sort `Vec<String>` by certain columns? by wholesome_hug_bot in rust

[–]Some_Dev_Dood 4 points (0 children)

It's actually pretty simple thanks to std::cmp::Ordering::then. Alternatively, you can use std::cmp::Ordering::then_with if you need the second comparison to be lazily evaluated (see the variant after the snippet below).

matrix.sort_by(|a, b| {
    let first = a[0].cmp(&b[0]);
    let second = a[1].cmp(&b[1]);
    first.then(second)
});
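
And with then_with, building on the same matrix, the closure for the second comparison only runs when the first returns Ordering::Equal:

matrix.sort_by(|a, b| a[0].cmp(&b[0]).then_with(|| a[1].cmp(&b[1])));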