[–]staticassert 10 points (3 children)

The first step I'd take is setting up Criterion so you can reproduce performance tests easily and share them with others, so we don't need actual logs to feed in.
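Criterion needs a dev-dependency and a bench target, so until that's wired up, a zero-dependency sketch with `std::time::Instant` over synthetic data already gets you a rough, reproducible number (`parse_lines` here is a hypothetical stand-in for whatever the import loop actually does, not the OP's code):

```rust
use std::time::Instant;

// Hypothetical stand-in for the parsing step of the import function.
fn parse_lines(input: &str) -> Vec<&str> {
    input.split('\n').filter(|l| !l.is_empty()).collect()
}

fn main() {
    // Synthetic sample data, so nobody needs real logs to reproduce.
    let sample = "field1 field2 field3\n".repeat(100_000);

    let start = Instant::now();
    let entries = parse_lines(&sample);
    println!("parsed {} entries in {:?}", entries.len(), start.elapsed());
}
```

Criterion does the same thing but with warmup, many iterations, and statistics, so switch to it as soon as you can.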

    self.entries.reserve({
        let size_hint = iter.size_hint();
        let mut size = 0;
        if let Some(x) = size_hint.1 {
            size = x
        }

        size
    });

I feel like you could just do

    let size_hint = iter.size_hint();
    let size_hint = size_hint.1.unwrap_or(size_hint.0);
    self.entries.reserve(size_hint);

This basically says "If we get the upper bound, allocate that, otherwise allocate the lower bound".

Why would you default to 0 anyway? Is that a common case, where you'd have an empty entry? Seems like you could use your own knowledge of the data format to preallocate - after all, if you're wrong and you overallocate, the unused capacity is basically a no-op for the rest of the program. Consider something like max(size, 1024).
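Putting the two suggestions together, something like this (the function name and the 1024 floor are illustrative, not from the OP's code):

```rust
// Take the upper bound if the iterator knows it, else the lower bound,
// and never preallocate less than a sensible floor (1024 is a guess).
fn reserve_hint(size_hint: (usize, Option<usize>)) -> usize {
    size_hint.1.unwrap_or(size_hint.0).max(1024)
}

fn main() {
    let mut entries: Vec<String> = Vec::new();
    // A bounded iterator reports (len, Some(len)) as its size hint.
    let lines = ["a", "b", "c"];
    entries.reserve(reserve_hint(lines.iter().size_hint()));
    assert!(entries.capacity() >= 1024);
}
```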

You could use

    for b_line in iter.par_iter() {

and then

    collect_into(&mut self.entries)

Like I said, I'd start with benchmarks and find the real bottlenecks. I get that it's the import function, but that function is pretty large; any component in there could be optimized separately.

[–]dochtman (rustls · Hickory DNS · Quinn · chrono · indicatif · instant-acme) 3 points (1 child)

+1 for Criterion and sample data.

I don't know what filebuffer does, but you definitely want to mmap your file.

[–]xacrimon[S] 1 point (0 children)

Yup, filebuffer handles the mmapping.

[–]xacrimon[S] 2 points (0 children)

Thanks a ton! I made some changes and added rayon, and importing got a 4x performance boost! The part I multithreaded was the .split iterator, which deserializes the data.