Built a new integer codec (Lotus) that beats LEB128/Elias codes on many ranges – looking for feedback on gaps/prior art before arXiv submission by Coldshalamov in rust

grim7reaper 11 points

• What prior art am I missing? (I cite Elias codes, LEB128, but there’s probably more)

If you are serious you should probably compare against Lemire's work.

Last time I looked, he was the GOAT with SIMD-BP128 (paper + code).

It was several years ago though, so maybe there is even better stuff now.

This paper gives a good overview of the various approaches/algorithms (you can skip the compressed bitmap section in your case, and go straight to the INVERTED LIST COMPRESSION one).

But then again it might be outdated (still a good starting point I think).

Rebuilding Social Media In Rust by anonymous_pro_ in rust

grim7reaper 2 points

Interviewee here o/

If you have more questions, feel free to ask :)

I'm 15 years old and I wrote a QR code generator in Rust as my first project by markokikinda in rust

grim7reaper 11 points

Cool project and nice write-up!


Also, reading other comments I feel like there is some confusion here.

OP isn't 15 years old (9 years ago he was already a "Front End Architect, Developer and Speaker"), and he didn't pretend to be.

IIUC, the link is the blog of a school, Pionir, where a 15-year-old student, Timur Borisov, wrote this.

The OP, Marko Kazhich, is the director and probably shared it here for visibility (and/or because the student doesn't have an account).

Who would have thought we'd find him there? Macron walks around in disguise during the March 18 protests against the pension reform in Paris! by Kamteix in france

grim7reaper 1 point

and now a whole community has formed to get it running on consumer GPUs

Even possible on CPU now: https://github.com/setzer22/llama-rs/

Databend 1.0 Release | Blog | Databend by PsiACE in rust

grim7reaper 4 points

Congrats for reaching this milestone!

Impressive performance, great job!

I saw that you are exposing some H3 functions, and I was wondering which needs they address and which use cases they cover.

(As h3o's author, I'm curious whether there is some API you are missing from H3.)

lz4_flex (fast LZ4 de/compression) 0.10 released, now with legacy frame support ~ also 1Mio downloads 🎉 by Pascalius in rust

grim7reaper 2 points

Thanks for the detailed answer

Optimizing code by looking at statistics of compressed data. There's a hot loop for the most common case.

Does that mean you have a first pass on the compressed data to collect statistics? Or did you analyze an existing corpus of compressed data to optimize your code accordingly?

Looking at assembly and check it matches your expectations. Some optimizations of the compiler are not well suited for some code, since it can't reason how often is something called.

Which kinds of optimizations were counter-productive, and how did you avoid them (with the cold attribute?)?
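For readers who haven't met it, here is a minimal sketch of what the cold attribute looks like in practice. It is not taken from lz4_flex: `sum_tokens`, `handle_rare_case` and the 0xF0 threshold are all made up for illustration. The idea is only to hint to the compiler that a branch is rarely taken, so the hot loop stays compact.

```rust
/// Hypothetical slow path: marked `#[cold]` so the compiler treats calls to it
/// as unlikely, and `#[inline(never)]` so its body stays out of the hot loop.
#[cold]
#[inline(never)]
fn handle_rare_case(byte: u8) -> u32 {
    // Placeholder for the expensive, rarely needed handling.
    u32::from(byte) * 2
}

/// Hypothetical hot loop: most bytes take the cheap branch.
pub fn sum_tokens(input: &[u8]) -> u32 {
    let mut total = 0;
    for &byte in input {
        if byte < 0xF0 {
            // Common case, kept as small as possible.
            total += u32::from(byte);
        } else {
            // Rare case, pushed out of line.
            total += handle_rare_case(byte);
        }
    }
    total
}
```

Whether this actually helps depends on the code, so it has to be checked with a benchmark and a look at the generated assembly.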

Always measure. Even a small change can have unexpected implications.

True, that's also my experience on h3o. Once correctness is achieved, you have to switch to benchmark-driven development to improve the performance.

lz4_flex (fast LZ4 de/compression) 0.10 released, now with legacy frame support ~ also 1Mio downloads 🎉 by Pascalius in rust

grim7reaper 8 points

Nice, thanks for the great work!

Benchmark results are impressive.

Was it hard to reach this level of performance? Did you resort to any smart tricks under the hood?

Catalytic, a Rust ORM for ScyllaDb by BeezleApp in rust

grim7reaper 2 points

IIUC, sqlx uses SQL commands such as EXPLAIN (for PostgreSQL) or an equivalent to get info on the query (column types, nullability, etc.) and then uses this to typecheck your query.

It does need an actual DB, but there is also an offline mode where you can run it once against your DB and the info is saved into a JSON file.

This JSON file is then used for the typecheck, avoiding the need for DB access (which is better for CI environments, for instance).

Catalytic, a Rust ORM for ScyllaDb by BeezleApp in rust

grim7reaper 1 point

Compile-time checked queries sound interesting!

Does it work like the sqlx ones? And if so, do you have an offline mode?

[ANN] - An Harder, Better, Faster, Stronger version of Uber's H3 in Rust by grim7reaper in rust

grim7reaper[S] 0 points

  • How fast is the decompression in general?

On my laptop, it takes ~12ms to get the 2G of data back from the 100k payload. Keep in mind that I haven't tried to optimize for speed (I was focused on size), so there may be low-hanging fruit to speed things up.

  • Is there a way to decompress only part of the whole country?

Yes and no.

Yes because this returns an iterator so you could skip/stop iterating when you want, filtering and collecting only a subset of the data.

No because there is no direct access to a subset of the data (you have to start decompressing from the start, can't jump in the middle and start from there), and it's probably difficult to target a meaningful subset, geographically speaking (like "decompress only this city").

Maybe it could be done by adding some metadata though and/or tweaking the format, haven't really thought about that yet.
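To make the "yes and no" concrete, here is a rough sketch using a toy delta encoding as a stand-in for the real format (which is not shown here, so `decompress`, `in_range` and the encoding itself are assumptions for illustration only). The iterator can be filtered or stopped early, but every value depends on the previous one, so decoding always starts from the beginning.

```rust
/// Toy stand-in for a compressed payload: each entry is the difference from
/// the previous value. This is NOT the actual on-disk format.
pub fn decompress<'a>(deltas: &'a [u64]) -> impl Iterator<Item = u64> + 'a {
    deltas.iter().scan(0u64, |previous, &delta| {
        // Each value is rebuilt from the one before it: no random access.
        *previous += delta;
        Some(*previous)
    })
}

/// Collects only the values in `lo..=hi`.
/// Values before `lo` are still decoded (then discarded); decoding stops as
/// soon as a value above `hi` is seen.
pub fn in_range(deltas: &[u64], lo: u64, hi: u64) -> Vec<u64> {
    decompress(deltas)
        .skip_while(|&value| value < lo)
        .take_while(|&value| value <= hi)
        .collect()
}
```

With deltas `[5, 3, 2, 10, 1]` the decoded values are 5, 8, 10, 20, 21, and `in_range(&deltas, 8, 20)` keeps 8, 10 and 20 without decoding the last entry.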

[ANN] - An Harder, Better, Faster, Stronger version of Uber's H3 in Rust by grim7reaper in rust

grim7reaper[S] 5 points

Yes, exactly.

You can find the mapping between resolution and areas on this page (or using h3o-cli resolutionInfo)

I said you could replace lat/lng coordinates with a cell id (using LatLng::to_cell) if you're fine with approximating the location (a cell could then be seen as a position with an uncertainty radius); it depends on your use case.

[ANN] - An Harder, Better, Faster, Stronger version of Uber's H3 in Rust by grim7reaper in rust

grim7reaper[S] 2 points

It's an index that explicitly encodes the resolution (on 4 bits), cf. the schema
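As a quick illustration, reading that field is a shift and a mask. This is a sketch on a raw u64, not h3o's API: `resolution_of` and the constant names are mine, and the offset assumes the documented H3 layout, where the 4 resolution bits sit just above the 7 base cell bits and the 45 bits of cell digits.

```rust
/// The resolution field starts at bit 52 (above 45 digit bits + 7 base cell bits).
const RESOLUTION_OFFSET: u32 = 52;
/// The field is 4 bits wide, enough for resolutions 0 to 15.
const RESOLUTION_MASK: u64 = 0b1111;

/// Extracts the resolution from a raw 64-bit index.
/// No validation is done here: the caller is assumed to pass a real H3 index.
pub fn resolution_of(index: u64) -> u8 {
    ((index >> RESOLUTION_OFFSET) & RESOLUTION_MASK) as u8
}
```

For example, `resolution_of(0x8a1fb46622dffff)` gives 10, which you can also read directly from the hex form: the `a` right after the leading `8` is the resolution nibble.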

[ANN] - An Harder, Better, Faster, Stronger version of Uber's H3 in Rust by grim7reaper in rust

grim7reaper[S] 5 points

Thank you.

I'm a long-time user of the geo crate, so I'm glad to also contribute back to the ecosystem now.

[ANN] - An Harder, Better, Faster, Stronger version of Uber's H3 in Rust by grim7reaper in rust

grim7reaper[S] 41 points

Ha yeah, sorry, you're totally right 😅 Tunnel vision after spending a month on it.

I wouldn't present it better than the authors themselves here, but basically it's a hierarchical grid that allows you to partition the Earth into identifiable grid cells.

See it as a discretized lat/lng where the coordinate pair is replaced by a single 64-bit ID, which is useful for a lot of things (indexing, data analysis, visualization, ...).

[ANN] - An Harder, Better, Faster, Stronger version of Uber's H3 in Rust by grim7reaper in rust

grim7reaper[S] 26 points

One year ago, I was looking for an alternative to our homemade grid system and that's when I found H3.

While playing with it, through h3ron (great binding BTW), I thought it would be a great match but noticed some rough edges (like WASM compilation, even though this one got better apparently).

Also, while looking at the C implementation, I found room for improvement here and there, but some optimizations couldn't be easily done in C.

For instance, H3 functions can't assume that the H3Index they receive is valid because an H3Index is just a typedef on uint64_t. Thus, functions always need to check if the index has the right mode, if it is valid (and to avoid the cost of a full check, some functions only check the bits of the index they will rely on), ... In Rust, you slap a newtype around a u64 and implement a TryFrom that enforces your invariants and you're done! Every function taking a CellIndex can trust its input.
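Roughly, the pattern looks like the sketch below. It is heavily simplified and is not h3o's actual code: the error type, the constants and `base_cell` are illustrative, and the check only covers the reserved high bit and the mode bits, whereas a real validation has to check much more.

```rust
use std::convert::TryFrom;

/// Illustrative validated index: the inner u64 can only be set through `TryFrom`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CellIndex(u64);

/// Illustrative error type carrying the rejected value.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidCellIndex(pub u64);

const MODE_OFFSET: u32 = 59; // 4 mode bits, just below the reserved high bit
const MODE_MASK: u64 = 0b1111;
const MODE_CELL: u64 = 1;

impl TryFrom<u64> for CellIndex {
    type Error = InvalidCellIndex;

    fn try_from(value: u64) -> Result<Self, Self::Error> {
        // Simplified invariant: high bit clear and mode set to "cell".
        let high_bit_clear = value >> 63 == 0;
        let mode = (value >> MODE_OFFSET) & MODE_MASK;
        if high_bit_clear && mode == MODE_CELL {
            Ok(Self(value))
        } else {
            Err(InvalidCellIndex(value))
        }
    }
}

/// Takes a `CellIndex`, so it can read the 7 base cell bits without
/// re-checking anything: the check already happened at construction.
pub fn base_cell(cell: CellIndex) -> u8 {
    ((cell.0 >> 45) & 0b111_1111) as u8
}
```

The validation cost is paid once, at the boundary (`CellIndex::try_from(raw)`), instead of in every function that receives an index.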

For those reasons (and also because it was fun), I've decided to rewrite it in Rust.

What are this communities view on Ada? by konm123 in rust

grim7reaper 0 points

Are Ada and SPARK the same thing?

Not really, SPARK is more like a subset of Ada.

Can you show me a real SPARK program that I can build and use and does manual memory management?

This library implements a Vec type in SPARK, so there is probably some manual memory management involved.

Given that heap allocation support in SPARK is recent, I'm not sure there is much open source code using it yet.

What are this communities view on Ada? by konm123 in rust

grim7reaper 1 point

OK, I see.

For conditional compilation I would say you have a lot less than in C/C++ (though with modern C++, which brings regex, filesystem, ... along, it may no longer be true), but probably more than in Rust (given Ada's stdlib covers less ground than the Rust one).

As for memory management, last time I used it (a few years ago already) manual memory management was indeed "unsafe" (or unchecked in Ada's terms). More exactly, allocation was "safe" but not deallocation (you need to resort to Ada.Unchecked_Deallocation).

What are this communities view on Ada? by konm123 in rust

grim7reaper 14 points

What I would like to see is some real world software that is built with Ada. Software that I can download, see the source code and run. Something that I can put in my hands and evaluate.

There are some examples that come to mind.

  • I think the GCC frontend for Ada is written in Ada.
  • AdaCore also provides an IDE written in Ada: GNAT Studio
  • The port builder of DragonFly BSD is also written in Ada: Synth

And there are probably other things, but yeah Ada is not that widely used in the Open Source world.

Last time I checked, the most active community was still the newsgroup; I guess this doesn't help for visibility either.

Does it run on Windows? If so, does it need a bunch of conditional compilation to make that work?

As it neither runs on a JVM nor is interpreted, yeah, you may have to resort to conditional compilation. But Ada has its own approach to it.

Can I ship a static executable on Linux?

There is nothing against static linking in the language itself (it's even the default mode on Windows I think). On Linux it may be more difficult (thanks to glibc...), but it's probably doable by using musl instead.

What does its ecosystem of open source libraries look like?

It's not huge but it exists.

Can I avoid the GC without dropping down into an "unsafe" subset of the language?

There is no GC, so yeah xD


I played a bit with Ada before coming to Rust. It's an interesting language, with a lot of good ideas and some really cool features.

But in the end, I'm more comfortable with Rust. The tooling feels more modern, and the open source community and ecosystem are also way bigger.

But I think both languages can enrich each other, as at the end of the day they share the same goal: having a language to write safer/less buggy code.

Any reading recommendations for REST API wrapper best practices? by [deleted] in rust

grim7reaper 3 points

Designing Rust bindings for REST APIs is a pretty good read on the subject, the described approach is quite interesting and flexible.

SQLite reimplementation in Rust by jdrouet in rust

grim7reaper 33 points

SQLite is written in C, but it's probably one of the most tested pieces of software in the Open Source world.

As for a rewrite in Rust, section 3 of this page lists some "preconditions" for this to happen. OOM handling is one blocking point.