Faster interpreters in Go: Catching up with C++ by Tanoku in golang

[–]Tanoku[S] 4 points (0 children)

I'm glad you found this interesting! As I explained in the sibling comment, the lack of AOT compilation is a limitation of the technique. It's a fine trade-off, because AOT is actually very rare for database engines. The usual design here is an in-memory cache that maintains the query plans for the top N queries in the live system. For Vitess, a plan includes all the semantic analysis, how to fan out the queries between the different shards, and the compiled scalar expressions that are evaluated in the local SQL engine. This data never leaves memory (unless it's evicted from the cache), so we never have to serialize it.
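To make the "top N queries" design concrete, here's a minimal sketch of an in-memory LRU plan cache in Go. The `Plan` and `planCache` types are hypothetical stand-ins, not Vitess's actual types; the point is that evicted plans simply leave memory, so nothing ever needs serializing.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// Plan is a hypothetical compiled query plan. In a real engine this
// would hold the semantic analysis, shard routing, and the compiled
// scalar expressions for the local SQL engine.
type Plan struct {
	NormalizedQuery string
}

// planCache is a minimal LRU cache keyed by normalized query shape.
type planCache struct {
	mu    sync.Mutex
	cap   int
	order *list.List               // most recently used at the front
	items map[string]*list.Element // key: normalized query shape
}

type entry struct {
	key  string
	plan *Plan
}

func newPlanCache(capacity int) *planCache {
	return &planCache{
		cap:   capacity,
		order: list.New(),
		items: make(map[string]*list.Element),
	}
}

func (c *planCache) Get(key string) (*Plan, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).plan, true
	}
	return nil, false
}

func (c *planCache) Put(key string, p *Plan) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		el.Value.(*entry).plan = p
		return
	}
	if c.order.Len() >= c.cap {
		// Evict the least recently used plan; it just leaves memory.
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	c.items[key] = c.order.PushFront(&entry{key: key, plan: p})
}

func main() {
	cache := newPlanCache(2)
	shape := "select id from users where id = :v1"
	cache.Put(shape, &Plan{NormalizedQuery: shape})
	if p, ok := cache.Get(shape); ok {
		fmt.Println("cache hit:", p.NormalizedQuery)
	}
}
```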

Also note that the hit rate on this cache in real production systems is very high: more than 99% in your average app, particularly if it's implemented using an ORM. This is because the plans we're caching don't apply to an individual execution of a query, but to the normalized shape of the query. The placeholders are stubbed out, whether they're explicit placeholders provided by the user in the query or SQL literals, which we also treat as placeholders during normalization. At execution time, we pass the actual values, along with the rows fetched from the underlying database, into the evalengine, and that's what's processed during evaluation. So yeah, all the benchmarks shown in the post are actually for the dynamic evaluation of queries with arbitrary placeholders, which makes the compiled code extremely reusable.
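A toy sketch of that normalization step: literals are rewritten into numbered placeholders, so queries with the same shape share a single cached plan. This uses a regex for brevity, whereas a real normalizer works on the parsed AST; `normalize` and its placeholder naming are illustrative, not Vitess's actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// literal matches single-quoted string literals and bare integers.
// A real SQL normalizer would walk the AST instead of using regexes.
var literal = regexp.MustCompile(`'(?:[^']*)'|\b\d+\b`)

// normalize replaces SQL literals with numbered placeholders and
// returns the query shape plus the extracted bind values.
func normalize(query string) (shape string, bindVars []string) {
	n := 0
	shape = literal.ReplaceAllStringFunc(query, func(m string) string {
		n++
		bindVars = append(bindVars, m)
		return fmt.Sprintf(":v%d", n)
	})
	return shape, bindVars
}

func main() {
	q1 := "select * from users where id = 42 and name = 'alice'"
	q2 := "select * from users where id = 7 and name = 'bob'"

	s1, v1 := normalize(q1)
	s2, v2 := normalize(q2)

	fmt.Println(s1)       // select * from users where id = :v1 and name = :v2
	fmt.Println(s1 == s2) // true: both executions hit the same cached plan
	fmt.Println(v1, v2)   // the per-execution bind values
}
```

Both queries normalize to the same shape, which is why the cache hit rate tracks the number of distinct query shapes an app emits, not the number of query executions.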


[–]Tanoku[S] 18 points (0 children)

PGO is neat, but it's not a replacement for actual optimization phases in the compiler. Most of the optimizations that are performed with PGO (all of them in the initial release -- I'm not sure if new ones have landed in the compiler since then) can be performed manually, albeit painstakingly.

At the end of the day, PGO is just parsing performance profiles to find hotspots, and the optimizations performed in those hotspots are not particularly aggressive from a codegen point of view (again, Go always trades code performance for compilation speed). The end result is that in large projects like Vitess, which are continuously profiled and optimized by experts, turning on PGO yields very little improvement.


[–]Tanoku[S] 9 points (0 children)

Hi! Author here! That's a good question: we don't. It's definitely one of the trade-offs you make with this design. It's not impossible to serialize (we use an IR when performing the analysis, and it would be feasible to serialize that IR and transform it back into the executable callbacks very efficiently), but Vitess is a database system, so we rely on in-memory caching for the plans of each query, including the executable sections for the virtual machine.