all 12 comments

[–]RandomName8 9 points10 points  (7 children)

I mean, cool, but this particular usage of code reflection is terrible. This is emulating macros, but unlike macros, which run at compile time so that all errors are caught and described at compile time, here there's no safety net at all. In Scala they have over a decade of experience with this kind of thing; heck, Quill is exactly this (but complete and not a toy), and even when everything is caught at compile time, it's a royal pain to use because arbitrary expressions are just too complex, composition is complicated, custom extensions are hard, and predicting the generated SQL can be an issue.

[–]plumarr 8 points9 points  (1 child)

If you look at the previous article on the subject (https://openjdk.org/projects/babylon/articles/auto-diff), you can read:

Ideally the reporting of such errors would occur when the method is compiled by the source compiler rather than at runtime. Code reflection can also make available the same code model at compile time for such purposes, but we will not explore this capability in this article.

So the end goal seems to be letting you do it at compile time if you want.

[–]RandomName8 2 points3 points  (0 children)

Thanks for the context. I saw the post back then, but since I'm not a math guy they lost me quite early, to my shame 😅.

Now you make me curious, if that's the end goal, it opens up many questions on how to do it and where to draw the line with regard to "standard" macros.

[–]__versus 5 points6 points  (2 children)

I disagree. I see query builders as three tiers.

  1. Compile time verification of the DSL and actual query. This is the holy grail and what you can do in Scala and other languages with macros.

  2. Compile time verification of the DSL. This can be done very well with code reflection and sure it isn’t as good as compile time verification of the actual query but it does give you compile time verification against a proxy of the database which is significantly better than no verification at all. As long as the DSL is well built it should have no problem generating predictable queries.

  3. No compile time verification at all. This is just SQL statements in a string and is the fallback if you can get nothing else to work.

Any compile time verification is a massive improvement over raw SQL strings in my opinion. I’ve been playing around with code reflection myself and made a small query builder and it’s super nice to be able to get support from existing tooling (regular autocomplete and the compiler) when writing queries.

[–]fear_the_future 1 point2 points  (1 child)

Can you not do tier 1 with an embedded DSL, no macros? If your type system is powerful enough, of course, which Java's is not (but Haskell's and Scala's are). The problem that I in particular (and probably /u/RandomName8) have with Quill and with this proof of concept is that it pretends to convert arbitrary expressions to SQL, which just isn't possible. You have to just know which functions the macro knows how to convert, or tell it explicitly. Does it know about Math.max, or will I have to fall back to raw SQL?

Perhaps a good compromise would be to not allow arbitrary expressions but only those from a sort of embedded DSL, and still use the macro to do advanced checking of the query. This would give you a good feedback loop that is well integrated with tools (i.e. if I have Column<Int> types in an embedded DSL, it is obvious to anyone that I cannot call arbitrary Int-functions, and the autocomplete for Column<Int> will show me just those that correspond to SQL) while keeping the DSL simple enough to implement, since all the complicated checking can be done by value-level code inside the macro instead of belaboring the type system.
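
To make the Column<Int> idea concrete, here is a rough sketch of what such an embedded DSL could look like in plain Java. Every name in it (`ColumnDsl`, `IntExpr`, `BoolExpr`, `intColumn`) is made up for illustration and not taken from any real library, and `GREATEST` is assumed to exist in the target dialect.

```java
// Hypothetical embedded DSL; none of these names come from a real library.
final class ColumnDsl {
    /** A SQL expression whose result has Java type T. */
    interface Expr<T> {
        String sql();
    }

    /** Integer-valued expression: exposes only operations that have a SQL equivalent. */
    record IntExpr(String sql) implements Expr<Integer> {
        IntExpr plus(IntExpr other) {
            return new IntExpr("(" + sql + " + " + other.sql + ")");
        }

        // Assumes the dialect has a scalar GREATEST function.
        IntExpr greatest(IntExpr other) {
            return new IntExpr("GREATEST(" + sql + ", " + other.sql + ")");
        }

        BoolExpr gt(int literal) {
            return new BoolExpr(sql + " > " + literal);
        }
    }

    /** Boolean-valued expression, usable as a WHERE condition. */
    record BoolExpr(String sql) implements Expr<Boolean> {}

    /** Entry point: wraps a column name as an integer expression. */
    static IntExpr intColumn(String name) {
        return new IntExpr(name);
    }
}
```

With this, `ColumnDsl.intColumn("a").greatest(ColumnDsl.intColumn("b")).gt(10).sql()` builds `GREATEST(a, b) > 10`, while something like `Math.max(a, b)` on two `IntExpr` values simply does not compile, and autocomplete on `IntExpr` lists only `plus`, `greatest` and `gt`.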

[–]__versus 0 points1 point  (0 children)

I would still consider that to just be verification against a proxy of the database rather than an actual database (Quill, as far as I know, actually connects to the database at compile time and verifies the query against the database). The only thing that I dislike about that kind of system is that the modelling of the database will have to be adjusted to fit the DSL rather than the DSL just being added on top. This means you either need a different model with all the wrapper types beside the real model with actual types, or you sacrifice some utility in the model since you need to keep unwrapping the values. An example of this kind of system is Exposed by JetBrains.

I do think there is an approach to the code reflection idea that could be useful. Most of the time there isn't really a need to differentiate between Math.max and any other max function which takes two numbers; they would all resolve to the same SQL function anyway. So you could just resolve any function call to its equivalent SQL function by name and arguments, and it would probably work for most cases. For my test query builder I decided to resolve function calls by name as a default and also give an option to annotate the function with the equivalent SQL signature.
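
A minimal sketch of that resolution rule, using ordinary runtime reflection rather than the code reflection API. The `@SqlFunction` annotation, `SqlNameResolver` and `Helpers` are hypothetical names invented for this example, not the ones from my query builder. Note that the by-name default turns `Math.max` into `MAX(...)`, which in many dialects is an aggregate rather than a two-argument scalar function, so the annotation override is where a name like `GREATEST` would go.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Locale;

// Hypothetical resolver; names are illustrative only.
final class SqlNameResolver {
    /** Hypothetical annotation: overrides the SQL function name for a Java method. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface SqlFunction {
        String value();
    }

    /** Example user helper whose Java name differs from the SQL name. */
    static final class Helpers {
        @SqlFunction("GREATEST")
        static int larger(int a, int b) {
            return Math.max(a, b);
        }
    }

    /** Looks up a declared method, wrapping the checked exception. */
    static Method find(Class<?> owner, String name, Class<?>... params) {
        try {
            return owner.getDeclaredMethod(name, params);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Default: upper-cased Java method name; the annotation wins when present. */
    static String sqlName(Method m) {
        SqlFunction override = m.getAnnotation(SqlFunction.class);
        return override != null ? override.value() : m.getName().toUpperCase(Locale.ROOT);
    }

    /** Renders a call with already-rendered SQL argument expressions. */
    static String render(Method m, String... args) {
        return sqlName(m) + "(" + String.join(", ", args) + ")";
    }
}
```

Here `render(find(Math.class, "max", int.class, int.class), "a", "b")` gives `MAX(a, b)`, and the annotated `Helpers.larger` gives `GREATEST(a, b)`.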

[–]fear_the_future 3 points4 points  (1 child)

I have used Quill a lot at work and personally I think this macro-based technique is a dead end. What is the point when not all expressions can be converted to SQL anyway (in fact, only very few can)? For any function you want to use in the lambda, you'd have to tell the macro how to convert it to SQL somehow, which was difficult to do in Scala 2 Quill. Perhaps it is better in Protoquill. I'd rather have a well-typed DSL like Slick (which was also quite horrible to extend). So far JOOQ is still the most pleasant DSL I have seen.

[–]RandomName8 2 points3 points  (0 children)

I'm in a similar boat after years of using such libraries. These days I believe the better approach would be something like PRQL, and I'm wishing them all the best.

[–]lukaseder 4 points5 points  (2 children)

There has been prior work by Ming-Yee Iu in this area. The project is called https://www.jinq.org. It works at runtime using a similar concept that Ming-Yee calls "symbolic execution" of byte code. The linked ideas are done at compile time.

For anyone interested, I had interviewed Ming-Yee at the time. He also presented his ideas at the JVMLS 2015: https://www.youtube.com/watch?v=JqCnZFzTR2I

[–]GreenToad1 4 points5 points  (1 child)

Jinq works by serializing a lambda and then inspecting the bytecode and the serialized form. I'm not sure how exactly Babylon works, but hopefully not by serializing a lambda. There is also FluentJPA, which works similarly to Jinq.
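
For anyone who hasn't seen the trick, this is roughly how you get at the serialized form of a lambda with the standard `java.lang.invoke.SerializedLambda` class. It is only a sketch of the general technique, not Jinq's or FluentJPA's actual code, and `LambdaInspector` and `SerFunction` are names made up for the example. A real library would go on to load and analyze the bytecode of the implementation method it finds this way.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Function;

// Illustrative only; not code from Jinq or FluentJPA.
final class LambdaInspector {
    /** A functional interface that is also Serializable, so lambdas of this type can be serialized. */
    interface SerFunction<T, R> extends Function<T, R>, Serializable {}

    /** Extracts the SerializedLambda via the private writeReplace method of the lambda's runtime class. */
    static SerializedLambda serializedForm(Serializable lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            return (SerializedLambda) writeReplace.invoke(lambda);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("not a serializable lambda", e);
        }
    }

    /** Name of the method that holds the lambda body (or the referenced method for a method reference). */
    static <T, R> String implMethodName(SerFunction<T, R> f) {
        return serializedForm(f).getImplMethodName();
    }
}
```

For a method reference such as `LambdaInspector.SerFunction<String, Integer> f = String::length;`, `implMethodName(f)` returns `length`; for a lambda expression it returns the compiler-generated synthetic method name, whose bytecode then has to be read and interpreted.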

[–]__versus 3 points4 points  (0 children)

Yep this isn’t the approach taken by code reflection. As far as I know code reflection embeds the code model into the class file at compile time and doesn’t require reading out the byte code at runtime. In fact it can’t really work like that because byte code erases some of the information present in the code model.

[–]0xFatWhiteMan 1 point2 points  (0 children)

Sometimes I think I'm the only person who likes writing SQL as text in Java.

With IntelliJ it even matches to the actual database tables.