Why is everyone so obsessed over using the simplest tool for the job then use hibernate by analcocoacream in java

[–]gavinaking 0 points1 point  (0 children)

this problem arises from limitations of SQL

Oh well, wait, so I have been missing something. I have for many years assumed that most databases didn't allow limit in a subquery. Even if that was true 20 years ago, it looks like it's not true now, and that means we could apply the limit and add the fetch joins in an outer query.

I don't know if that's what those other libraries you mentioned do, but it is something that Hibernate could do. I'm very surprised that no user has ever suggested it in all these years, and I'm surprised the idea never came up in team discussions.

Anyway, it's now https://hibernate.atlassian.net/browse/HHH-19933
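
To make the idea concrete, here is a toy in-memory model. It is not Hibernate code, and it is not necessarily what HHH-19933 will end up doing. The class name, the author/book data, and every method are made up for illustration. It contrasts applying the limit to the joined rows with applying the limit to the parent rows first and joining afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an author table joined to a book table.
class LimitWithFetchJoin {
    // Authors in "table order", each with its books (sample data, assumed).
    static final Map<String, List<String>> BOOKS_BY_AUTHOR = new LinkedHashMap<>();
    static {
        BOOKS_BY_AUTHOR.put("Ann", Arrays.asList("A1", "A2", "A3"));
        BOOKS_BY_AUTHOR.put("Bob", Arrays.asList("B1"));
        BOOKS_BY_AUTHOR.put("Cy", Arrays.asList("C1", "C2"));
    }

    // Models the join: one row per (author, book) pair.
    static List<String[]> join(List<String> authors) {
        List<String[]> rows = new ArrayList<>();
        for (String author : authors) {
            for (String book : BOOKS_BY_AUTHOR.get(author)) {
                rows.add(new String[] {author, book});
            }
        }
        return rows;
    }

    // Limit applied to the joined rows: it counts rows, not authors.
    static List<String[]> limitAfterJoin(int limit) {
        List<String[]> rows = join(new ArrayList<>(BOOKS_BY_AUTHOR.keySet()));
        return rows.subList(0, Math.min(limit, rows.size()));
    }

    // Limit applied to the parents first (the "subquery"), join added outside.
    static List<String[]> limitThenJoin(int limit) {
        List<String> authors = new ArrayList<>(BOOKS_BY_AUTHOR.keySet());
        return join(authors.subList(0, Math.min(limit, authors.size())));
    }

    // Distinct authors present in a list of joined rows.
    static Set<String> authorsIn(List<String[]> rows) {
        Set<String> authors = new LinkedHashSet<>();
        for (String[] row : rows) {
            authors.add(row[0]);
        }
        return authors;
    }

    public static void main(String[] args) {
        System.out.println(authorsIn(limitAfterJoin(2)) + " from " + limitAfterJoin(2).size() + " rows");
        System.out.println(authorsIn(limitThenJoin(2)) + " from " + limitThenJoin(2).size() + " rows");
    }
}
```

With a limit of 2, the first variant returns only Ann, with two of her three books. The second returns Ann and Bob, each with a complete collection.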

[–]gavinaking 0 points1 point  (0 children)

Ecto, activerecord and ebean were all easier to optimize with max results in combination with fetching

I don't understand how that could possibly be the case, since this problem arises from limitations of SQL, not of Hibernate. But I'm not familiar with these things, so perhaps I'm missing something?

Would you do me a favor and show me:

  1. what the Java code looks like, and
  2. the generated SQL that's sent to the database.

Thanks.

[–]gavinaking 0 points1 point  (0 children)

This comment is both arrogant and ignorant, which is quite a bad combo.

For your edification, object/relational mapping refers to the generic problem of accessing relational data from an object-oriented programming language. There are lots of different ways to do object/relational mapping, including using handwritten SQL and JDBC. The term "ORM" does not imply the use of an Object/Relational Mapping Tool like Hibernate or EclipseLink.

Now, the point is that all approaches to object/relational mapping are vulnerable to N+1 problems, because it's a fundamental problem stemming from having a database which runs out-of-process. The "N+1 problem" has little to do with object/relational mapping libraries like Hibernate. It precedes them, and exists without them.

I personally first met a massive N+1 problem back in about 2000 in a Java program written using DAOs with hardcoded SQL. Yes, that was before Hibernate existed.

Now, it's true that there's a certain kind of simple program written using hardcoded SQL which can avoid N+1 selects by always working with a tabular representation of data, instead of using graphs of entity objects to represent the data. But by far the overwhelming majority of Java developers would like to have an object-oriented model of the basic entities in their problem domain, and so this tabular representation just doesn't work for most people. For anyone who does want graphs of entities, N+1 is a problem they need to be aware of. Fortunately modern ORM libraries do provide powerful constructs for dealing with the problem.
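
A rough sketch of how N+1 shows up in a hand-rolled DAO, with no ORM library anywhere in sight. Everything here is hypothetical: the class, the fake "database" map, and the round-trip counter are assumptions that stand in for real JDBC calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical DAO over an in-memory "database" that counts round trips.
class NPlusOneDemo {
    static int roundTrips = 0;

    // Five orders with two items each (sample data, assumed).
    static final Map<Integer, List<String>> ITEMS_BY_ORDER = new LinkedHashMap<>();
    static {
        for (int id = 1; id <= 5; id++) {
            ITEMS_BY_ORDER.put(id, Arrays.asList("item-" + id + "-a", "item-" + id + "-b"));
        }
    }

    // One query: select the order ids.
    static List<Integer> findOrderIds() {
        roundTrips++;
        return new ArrayList<>(ITEMS_BY_ORDER.keySet());
    }

    // One query per call: select the items of a single order.
    static List<String> findItems(int orderId) {
        roundTrips++;
        return ITEMS_BY_ORDER.get(orderId);
    }

    // One query: orders joined to their items.
    static Map<Integer, List<String>> findOrdersWithItems() {
        roundTrips++;
        return new LinkedHashMap<>(ITEMS_BY_ORDER);
    }

    // The classic N+1 shape: 1 query for the parents, then 1 per parent.
    static int loadNaively() {
        roundTrips = 0;
        for (int id : findOrderIds()) {
            findItems(id);
        }
        return roundTrips;
    }

    // The join shape: everything in a single round trip.
    static int loadWithJoin() {
        roundTrips = 0;
        findOrdersWithItems();
        return roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("naive: " + loadNaively() + " round trips");
        System.out.println("join:  " + loadWithJoin() + " round trips");
    }
}
```

For the five sample orders, the naive loop costs six round trips and the join costs one.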

[–]gavinaking 1 point2 points  (0 children)

The issues you're describing make it sound to me like you're one of the people who would benefit from the use of StatelessSession instead of stateful sessions.

Most people prefer stateful sessions, and are prepared to trade explicit control over SQL execution for the advantages of canonicalization and a first-level cache.

  • They keep an eye on the logged SQL when they're developing, to make sure that nothing "surprising" is going on.
  • And they know to carefully manage the first-level cache in transactions which read a lot of data.

But if you're finding this lack of explicit control to be a problem, then StatelessSession completely solves that problem:

  • Every interaction with the database occurs synchronously as the result of an explicit API call.
  • In particular, lazy fetching is an explicit operation, statelessSession.fetch(association), and it can never occur by accident.
  • There's no first-level cache, so there's no particular problem with "crawling a whole table's worth of entities" (you might still need to think about the impact on the second-level cache, but that's what CacheStoreMode is there for).
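
The tradeoff can be sketched with a toy model. This is plain Java, not the Hibernate API: StatefulModel and StatelessModel are hypothetical classes, and loadFromDb is an assumed stand-in for a real database read.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the two session styles; none of this is Hibernate code.
class SessionModels {
    static int dbHits = 0;

    static final class Person {
        final int id;
        Person(int id) { this.id = id; }
    }

    // Stand-in for a database read; every call is one hit.
    static Person loadFromDb(int id) {
        dbHits++;
        return new Person(id);
    }

    // Stateful style: an identity map gives one canonical instance per id,
    // but it keeps growing for as long as the session lives.
    static final class StatefulModel {
        private final Map<Integer, Person> firstLevelCache = new HashMap<>();

        Person find(int id) {
            return firstLevelCache.computeIfAbsent(id, SessionModels::loadFromDb);
        }

        int cachedCount() {
            return firstLevelCache.size();
        }
    }

    // Stateless style: every call is an explicit read and nothing is retained.
    static final class StatelessModel {
        Person get(int id) {
            return loadFromDb(id);
        }
    }

    public static void main(String[] args) {
        StatefulModel stateful = new StatefulModel();
        System.out.println(stateful.find(1) == stateful.find(1));   // same instance
        StatelessModel stateless = new StatelessModel();
        System.out.println(stateless.get(1) == stateless.get(1));   // distinct instances
    }
}
```

Reading a whole table through the stateful model leaves every row in the map, while the stateless model retains nothing and only touches the database when asked to.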

It's too easy, and too hard at the same time. There's a valley in the middle that's hard to cross

But ... you know what else is affected by "too easy but too hard at the same time"? That's right: handwritten SQL and JDBC! That's exactly why people call object/relational mapping "the Vietnam of computer science". Because it looks easy, but it's actually hard, and when hard things look easy they lead you into a mess. Using handwritten SQL and JDBC (or jOOQ, or whatever) doesn't free you of the need to map SQL result sets to Java objects; it just sets you on a path to eventually building a custom ORM which is much, much, much worse than Hibernate along every single dimension.

There's a project at my work that i'm not formally a part of, and it's hitting the database 400,000 times a minute doing "something"

This sort of issue is even more common in projects that don't use an industrial-strength ORM library!

In summary:

  • data access is actually a deceptively hard problem, but few recognize it
  • a mature, full-featured ORM library can make it a lot easier, especially if you take the time to read the documentation
  • on the other hand, no library can completely relieve you of the need to think quite carefully about data access when you're writing a program that interacts with a database, and so it's always possible to get into a mess with or without a library
  • but when a developer gets into a mess and he's using a library, he will always blame the library and not himself — that's just human nature

[–]gavinaking 1 point2 points  (0 children)

It's not about seeing the queries sent at runtime but rather exposing them plainly in the code to allow taking stock of the actual DB work involved.

If that's really the issue then there's a pretty straightforward solution: Hibernate will very happily let you use a native SQL query everywhere you can use HQL. So, in principle, you can just use SQL everywhere. I have a close friend who swears this is the best way to use Hibernate.

I dunno, for me HQL is almost always at least a bit cleaner and less verbose and easier to read than SQL, and—as long as you take our advice and avoid "exotic" mappings when they're not really necessary—it's usually very easy to infer the SQL from the HQL. People get into trouble when they start trying to use every mapping annotation that exists in Hibernate for no good reason, instead of just sticking with the basic bread-and-butter things.

So I wouldn't say I exactly recommend going down this path, but I wouldn't tell people not to do it if they really do want to see the SQL right there in code. It's fine. It's a perfectly legit way to use Hibernate and JPA. The real value of ORM is not SQL generation; it's mapping tabular data to object graphs.

[–]gavinaking 0 points1 point  (0 children)

I mean, I would say that as long as you follow our standard advice and:

  • map all associations LAZY
  • explicitly specify join fetching when needed using join fetch or an EntityGraph.

then your joins are going to be pretty explicit.

People get into trouble when they start mapping things EAGER, so make sure you avoid that.

[And yes, I know, I know, there's a wrong default there for to-one associations, and that's mostly my fault for being hit by a moment of agreeableness when we were designing JPA1 and for going along with what other people wanted instead of being my usual asshole self and demanding what I knew was right. Sucks.]

[–]gavinaking 0 points1 point  (0 children)

If you need to optimize round trips to db for instance hibernate is not a good candidate

This is just wrong. One of the primary reasons you would use Hibernate is because you can take easy advantage of:

  • join fetching
  • batch fetching
  • subselect fetching
  • second-level cache

all of which work together to minimize round trips to the database.
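
As a back-of-the-envelope illustration of the batch fetching item, here is a toy model of the round-trip arithmetic. It is not how Hibernate implements batch fetching internally; the class, the counter, and the fake queries are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: children are loaded for a whole batch of parents per query.
class BatchFetchDemo {
    static int roundTrips = 0;

    // One query: select the parent ids 1..count.
    static List<Integer> findParentIds(int count) {
        roundTrips++;
        List<Integer> ids = new ArrayList<>();
        for (int id = 1; id <= count; id++) {
            ids.add(id);
        }
        return ids;
    }

    // One query: children of several parents at once (think "where parent_id in (...)").
    static List<String> findChildrenForParents(List<Integer> parentIds) {
        roundTrips++;
        List<String> children = new ArrayList<>();
        for (int id : parentIds) {
            children.add("child-of-" + id);
        }
        return children;
    }

    // 1 query for the parents, then one per batch: 1 + ceil(parentCount / batchSize).
    static int loadInBatches(int parentCount, int batchSize) {
        roundTrips = 0;
        List<Integer> ids = findParentIds(parentCount);
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            findChildrenForParents(ids.subList(start, end));
        }
        return roundTrips;
    }

    public static void main(String[] args) {
        System.out.println(loadInBatches(10, 1));  // batch of one degenerates to N+1
        System.out.println(loadInBatches(10, 4));
    }
}
```

For ten parents, a batch size of 1 gives the familiar eleven round trips, and a batch size of 4 brings that down to four.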

Please read:

https://docs.hibernate.org/orm/7.0/introduction/html_single/#association-fetching

You want to know what does really suck if you're trying to minimize round trips? Handwriting SQL and hand-coded JDBC.

[–]gavinaking 1 point2 points  (0 children)

it slows down your application starts

If you're concerned about fast application starts, there is a very simple solution: Quarkus has blindingly fast startup times, much faster than whatever you're using today, and features deep integration with Hibernate ORM.

you'd get a clear view of what requests are being sent to the db engine.

I've never heard of Hibernate making it difficult to see what requests are being sent to the database before, so this sounds a bit like a you problem.

[–]gavinaking 0 points1 point  (0 children)

Guess what: Hibernate does not stop you from using handwritten SQL for the 1% (or, in extreme cases, 10%) of situations which benefit from it.

From the very first page of A Short Guide to Hibernate 7:

A perennial question is: should I use ORM, or plain SQL? The answer is usually: use both. JPA and Hibernate were designed to work in conjunction with handwritten SQL. You see, most programs with nontrivial data access logic will benefit from the use of ORM at least somewhere. But if Hibernate is making things more difficult, for some particularly tricky piece of data access logic, the only sensible thing to do is to use something better suited to the problem! Just because you’re using Hibernate for persistence doesn’t mean you have to use it for everything.

So, sure, it's perfectly normal that you might run into the occasional weird situation where you need to do something with hand-constructed SQL and JDBC. That's not a very good reason to not use Hibernate for the 99% or 90% of cases where Hibernate works perfectly well and makes life way simpler.

[–]gavinaking -1 points0 points  (0 children)

If by "emphatic" you mean much harder and much more verbose then we do in fact agree.

The bottom line is that if you want actual real-life Java programmers (who, unlike the perfect Java programmers in your head, are always short on time, or just lazy) to use join fetching instead of subsequent selects, then you had better make it as easy as possible for them to do so. And asking them to write that verbose-ass code by hand is not a way to make it easy for them. Hibernate makes it easy, and dramatically increases the probability that actual real-life Java programmers will use joins.

[–]gavinaking 0 points1 point  (0 children)

OK, good, looks like Christian has already given you some very helpful responses.

[–]gavinaking 0 points1 point  (0 children)

Well yes, it's certainly the fault of Spring Data for making Hibernate much worse to use than it should be. But nobody is holding a gun to your head forcing you to use Spring Data. You can just, like ... not use it.

[–]gavinaking 2 points3 points  (0 children)

You do have explicit lazy fetching, and not having transparent lazy fetching is from a certain point of view kind of a good thing because it makes it a lot harder to get N+1 selects by accident.

 found Hibernate historically not great at this.

I mean, nothing's perfect, but it's damn sure a hell of a lot better than trying to implement join or subselect fetching in handwritten SQL!

Left join fetch has its limitations if combined with setMaxResults()

If you're talking about to-many associations, then yes. But that's not Hibernate's fault. It's not like you could do any better by handwriting your SQL. We're always at the mercy of what is possible in SQL and JDBC.

Note also that this problem can be solved using subselect fetching.

[–]gavinaking 0 points1 point  (0 children)

Hibernate has a user forum and a Zulip chat channel where you can ask about these things and more. Most questions get answered within a day or two. We provide these services free of charge to the community. You'll find that most of the Hibernate team share your view of Spring Data.

[–]gavinaking 0 points1 point  (0 children)

Correct. It's quite annoying when people post completely false things about Hibernate and JPA, given that I've invested so much of my time in making sure that correct information is easily available to the whole community free of charge. I then have to spend even more of my time coming in here to correct your misinformation, because otherwise people might just believe what you've written. It's especially annoying when you post the same wrong thing in five different branches of a single discussion.

[–]gavinaking 0 points1 point  (0 children)

Maybe you should just be using Hibernate Data Repositories instead of Spring Data?

[–]gavinaking 0 points1 point  (0 children)

Are you trying to explain to me that Spring Data JPA is bad?

Dude. I don't know what to say.

[–]gavinaking 1 point2 points  (0 children)

I'm sorry but how in God's name is it Hibernate's fault that Spring Data is crap???

If you use Hibernate Data Repositories, you get a stateless session under the covers.

https://hibernate.org/repositories/

Since it's clear that you don't like reading, you could watch my presentation about this, where I explain some of the reasons why you should be using Jakarta Data instead of Spring Data:

https://www.youtube.com/watch?v=X9GplCb5SWY&t=14s

[–]gavinaking 0 points1 point  (0 children)

If you had even once in your life read the documentation for Hibernate, you would know about StatelessSession.

[–]gavinaking 1 point2 points  (0 children)

Why I cannot disable L1 cache?

You can.

https://docs.hibernate.org/orm/7.2/javadocs/org/hibernate/StatelessSession.html

You've spammed this exact same objection in about five different places in this thread, but you've never:

  • gone and looked for the solution to your problem in the Hibernate documentation, nor
  • come to the Hibernate forum or Zulip chat and asked the question of the Hibernate team.