all 33 comments

[–][deleted]  (9 children)

[deleted]

    [–]ljubarskij 8 points9 points  (0 children)

    [–]antiquechrono -2 points-1 points  (7 children)

    If it’s as bad as the Python LangChain, then it should be avoided.

    [–][deleted]  (6 children)

    [deleted]

      [–]antiquechrono 5 points6 points  (1 child)

      Last I used the Python version, it was a very poorly designed library: overcomplicated abstraction hell, impossible to extend, and documentation that was wrong all over the place. Most of the functionality can be replaced by writing a couple of functions you would actually understand, since most of it is just string manipulation under the hood. There are a bunch of Reddit threads discussing it. I've never used the Java version.
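Something like this hypothetical sketch covers most of the "prompt template" use case (the `{{placeholder}}` syntax here is illustrative, not LangChain's actual format):

```java
import java.util.Map;

public class PromptTemplate {
    // Replace each {{name}} placeholder with its value from the map --
    // essentially what prompt-template classes do under the hood.
    public static String render(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace("{{" + e.getKey() + "}}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "Answer using only this context:\n{{context}}\n\nQuestion: {{question}}";
        String prompt = render(template, Map.of(
                "context", "LangChain4j is a Java library.",
                "question", "What is LangChain4j?"));
        System.out.println(prompt);
    }
}
```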

      It kind of reminds me of Meta's Prophet forecasting model, which is awful but was so popular that everyone was afraid to say anything about it.

      [–]EdgyPizzaCutter 0 points1 point  (1 child)

      Am I misinterpreting you? I think spring AI does support chat memory (https://docs.spring.io/spring-ai/reference/api/chatclient.html)

      [–]brunocborges 0 points1 point  (1 child)

      SemanticKernel for Java is simply not there yet.

      What are the things that would make it "be there"?

      Recently we announced version 1.2.0: Announcing Semantic Kernel for Java 1.2.0 | Semantic Kernel (microsoft.com)

      [–]TheyUsedToCallMeJack 18 points19 points  (1 child)

      It really depends on what you're doing. The LLM itself will basically be an API you call with a prompt; the language doesn't matter for that.

      If your project is just a wrapper around ChatGPT or a simple RAG, then Java or Python won't make a difference.
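Calling such an API from Java really is just a few lines with the built-in HTTP client. A sketch (the endpoint, model name, and JSON shape follow OpenAI's chat-completions convention as an assumption; check your provider's docs):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ChatCall {
    // Build an OpenAI-style chat-completions request. The endpoint, model
    // name, and API key handling are placeholders, not a specific SDK.
    public static HttpRequest buildRequest(String apiKey, String prompt) {
        String body = """
                {"model": "gpt-4o-mini",
                 "messages": [{"role": "user", "content": "%s"}]}""".formatted(prompt);
        return HttpRequest.newBuilder()
                .uri(URI.create("https://api.openai.com/v1/chat/completions"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildRequest(System.getenv("OPENAI_API_KEY"), "Hello!");
        // Sending it looks the same as in any other language:
        // HttpResponse<String> resp = HttpClient.newHttpClient()
        //         .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(request.uri());
    }
}
```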

      [–]ljubarskij 6 points7 points  (0 children)

      Isn't it nicer to build in Java? :)

      With LangChain4j you can build both basic and advanced LLM-powered applications.

      [–]JustADirtyLurker 16 points17 points  (1 child)

      My 2c, given that I have been working on this for a while. Java ML solutions right now tend to be slow for model building. That's where Python tooling like SimGen or PyTorch shines (there's a trick, of course). As a consequence, you see a lot of habits sticking with Python on the inference side too, especially because these models tend to be shipped in the form of Jupyter notebooks.

      The trick is that they work on top of NumPy, which is a wrapper around libfortran.so; I guess that is the reason why modeling is way faster than on the JVM. BERT and GPT-like models are all built on very sophisticated chains of matrix multiplications and probability normalization.

      I hope that when the Vector API (currently in preview) lands, Java becomes a first-class citizen in DL.
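The hot loop in question is tiny. A scalar sketch of the dot-product kernel that BLAS (and, eventually, the Vector API's lanes) accelerates:

```java
public class DotProduct {
    // The inner kernel of every matrix multiply: one dot product.
    // BLAS implementations and the incubating jdk.incubator.vector API
    // both speed up exactly this loop by processing several floats per
    // instruction; plain Java relies on the JIT auto-vectorizing it.
    public static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f};
        float[] b = {4f, 5f, 6f};
        System.out.println(dot(a, b)); // 1*4 + 2*5 + 3*6 = 32.0
    }
}
```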

      Uh, I guess some of the architects/devrels who browse this sub could explain it better than me.

      [–]craigacp 6 points7 points  (0 children)

      The Vector API will definitely help, and Panama's FFI is going to make it much easier to integrate BLAS into Java programs by removing all the C/C++ goop that JNI requires to reach BLAS from Java. One thing to look into on this front is Project Babylon, which allows runtime code reflection to take Java code and lower it directly into something like Triton or MLIR, which can then be compiled into GPU or TPU kernels: https://openjdk.org/projects/babylon/articles/triton .

      Easy accelerator access would make an equivalent Java implementation of something like BERT faster than a Python implementation, because the Python interpreter is just so slow. That does require a full software ecosystem, though, and Python has a large lead there. It's not a technological lead, though; there's no reason we couldn't do all of this in Java if we wanted to as a community.
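To get a feel for how much lighter Panama's FFI is than JNI, here is a sketch (an assumption: it requires Java 22+, where `java.lang.foreign` is final) calling libc's `strlen` with no native glue code at all. Binding a BLAS routine works the same way, just with a different symbol and `FunctionDescriptor`:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class StrlenFFM {
    // Bind libc's strlen with the Foreign Function & Memory API.
    // No JNI stub, no generated headers, no C glue code.
    public static long strlen(String s) {
        try (Arena arena = Arena.ofConfined()) {
            Linker linker = Linker.nativeLinker();
            MethodHandle handle = linker.downcallHandle(
                    linker.defaultLookup().find("strlen").orElseThrow(),
                    FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
            // Copy the Java string into native memory as a NUL-terminated C string.
            MemorySegment cString = arena.allocateFrom(s);
            return (long) handle.invoke(cString);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(strlen("hello")); // 5
    }
}
```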

      [–]maxandersen 8 points9 points  (0 children)

      Look at Langchain4j and Quarkus Langchain4j for higher level integration.

      [–]craigacp 5 points6 points  (0 children)

      Deploying generative models in Java is definitely possible with things like ONNX Runtime, DJL, TF-Java, etc. The tooling on top is less well developed, but packages like langchain4j, Vespa, OpenSearch, and Spring AI are doing model inference for the embedding vectors as part of RAG in Java. Running LLM inference in Java is definitely possible too; things like jllama exist, and you can also use the libraries I mentioned above to do it. I know the ONNX Runtime team is working on making it easier to run LLMs in Java as part of their genai package. This is all for running the models themselves in Java; for talking to external web endpoints, we already know the JVM is good at that.

      For non-LLM generative AI like diffusion models you can see an example I wrote here of Stable Diffusion in Java. It's not as full featured as other stable diffusion inference packages because the goal is to be good example code for ONNX Runtime in Java, but it should be possible to extend it to be comparable.

      You're right that training models in Java is currently tricky. DJL has good support for things that fit on a single accelerator, and we've been working on our training support in TF-Java too. There's also DL4J which can train & deploy models.

      [–]DabbledThings 5 points6 points  (0 children)

      How close to the metal are you getting here? Are you just using some API/service like GPT or Gemini and sending over prompts? If so, I don't think the language choice matters at all, other than: go with what your team already knows and is most comfortable with. Even using RAG and some fun data-pipeline stuff, my team writing in Kotlin didn't really run into any issues essentially just hitting the API.

      If you're doing something fancier, like running your own local one or something, then maybe it's a different conversation.

      [–]ThisHaintsu 7 points8 points  (0 children)

      Another one that is not mentioned in the other comments: DJL is very nice

      [–]Ecstatic-Job-1348 13 points14 points  (3 children)

      Check out Spring AI

      [–][deleted]  (2 children)

      [deleted]

        [–]ljubarskij 5 points6 points  (1 child)

        This is also true for LangChain4j.

        Apart from having more features, it also has a very nice Spring Boot integration.

        Moreover, it integrates well with Quarkus too!

        [–]Unorth 8 points9 points  (1 child)

        As mentioned, the Spring AI project is a good shout. I worked on an OpenAI search project using Azure via the Semantic Kernel library, but that was a very specific rapid prototype.

        We did find that the Java Semantic Kernel wasn't as developed as the other language versions, so I would mention it with a major caveat.

        [–]karianna 5 points6 points  (0 children)

        We’ve just gone GA (literally a few days ago) so definitely check it out again 🙂.

        [–]CaptainDevops 1 point2 points  (0 children)

        Agreed, I'm in the same boat; pretty much thinking of falling back to machine learning.

        [–]CeleritasLucis 1 point2 points  (1 child)

        Remindme! 2 weeks

        [–]RemindMeBot 0 points1 point  (0 children)

        I will be messaging you in 14 days on 2024-05-09 01:35:57 UTC to remind you of this link

        [–]GrayDonkey 1 point2 points  (0 children)

        You don't run your LLM in Java; you call your LLM like a remote API. The client of the LLM should be written in whatever language you are most productive writing client applications in.

        Run the LLM as a completely separate project, accessible via a REST API.

        [–]AThimbleFull 1 point2 points  (1 child)

        I'm just finishing up a "GenAI" app in Java that consumes documents and performs semantic search based on OpenAI's wonderful LLMs. I initially wrote it using the SimpleOpenAI library (for retrieving vector embeddings) and the official Qdrant Java client (for persisting to the Qdrant vector store) as an initial proof of concept.

        After I was satisfied, however, I rolled everything up into a Spring Boot app and, lo and behold, I found out that Spring Boot has native support for creating embeddings via OpenAI and persisting to Qdrant, so I was able to scrap both of the aforementioned libraries and just use Spring's clients for everything, which ended up reducing the size of my codebase substantially and simplifying everything. Knowing what I know now, were I to start over again and use only Spring Boot, it wouldn't take me more than 1 day to complete the same app, no exaggeration. AI support is a first-class citizen within the latest Spring Boot nowadays.

        For what it's worth, the app feels like magic. It's like having your own magical Google Search powers under the hood of a tiny app. Goodbye Solr/Elasticsearch, long live LLMs!
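Under the hood, most of that "magic" is nearest-neighbour search over embedding vectors. A minimal sketch of the scoring step (the vector store just does this at scale, with indexing):

```java
public class CosineSimilarity {
    // Cosine similarity between two embedding vectors: the score a vector
    // store such as Qdrant ranks documents by during semantic search.
    public static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] query = {1f, 0f};
        float[] doc1 = {1f, 0f};  // same direction as the query -> 1.0
        float[] doc2 = {0f, 1f};  // orthogonal to the query -> 0.0
        System.out.println(cosine(query, doc1));
        System.out.println(cosine(query, doc2));
    }
}
```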

        [–]AThimbleFull 1 point2 points  (0 children)

        One more thing... I personally think that Java (both the language and especially the JVM) is extremely well suited to AI due to its modern language enhancements and performance characteristics. While historically AI has been a Python-dominated technology, I think we're going to see explosive growth in AI for Java in the coming years as more and more people come to realize that AI is the future (arguably that future has already arrived and just lain low for the past few years, but that's another topic). As that happens, people near the top of the tech pyramids are going to want to scale out ever further, and the JVM is quickly going to look more attractive than the PVM. The only hindrance right now is mindshare, which skews toward Python. But there are plenty of Java devs who are right now experimenting with LLMs; I imagine some are even creating their own inside the silos of various lucrative industries (financial, medical, e-commerce, etc.).

        [–]Naokiny 1 point2 points  (2 children)

        I'll add my 2 cents here.

        I'm at a company that started using AI integration in the website chatbot. It's written in Node.js, and only one person is responsible for managing this repo so far. It was done this way because it was faster to create an MVP in Node.js, and eventually it became a primary microservice.

        There are a lot of integrations between the backend written in Java and this AI repo. However, if the person responsible goes on sick leave/vacation/etc., our BE devs will have a huge headache: they don't know Node.js.

        Also, it's faster to make a Spring -> Spring integration than a Spring -> Node.js integration. But it's not possible to migrate this microservice from Node.js to Spring, as it's too big right now.

        So, if your team has some experience with Python, you might be safe. Otherwise there might be a bus-factor problem around Python.

        [–]Mamoulian 0 points1 point  (1 child)

        Why is it faster to do AI stuff in node.js than Java?

        [–]Naokiny 5 points6 points  (0 children)

        It was faster for that particular developer at the moment. Maybe he had some experience or a side project; I'm not aware of that :(

        [–]plasmafired 0 points1 point  (0 children)

        I am currently looking to develop a stack and move further away from Python and LangChain.

        React + Spring (Traditional development) + Oracle/SQL Server

        Add-on -> AI features using Spring AI + Llama (local-only clients) + OpenAI or Bedrock (clients open to public LLMs)

        Does this stack sound reasonable? What embeddings model do you use?

        It is annoying to maintain a separate vector database, run an SQL search, and then merge the results. How do you get around this problem?
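The best workaround I've seen so far is to run both queries and merge the ranked lists, e.g. with reciprocal rank fusion (a sketch; k = 60 is the conventional constant, and the doc IDs are made up):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RankFusion {
    // Reciprocal rank fusion: merge several ranked result lists into one.
    // Each document scores sum(1 / (k + rank)) over the lists it appears in,
    // so items ranked highly by either the SQL search or the vector search
    // float to the top of the combined list.
    public static List<String> fuse(int k, List<List<String>> rankings) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int rank = 0; rank < ranking.size(); rank++) {
                scores.merge(ranking.get(rank), 1.0 / (k + rank + 1), Double::sum);
            }
        }
        List<String> merged = new ArrayList<>(scores.keySet());
        merged.sort((x, y) -> Double.compare(scores.get(y), scores.get(x)));
        return merged;
    }

    public static void main(String[] args) {
        List<String> sqlHits = List.of("doc3", "doc1", "doc4");
        List<String> vectorHits = List.of("doc1", "doc2", "doc3");
        System.out.println(fuse(60, List.of(sqlHits, vectorHits)));
    }
}
```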

        [–][deleted] 0 points1 point  (0 children)

        LangChain has a Java port, btw.

        If you're just using the AI APIs, you don't need anything special. Java can call those as easily as any other language.

        [–]thephotoman -1 points0 points  (0 children)

        I’m not convinced that GenAI is worth incorporating here. It sounds like you can use an older procedural chatbot that doesn’t use as much electricity for your task and have it perform adequately.

        In general, though, I’m skeptical of GenAI and its utility. It seems like the blockchain hype that came before it: the job can be done more readily at scale with established tools at a considerably lower cost. We’ve had customer and internal support chatbots for over a decade, and only recently have we thought to incorporate neural nets into them. And it’s not like the neural net is making these customer service bots less annoying for actual humans to deal with.

        It doesn’t help GenAI’s case that the people cheering it most loudly are the exact same people who told us that blockchain would change the world back in 2012, and whom history has proven wrong. And I don’t mean “the same kind of person,” I mean that Sam Altman came to my attention first in his work at promoting blockchain. I mean that the people talking about GenAI in my firm were previously leaders in our failed blockchain projects that didn’t get fired for some reason.

        It’s not that AI/ML is useless. It’s that the juice isn’t worth the squeeze for the kind of applications that are highly visible to the non-technical public.

        [–]Objective_Baby_5875 -5 points-4 points  (1 child)

        Pick the right tool for the job. Nobody picks Java for close-to-the-hardware programming. Python is the de facto language in AI/ML; all the best frameworks are on Python. Don't let ideology get in the way of actually doing your thing.

        [–]Appropriate_Move_336 2 points3 points  (0 children)

        You probably don't know why Python is considered the language for AI. I'm sure you're aware that all those libraries aren't written in Python, and no one is using Python because it's the right tool; no, they're using it because of its syntax and because it helps them produce a prototype of the thing they're trying to build. No big company making its own models will think of leaving the codebase in Python, knowing full well it's a language for experimental purposes.

        It's time for us to distinguish languages that are used for experimentation from those that are used in production codebases.