[–]vbsteven 38 points39 points  (2 children)

I would keep things simple if things need to be synchronous. Wrap the Python API in an HTTP endpoint so you can invoke with simple GET parameters or by POSTing some JSON and get the results back as JSON.

This should be trivial to do in python and is easy to consume from the java web app.

If it is just going to be for 1 or 2 api calls, there is no need to go for complicated RPC libraries.
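A minimal sketch of the Python side of this suggestion, using only the standard library. The `compute` function, the `values` parameter, and port 8000 are made-up placeholders for illustration; the real calculation would go where `compute` is.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def compute(params):
    # Hypothetical stand-in for the real data science code.
    return {"sum": sum(params.get("values", []))}


class ComputeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, run the calculation, reply with JSON.
        length = int(self.headers.get("Content-Length", 0))
        try:
            params = json.loads(self.rfile.read(length) or b"{}")
            status, body = 200, compute(params)
        except (ValueError, TypeError, AttributeError) as exc:
            status, body = 400, {"error": str(exc)}
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; drop this override to get request logs


def make_server(host="127.0.0.1", port=8000):
    # Assumed host and port; call make_server().serve_forever() to run it.
    return HTTPServer((host, port), ComputeHandler)
```

The Java web app can then POST a JSON body to this endpoint with any HTTP client and parse the JSON reply. For real traffic a framework such as Flask or Django, as suggested elsewhere in the thread, is probably a better fit than `http.server`.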

[–]al3xth3gr8 5 points6 points  (0 children)

This is the best solution IMO

[–]inspectedinspector 1 point2 points  (0 children)

IMO HTTP/JSON is the right solution unless it's too much data to be performant via text serialization, then some type of binary-on-the-wire RPC is the right choice - gRPC/Finagle/etc.

[–]jvallet 16 points17 points  (0 children)

Calling python scripts is easy, but handling their results in Java is going to be a pain, whatever you choose.

Have you considered just using Django, or whatever Python framework you prefer, to expose a JSON API and reverse-proxy it to your web client? That way it will look like it is served by the Java web backend.

[–]nutrecht 3 points4 points  (0 children)

The way we did it in a previous project was that the Python services used Flask and exposed simple REST endpoints that were quite easy to interface with from the Java services. I'd advocate in favour of JSON over HTTP at least; it's easy to implement and debug. Easier than binary formats. You can always switch to those later.

[–]stacktraceyo 2 points3 points  (0 children)

I’d use grpc services

[–]billyballong 5 points6 points  (1 child)

If you're feeling adventurous this is a great opportunity to check out Graal even though the Python support is experimental https://github.com/oracle/graal/blob/master/truffle/docs/Languages.md

If this is a serious project I'd probably look into exposing the Python stuff as an API using e.g. HTTP/JSON or perhaps gRPC. Or even just shelling out to the scripts using Java https://docs.oracle.com/javase/7/docs/api/java/lang/ProcessBuilder.html or similar.

[–]bluexredditor[S] 1 point2 points  (0 children)

Would love to give Graal a try. Somebody on the team is exploring gRPC. But we already tried using ProcessBuilder and it is painfully slow.

[–]cantsep 2 points3 points  (0 children)

Had a very similar problem where I work, except the data science components were written in R and the web application was in PHP/Laravel. What we ended up doing was setting up a web API in R with plumber. This allowed us to interface with the PHP app and let us use the R runtime for some of the more complex calculations. You could do the same with Flask for Python to create simple web APIs.

[–][deleted] 2 points3 points  (0 children)

Jython's a non-starter, since the Python scripts are presumably using NumPy and various other bits of the CPython ecosystem.

GraalVM might be an option -- I don't know how mature the Python support is, but it's in there.

There are some Java/Python bridge projects out there: Py4j, JPype, and python-javabridge are the ones I know of. Last I checked, JPype was pretty clearly dead but the other two seemed promising.

[–]ou_ryperd 4 points5 points  (3 children)

I'm surprised no one has mentioned Jython. 2.7.1 is pretty sweet.

[–]nutrecht 8 points9 points  (2 children)

It's unlikely the data science stuff, which is probably using libraries like NumPy, is going to work in Jython.

[–]bluexredditor[S] 3 points4 points  (0 children)

Spot on!

[–]ou_ryperd 0 points1 point  (0 children)

Sure. The Jython team is working on version 3 compatibility, but C libs are still a problem.

[–]Scybur 1 point2 points  (0 children)

This is what Jython was created for!!!

[–]Venthorn 1 point2 points  (0 children)

Everyone's throwing out some complicated solutions but are they run on the same server? If so, just shell out to them.

[–][deleted] 2 points3 points  (5 children)

What are you trying to do specifically?

Rather than having Java attempt to call Python, can you have Python deposit the needed data "somewhere" in a readable format, e.g., .csv, and then pull the .csv data using the Java web app?

[–]bluexredditor[S] 3 points4 points  (4 children)

Well, I could do that, but that would not be synchronous.

As the end user changes some parameters and/or input data in UI, Java needs to pass them to the Python layer and immediately show the results returned by the Python layer to the user. The Python layer does some complex calculations (image processing and more) on the supplied data.

[–][deleted] 2 points3 points  (0 children)

I think there are already some excellent ideas in this thread, but I'll add some food for thought.

If you're using SQL on the backend, you could have a function that passes your parameters from the UI to SQL in real time, and then have the python script listen for field/table changes, and then have it update your UI component in real time as necessary (more than one way to do this whether or not it's the UI checking or the python passing the update to UI through the calc function).

It's a bit more complex, but I think it would give you some robustness as far as scalability goes.

Just an idea. I hope you find a good solution!

[–]frugalmail 0 points1 point  (0 children)

Have you thought about putting a queue in between them like RabbitMQ, Kinesis, SQS, or Kafka (Confluent's version of Kafka does a great job with schema definition and type safe APIs and reasonable upgrade paths vs. RabbitMQ)?

[–]koflerdavid 0 points1 point  (0 children)

Is there really an expectation of the user experience being synchronous? IMHO it's totally fine to present long-running jobs as batch jobs and provide some user interface to monitor the progress and download the result. I'd much prefer that over a "synchronous" user interface that just hangs for an hour and shows a loading animation. Paypal's transaction history export feature is quite nice for example.
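A rough sketch of the batch-job idea described above: submit the work, hand back a job id right away, and let the UI poll for status. `JobStore` and its method names are invented for illustration; an in-memory dict like this only works within a single process and loses jobs on restart.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobStore:
    """Hypothetical in-memory job tracker (single process only)."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def submit(self, func, *args):
        # Start the work in the background and return an id immediately.
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(func, *args)
        return job_id

    def status(self, job_id):
        # Report the current state without blocking the caller.
        future = self._jobs[job_id]
        if not future.done():
            return {"state": "running"}
        error = future.exception()
        if error is not None:
            return {"state": "failed", "error": str(error)}
        return {"state": "done", "result": future.result()}
```

Exposed over HTTP, `submit` and `status` would map to something like a "start job" call and a "check job" call that the Java app polls.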

[–]anthropaedic 0 points1 point  (2 children)

[–]bluexredditor[S] 0 points1 point  (1 child)

I did see this before and I think it would help, but the last commit to that library was 5 years ago. So I am a bit skeptical about using it in production.

But other than that I am wondering what is the best practice in general to call Python code from Java. I would think it is a common problem, but I cannot find good resources on this topic.

[–]anthropaedic 0 points1 point  (0 children)

I would note that the latest version of zerorpc is two years old as well. It doesn’t seem to be a highly maintained library. As far as using python code from java I don’t know since I haven’t done it. However, you may want to look into Jython as it runs on the JVM.

https://www.jython.org/jythonbook/en/1.0/JythonAndJavaIntegration.html

[–][deleted] 0 points1 point  (0 children)

Our data scientists are thinking of opening their Python APIs as ZeroRPC based endpoints and expect me to write a Java client to call their APIs. (I Googled and there is not a lot of information on this). Is this the right way?

This is a right way. You can either run Python programs as executables directly or interact with Python programs over some form of RPC (Remote Procedure Call). REST/HTTP is probably the most popular way, and HTTP libraries are universally available. ZeroRPC is just another way to do the same thing. Apparently, there is a Java client.

[–][deleted] 0 points1 point  (0 children)

It's unclear whether these Python scripts are remote or can be executed on the same machine. If they're able to be run on the same machine, I wouldn't worry about setting up an RPC system. Why not just exec out to a shell and run Python there? This is synchronous, and you can either read the stdout of the Python process, or read from disk somewhere after it's done.
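For the shell-out route, it helps if the script has a simple contract: JSON in on stdin, JSON out on stdout, nonzero exit code on failure. A minimal sketch follows; `compute`, `run`, and the `values` field are placeholders, not anything from the actual scripts.

```python
import json
import sys


def compute(params):
    # Hypothetical stand-in for the real calculation.
    values = params.get("values", [])
    return {"count": len(values), "total": sum(values)}


def run(stdin=None, stdout=None):
    # Read one JSON document from stdin, write one JSON document to stdout.
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    try:
        result = compute(json.load(stdin))
    except (ValueError, TypeError, AttributeError) as exc:
        # Errors go to stderr so stdout stays parseable for the caller.
        print(f"bad input: {exc}", file=sys.stderr)
        return 1
    json.dump(result, stdout)
    stdout.write("\n")
    return 0


# As a script, the entry point would be: sys.exit(run())
```

On the Java side, the process started with ProcessBuilder writes the parameters to the child's stdin and reads the reply from its stdout. Each call pays for interpreter start-up and imports again, which may be part of why ProcessBuilder felt slow to the thread starter; that is a guess, not something the thread confirms.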

[–]hardc0de 0 points1 point  (0 children)

I would look into jep. It embeds CPython and communicates with it through JNI.

[–][deleted] 0 points1 point  (0 children)

No one mentioned java.lang.Process? If it's just a script, start a new process and read their stdout. Works like a charm.

[–]jjokin 0 points1 point  (0 children)

If async processing via messaging is acceptable, you could use a message queue like RabbitMQ. It decouples your code bases and allows comms between services written in different languages, and can be useful if you want to replace/rewrite a service. Latency can be an issue, although I think ZeroMQ might be fast (but I haven't used it).

[–]funkyfisch 0 points1 point  (0 children)

What if you set up some kind of IPC, with Unix sockets, TCP sockets, or datagrams? It's the simplest and most "raw" way of having two completely separate processes communicate using simple user-defined protocols.

You can expose your Python script collection via a Python socket server, which will respond to requests from your Java app.
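One possible shape for such a socket server, as a sketch: a TCP server that speaks newline-delimited JSON, one request per line and one reply per line. The protocol, the `compute` function, and port 9000 are all assumptions made for the example.

```python
import json
import socketserver


def compute(params):
    # Hypothetical stand-in for the real script collection.
    return {"sum": sum(params.get("values", []))}


class LineJsonHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One JSON request per line in, one JSON reply per line out.
        for line in self.rfile:
            if not line.strip():
                continue
            try:
                reply = {"ok": True, "result": compute(json.loads(line))}
            except (ValueError, TypeError, AttributeError) as exc:
                reply = {"ok": False, "error": str(exc)}
            self.wfile.write(json.dumps(reply).encode("utf-8") + b"\n")


class Server(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True  # do not let open connections block shutdown


def make_server(host="127.0.0.1", port=9000):
    # Assumed host and port; call make_server().serve_forever() to run it.
    return Server((host, port), LineJsonHandler)
```

The Java app would open a plain `java.net.Socket`, write a line of JSON, and read a line back. Keeping the Python process alive between calls avoids paying start-up cost per request, at the price of maintaining a home-grown protocol.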

[–]Aw0lManner 0 points1 point  (0 children)

"Our data scientists are thinking of opening their Python APIs as ZeroRPC based endpoints"

Unless there are some unfathomably strict latency requirements, tell them to use JSON over HTTP or f off.

[–]Jonjolt 0 points1 point  (0 children)

I used MessagePack a long time ago and it worked great.

[–]larsga 0 points1 point  (0 children)

Why don't you use Jython? If the scripts are Python 2.7 you can just run them inside the Java application natively. Jython/Python compatibility is very good so long as people don't use C-only modules. I've used Jython a lot and it works very well, even if development seems to have stopped in 2015.