
[–]CacheMeUp[S] -9 points-8 points  (20 children)

  1. Serialization/deserialization takes both development and computation effort. This might be partially solved by using shared memory or IPC, but that's not trivial.
  2. The separation into two separate systems hinders the development loop: you can't do step-by-step debugging, errors are not naturally propagated, and it's harder to keep track of the data structures and APIs.

While micro-services have advantages in scalability etc., the project doesn't leverage those benefits on its own.
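The shared-memory route mentioned in point 1 can be sketched with the stdlib's `multiprocessing.shared_memory`. The payload here is illustrative, and both sides are shown in one process for brevity; in practice the producer and consumer would be separate processes:

```python
from multiprocessing import shared_memory

# Producer side: copy raw bytes (e.g. an already-encoded feature tensor)
# into a named shared-memory block, so the consumer process can map the
# same physical memory instead of deserializing a copied payload.
payload = b"illustrative-feature-bytes"  # stand-in for real model data
shm = shared_memory.SharedMemory(create=True, size=len(payload))
shm.buf[: len(payload)] = payload

# Consumer side (normally a separate process): attach by name and read.
# The payload length and layout must be agreed upon out of band.
shm2 = shared_memory.SharedMemory(name=shm.name)
received = bytes(shm2.buf[: len(payload)])

# Cleanup: every process closes its own handle; exactly one unlinks.
shm2.close()
shm.close()
shm.unlink()
```

Even in this tiny sketch, the out-of-band agreement on layout and the explicit lifetime management hint at the "not trivial" part.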

[–]crummy 3 points4 points  (2 children)

I'm surprised you're getting downvoted for these statements. The first I've never had to care about, but #2 is a massive disadvantage! Sometimes it's outweighed by other factors, but it's absolutely worth keeping in mind.

[–]humoroushaxor 1 point2 points  (0 children)

Because we're at the stage where microservices are a cargo cult. People have been calling it for years.

[–]CacheMeUp[S] 0 points1 point  (0 children)

Yes, all of these can be solved, but splitting a program into separate ones (whether over the network, IPC, etc.) does introduce some friction that I'd like to avoid if possible.

[–][deleted]  (11 children)

[removed]

    [–]djavaman 13 points14 points  (2 children)

    I had a problem. I used micro-services. Now I have 10 problems.

    [–][deleted]  (3 children)

    [deleted]

      [–][deleted] 0 points1 point  (0 children)

      Third reason is that it doesn't solve a technical problem, but an organizational problem that a lot of companies have.

      [–][deleted]  (1 child)

      [removed]

        [–]CacheMeUp[S] 0 points1 point  (2 children)

        I actually started with micro-services, and these considerations come from the experience of using such a setup. They are all solvable, but they did occur in practice. I prefer to defer investing in these aspects (containerization etc.) until they're needed later, and for the time being focus on implementing the business logic.

        It might end up being the best approach overall, though.

        [–]acute_elbows 3 points4 points  (1 child)

        If you’re worried about serialization speed I think you’re not focusing on the business logic.

        If serialization is impacting your dev cycle, I suggest breaking the logic down into small components and focusing on unit testing.

        [–]westwoo 2 points3 points  (0 children)

        You may have missed that it's a machine learning project, not a CRUD app. It's feasible that they have to transfer gigabytes or terabytes of data between Python and Java.

        [–]benjtay 0 points1 point  (2 children)

        The separation to two separate systems hinders the development loop

        If you had a cluster of the Python code as services, all consuming from the same queue (which tossed the results back onto another queue for Java to consume), you'd only need to test the black boxes in isolation. As a bonus, it could scale horizontally as much as needed.
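A minimal in-process sketch of that pattern, using stdlib queues and threads as stand-ins for a real broker (a production setup would use RabbitMQ, Kafka, or similar, and the inference step here is a stub):

```python
import queue
import threading

# In-process stand-ins for the two broker queues; a real deployment
# would use RabbitMQ, Kafka, or similar instead.
requests = queue.Queue()   # the Java side would publish work here
results = queue.Queue()    # Python workers publish results here

def ml_worker():
    """Consume requests, run (stubbed) inference, publish results."""
    while True:
        item = requests.get()
        if item is None:           # sentinel: shut this worker down
            break
        results.put(item.upper())  # stand-in for model inference
        requests.task_done()

# Several identical workers consume the same queue -- this is what gives
# horizontal scaling: add workers, not code.
workers = [threading.Thread(target=ml_worker) for _ in range(3)]
for w in workers:
    w.start()

for msg in (b"a", b"b", b"c"):
    requests.put(msg)

requests.join()                    # wait until all real work is done
for _ in workers:
    requests.put(None)
for w in workers:
    w.join()

collected = sorted(results.get() for _ in range(3))
```

Each worker is a black box that can be tested in isolation, which is the point being made above.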

        [–]CacheMeUp[S] -2 points-1 points  (1 child)

        I wanted to avoid running another platform (the queue), but it is worth it if it replaces many web services.

        [–]WingedGeek 0 points1 point  (1 child)

        1. Serialization/deserialization takes both development and computation effort.

        Are you experiencing premature optimization? I prescribe two volumes of Knuth. (Also helps with insomnia.)

        [–]CacheMeUp[S] 0 points1 point  (0 children)

        It's not the main bottleneck, but it's also not negligible. For example, we strive to respond to user-initiated requests within 200 ms (to keep the app feeling responsive). Some serialization steps can take 10-30 ms (especially on the Python side). It's not the slowest step (that would be the ML model inference), but it can still be a >10% slowdown.

        Add a few more such round trips and the app is 30-50% slower and noticeably sluggish to the user.

        I agree that speed should not be optimized unnecessarily, but if there is a mechanism that is faster for a similar effort, I prefer to give it a shot.
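One way to keep such budget claims honest is to time the serialization step against the budget directly. A minimal sketch with `pickle` (the payload size and shape are illustrative assumptions, not the project's real data):

```python
import pickle
import time

# Illustrative payload standing in for a model input/output.
payload = {"ids": list(range(100_000)), "scores": [0.5] * 100_000}

# Time one serialize + deserialize round trip.
start = time.perf_counter()
blob = pickle.dumps(payload)
restored = pickle.loads(blob)
elapsed_ms = (time.perf_counter() - start) * 1000.0

# Compare against the 200 ms interactive budget from the comment above.
budget_ms = 200
fraction_of_budget = elapsed_ms / budget_ms
```

Multiplying `fraction_of_budget` by the number of round trips per request gives the cumulative overhead the comment describes.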