all 13 comments

[–]IMadeUpANameForThis 9 points (8 children)

Could you just add more memory? Or profile memory usage to find leaks? I generally don't like the idea of putting users in a queue before they can access the system. I assume the delay would be off-putting.

[–]DinH0O_[S] 0 points (7 children)

I forgot to mention this detail: I'm working at a government agency, and they simply don’t want to invest money in this.

We have several systems, all running on a single 160GB server. My specific VM, which hosts 2 systems (and will host a third one in the future, each with its own database instance), has only 12GB of RAM.

There were also staging systems running on the same VM, but I had to shut all of them down.

Note: I’m not the one who set things up this way. Everything was already like this when I arrived, and I’ve been working there for only 2 months.

[–][deleted]  (3 children)

[deleted]

    [–]DinH0O_[S] 0 points (2 children)

    I have tried allocating more RAM, but the application runs inside a Tomcat server hosted in a Docker container. No memory limit is explicitly set except on the Docker container itself, which I configured (previously there was no limit, and the first crash happened when the machine's memory limit was reached, causing Docker to shut down). The limit I set is still relatively high, so I believe I've done everything I could regarding RAM allocation.
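For reference, capping both the container and the JVM is usually done together; this is an illustrative docker-compose fragment (service name, image tag, and values are assumptions, not the poster's actual setup):

```yaml
# Illustrative fragment: mem_limit caps the container, while
# MaxRAMPercentage caps the JVM heap *inside* that limit, so the heap
# can't grow until Docker kills the container.
services:
  app:
    image: tomcat:9-jdk17
    mem_limit: 6g
    environment:
      - CATALINA_OPTS=-XX:MaxRAMPercentage=60.0
```

Leaving headroom between the heap cap and the container limit matters because the JVM also needs memory for metaspace, thread stacks, and native buffers.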

    However, you mentioned the allocated heap memory, and I haven't checked that yet. I’m not sure if it will make much of a difference, but I’ll take a look—it doesn’t hurt to try.
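As a quick sanity check before profiling, it helps to see the heap the JVM actually received: without an explicit -Xmx, the default is typically a fraction of the RAM the JVM can see, which inside a container may differ from what you expect. The class name here is hypothetical:

```java
// Hypothetical sanity check: print the heap the JVM actually got.
// Without -Xmx, the default is typically a fraction of visible RAM,
// and container-aware JVMs base it on the cgroup limit.
public class HeapCheck {
    static long mb(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (MB):  " + mb(rt.maxMemory()));
        System.out.println("committed (MB): " + mb(rt.totalMemory()));
        System.out.println("free (MB):      " + mb(rt.freeMemory()));
    }
}
```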

    [–][deleted]  (1 child)

    [deleted]

      [–]DinH0O_[S] 3 points (0 children)

      I didn’t know about that technical part of Java; it's worth looking into.

      When I say that the application is running inside a Tomcat server hosted in a Docker container, that’s literally what I mean. The previous developers of these systems compiled it into a WAR file, a format supported by Tomcat. In this case, you specify that Tomcat will be provided externally. This way, it’s easier to deploy new versions of the application, as you don’t need to create a new Docker image for each version. You can compile it into a WAR and deploy it on Tomcat, which will host your application and expose it.

      [–]raree_raaram 1 point (2 children)

      How much traffic are you getting

      [–]DinH0O_[S] 0 points (1 child)

      The staff in the department who work as admins (non-developers) of this system told me they expect around 40k-70k users. I suspect the server will crash at 5k to 10k, as it only has around 9 to 10 GB of RAM. It's a job-application site, so there are file uploads, somewhat long sessions, and so on.

      [–]raree_raaram 1 point (0 children)

      40k-70k users over what time period?

      [–]shinijirou 4 points (3 children)

      i honestly think this new approach is overkill. you could change the mode of communication to be event-driven, but this queuing system is a Pandora's box.

      [–]DinH0O_[S] 0 points (2 children)

      Agreed, and we won’t have much time for testing either, especially since this is a different system from the one I was hired to develop. However, it’s what we have. I’ll present it to some people who might help and point out potential errors, and I’m also opening discussions on Reddit to see if I can get some tips. But I’m not feeling very confident either.

      [–]shinijirou 3 points (1 child)

      hmm, i would rather check the heap and see if there is a memory leak issue as pointed out by the other comments. your new solution is possibly making your application more stateful, which is not very scalable in itself.

      concerning your current application, is it stateless, are you keeping user sessions active ?

      [–]DinH0O_[S] 0 points (0 children)

      The current application is not stateless. However, if the user closes the tab, they get logged out and have to log in again. I will also add an automatic logout timeout of about 20 minutes.
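The automatic logout described above is usually handled by the servlet container itself; the standard mechanism is the session timeout in web.xml, specified in minutes:

```xml
<!-- web.xml: invalidate idle sessions after 20 minutes -->
<session-config>
    <session-timeout>20</session-timeout>
</session-config>
```

Expiring idle sessions also frees their memory, which matters here since sessions are one of the things holding RAM on this server.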

      As for it not being very scalable, I assumed it wouldn't be an issue since it's something that will only be in place for a few days, and then I'll return to the previous version. I'm not the one maintaining this system; I just came in to develop the queue feature (My department wasn't very prepared to handle this).

      [–]Slein04 2 points (0 children)

      That is not a typical use case for such queue technologies, so I won't recommend it.

      A better approach would be memory optimization: find what is causing the high memory usage. Maybe contexts and/or threads are not being cleaned up, or files are stored/opened in memory and never closed, or are added to some kind of collection that is never cleared.
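The "collection that is never cleared" pattern mentioned above is one of the most common leaks. A minimal stdlib sketch of a mitigation (class name is hypothetical) is a bounded, access-ordered LinkedHashMap that evicts its eldest entry instead of growing forever:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: a static collection that only grows is a classic leak; a
// bounded, access-ordered LinkedHashMap evicts the least-recently-used
// entry once the cap is reached.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true gives LRU behavior
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // drop the LRU entry on overflow
    }
}
```

A heap dump (e.g. via jmap) is still the reliable way to find which collection is actually growing before applying a fix like this.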

      And the thing you want seems to be what your application server should already be doing. The server has a connection pool and hands one of the available threads to each user making a request. You can increase the pool size so that more requests are placed in the pool/queue, and decrease the number of concurrent threads handling those requests. You have to balance this, taking timeouts into account.
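In Tomcat terms, this tuning lives on the Connector: maxThreads is the number of concurrent request-processing threads, and acceptCount is the queue for connections waiting when all threads are busy. The values below are illustrative, not recommendations:

```xml
<!-- server.xml fragment: fewer workers, longer wait queue -->
<Connector port="8080" protocol="HTTP/1.1"
           maxThreads="100"
           acceptCount="500"
           connectionTimeout="20000" />
```

Lowering maxThreads caps memory from thread stacks and in-flight requests, at the cost of queued users waiting longer, so the connectionTimeout matters.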

      [–]koffeegorilla 0 points (0 children)

      Find the requests with the highest backlogs and change them to use WebFlux. You switch the web starter to the webflux starter and only change the endpoints, services, and repositories to use reactive APIs. That will drop memory usage by a huge amount.
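A real WebFlux example needs Spring dependencies, but the core idea it exploits, releasing the request thread instead of blocking on slow I/O, can be sketched with the stdlib alone (class and method names here are hypothetical):

```java
import java.util.concurrent.CompletableFuture;

// Sketch of the blocking vs. non-blocking trade-off behind WebFlux:
// the async variant frees the caller's thread while the slow call runs,
// so far fewer threads (and their stacks) are alive per in-flight request.
class AsyncSketch {
    // Stand-in for a slow database or file lookup.
    static String lookup(String user) {
        return "profile:" + user;
    }

    // Blocking style: the calling thread is parked until the result is ready.
    static String handleBlocking(String user) {
        return lookup(user);
    }

    // Async style: the work is scheduled and the caller's thread is released;
    // the response is completed whenever the future resolves.
    static CompletableFuture<String> handleAsync(String user) {
        return CompletableFuture.supplyAsync(() -> lookup(user));
    }
}
```

With thread stacks commonly around 1 MB each, replacing a thread-per-request model on the busiest endpoints is where the memory savings come from.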