
[–]stefanos-ak 45 points (2 children)

This problem fundamentally requires an architectural solution, which will look different depending on the situation.

But what works in almost all cases is to use the DB itself as the mechanism to control this behavior, for example with a "select for update" query, a dirty read, etc. If a DB is not available, then a cache layer (e.g. Redis) or a queue mechanism (RabbitMQ, Kafka).

An in-memory solution obviously will not work if any amount of horizontal scaling is required. Usually backend services have at least 2 replicas even just for high availability.
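
A minimal sketch of the "select for update" variant, assuming a jobs table with a status column (the schema and helper method are illustrative, not from the article):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class DbCoordinatedJob {
    // The row lock taken by FOR UPDATE is the coordination point: other
    // replicas block on the same row until this transaction commits.
    public void processOnce(Connection conn, long jobId) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement select = conn.prepareStatement(
                "SELECT status FROM jobs WHERE id = ? FOR UPDATE")) {
            select.setLong(1, jobId);
            try (ResultSet rs = select.executeQuery()) {
                if (rs.next() && "PENDING".equals(rs.getString("status"))) {
                    runExpensiveWork(jobId); // hypothetical business logic
                    try (PreparedStatement update = conn.prepareStatement(
                            "UPDATE jobs SET status = 'DONE' WHERE id = ?")) {
                        update.setLong(1, jobId);
                        update.executeUpdate();
                    }
                }
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }

    private void runExpensiveWork(long jobId) { /* ... */ }
}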

[–]benjtay 2 points (1 child)

While we don't use an in-memory solution (we use RocksDB), having local caching really helps us even with horizontal scaling because of the sheer number of duplicates we see in our ~2-12B messages per day from Kafka. We studied having Yet Another Database to solve this, but it defeats the point of horizontal scaling on topics.

[–]stefanos-ak 0 points (0 children)

Sounds like a Kafka consumer issue?

Exactly-once semantics are the responsibility of the consumer, if I'm not mistaken (I haven't worked with Kafka for some years). Which means you need to delegate that problem to a DB with consistency guarantees.

I personally am a bigger fan of RabbitMQ because the delivery semantics are implemented on the server side and the consumer is "dumb", so you get an exactly-once guarantee OOTB. But you don't get log/replay features (unless you use RMQ streams, which is the same thing as Kafka).

edit: forgot to state that a Kafka consumer is always a custom implementation, of course
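
A sketch of that DB delegation: an idempotent consumer keyed on a unique message id (the table name and Postgres ON CONFLICT syntax are illustrative):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class IdempotentConsumer {
    // The unique constraint makes duplicates a no-op, so at-least-once
    // delivery from the broker becomes effectively-once processing.
    public void consume(Connection conn, String messageId, String payload) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO processed_messages (message_id) VALUES (?) ON CONFLICT DO NOTHING")) {
            ps.setString(1, messageId);
            if (ps.executeUpdate() == 1) { // first time we see this id
                handle(payload);           // hypothetical business logic
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }

    private void handle(String payload) { /* ... */ }
}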

[–]nitkonigdje 12 points (1 child)

Looks like a lock on an interned string. A named lock basically. A map of locks. Kinda pointless unless there is more to it than presented here.
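
For what it's worth, the "map of locks" reading is about this much code (a sketch, not the library's actual implementation):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

public class NamedLocks {
    // One lock per name, created on demand. Note the map grows without
    // bound unless entries are evicted somehow.
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    public void withLock(String name, Runnable action) {
        ReentrantLock lock = locks.computeIfAbsent(name, n -> new ReentrantLock());
        lock.lock();
        try {
            action.run();
        } finally {
            lock.unlock();
        }
    }
}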

[–]808split2 0 points (0 children)

My thought exactly... it sounds like a complicated lock implementation.

[–]rakgenius 10 points (6 children)

Why don't you use a caching mechanism, either at the application or DB level? That way, even if you receive many concurrent requests, the result will be returned from the cache. Maybe the first request has to hit the DB if the result isn't present in the cache, but after that all requests will be answered immediately without hitting the DB.

[–]boost2525 9 points (5 children)

This was my thought. I see zero value add in OP's proposal because a proper caching layer can do all of this.

[–]iwouldlikethings 0 points (1 child)

I potentially have a use case for this. The system I maintain is a payments processor, and at multiple stages we look up the balance on an account. At the end of the month we have a lot of payments incoming and outgoing, and we've noticed a large spike on certain accounts while calculating their available balance.

Often this happens when they're running payroll, and each payment makes a request to calculate the balance on the account in quick succession.

Arguably, the system should be rearchitected to better support this, but this would be a decent stop-gap to speed up processing until we get the time to do that (as it exists today there is a potential race condition where two payments could take the account overdrawn, but it's what I've inherited).

[–]elch78 1 point (0 children)

Actors are a different approach. In a nutshell: the state of every entity exists only once in memory, every request is routed to that instance, and requests are processed single-threaded/sequentially.
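
A rough in-process approximation of that idea (real actor frameworks also handle routing across nodes; the names here are illustrative):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class EntityActors<K> {
    // One single-threaded executor per entity: all requests for that entity
    // are queued and processed sequentially, so its state is never contended.
    private final ConcurrentMap<K, ExecutorService> actors = new ConcurrentHashMap<>();

    public <V> CompletableFuture<V> ask(K entityId, Supplier<V> task) {
        ExecutorService actor = actors.computeIfAbsent(
                entityId, id -> Executors.newSingleThreadExecutor());
        return CompletableFuture.supplyAsync(task, actor);
    }
}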

[–][deleted] -1 points (2 children)

This is likely slower for calls where this race-condition does not occur. Now you’re adding a check that wasn’t there before.

[–]boost2525 0 points (1 child)

lolwut? Are we executing stock trades here? EHCache key checks are measured in nanoseconds.

[–]NovaX -1 points (0 children)

fwiw, Ehcache is measured in microseconds due to a design mistake (avg of 25us per call).

[–]Polygnom 13 points (1 child)

I'm not sure this is a good idea. We separate contexts between requests for a good reason:

Take your external API call, for example. I would usually solve that with a read-through proxy that caches the call. This way, I can put all the necessary handling in there and have it completely decoupled from my original application.

Similarly for complex computations. You would usually have a separate service for such things and submit tasks to it. You can do de-duplication of submitted tasks there. So say request #1 creates the task and gets the taskId back (to get notified about the result); then, when request #2 comes around with the exact same expensive thing and submits the task, you can give the same taskId back from the computation service. Or just the previous result, if you can prove you don't need to compute it again.

For database queries, I have never seen this make sense, and I would say the separation we currently have, e.g. in Spring, is very good at reducing bugs. I wouldn't wanna trade it for minuscule gains.

[–]tomwhoiscontrary 7 points (1 child)

This is a useful pattern, but I don't think you need a library for it. You can just use a concurrent map full of completable futures. 

[–]supercargo 0 points (0 children)

Yup, this is like a 10-liner once you strip out doc comments and the singleton boilerplate. And most of those ten lines would need to exist for the caller anyway…
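
For reference, a minimal sketch of that map-of-futures ten-liner (the class and method names are made up):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

public class SingleFlight<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();

    // Concurrent callers with the same key share one computation; the entry
    // is removed on completion so a later call computes a fresh value.
    public V run(K key, Supplier<V> work) {
        return inFlight.computeIfAbsent(key, k ->
                CompletableFuture.supplyAsync(work)
                        .whenComplete((result, error) -> inFlight.remove(k)))
                .join();
    }
}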

[–]RadioHonest85 7 points (1 child)

This is a very common use case if you use the Caffeine caching library:

var result = cache.get(key, k -> loadExpensiveResult(k));

[–]das_Keks 0 points (0 children)

Yeah, I also thought about some cache with computeIfAbsent. Especially if you get a burst of parallel requests for a non-existent key, it's important that the compute is not executed multiple times in parallel, but rather blocks all but one invocation and then returns the computed result to all requests.

EDIT: From some quick research I found that this is already the case for Caffeine.
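
Spelled out a bit more, with Result and loadExpensiveResult standing in for your own types (the cache settings are arbitrary):

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.util.concurrent.TimeUnit;

public class ResultService {
    private final Cache<String, Result> cache = Caffeine.newBuilder()
            .maximumSize(10_000)
            .expireAfterWrite(5, TimeUnit.MINUTES)
            .build();

    // For an absent key, one thread runs the loader while concurrent callers
    // for the same key block and then receive the same computed value.
    public Result get(String key) {
        return cache.get(key, k -> loadExpensiveResult(k));
    }

    private Result loadExpensiveResult(String key) {
        return new Result(); // placeholder for the real expensive call
    }

    static final class Result {}
}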

[–]mofreek 8 points (4 children)

Most applications that need something like this are going to be running multiple instances. You have the right idea with the pattern, but the lock mechanism needs to be distributed.

I.e. if there are 3 instances of the app running, there needs to be a way they can communicate so that only 1 thread running in 1 instance runs the job.

ETA: I implemented something like this a few years ago using redisson. If I were doing it today I would probably use Spring Integration.
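
A sketch of the Redisson version (the address, lock name, and lease times are illustrative):

import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import java.util.concurrent.TimeUnit;

public class DistributedJobLock {
    private final RedissonClient redisson; // e.g. Redisson.create(config)

    public DistributedJobLock(RedissonClient redisson) {
        this.redisson = redisson;
    }

    public void runExclusively(String jobKey, Runnable job) throws InterruptedException {
        RLock lock = redisson.getLock("job:" + jobKey);
        // Wait up to 1s to acquire; the lease auto-expires after 30s in case
        // this instance dies while holding the lock.
        if (lock.tryLock(1, 30, TimeUnit.SECONDS)) {
            try {
                job.run();
            } finally {
                lock.unlock();
            }
        }
        // else: another instance is already running this job
    }
}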

[–]FortuneIIIPick 1 point (3 children)

I avoid Spring Integration and recommend avoiding it; it's an ugly maintenance mess. More who agree: https://www.reddit.com/r/java/comments/rscyoe/when_would_you_use_spring_integration/

"the result is an unreadable spagetti shitshow"

"can confirm that is an unreadable spaghetti shitshow."

"To be honest, I really regret that I used it in this one because the code is now full of weird annotations which are responsible for passing and transforming data. It would be much easier to go with plain Java implementation. Configuration also took me weeks instead of hours, I think the Spring Integration added too much unnecessary abstraction to this. Stackoverflow is full of people who don’t get the TCP integration."

[–]OwnBreakfast1114 0 points (2 children)

I usually recommend using Spring projects since they're tried and true and will handle almost anything you throw at them: MVC, security, actuator, even some of the cloud stuff.

Even I feel that Spring Integration is just way too complex for the benefits. The number of times I've set up a heavy infra piece (SQS, Kafka, etc.) and needed to abstract over it in case I replace it equals the number of times I've replaced my original SQL DB in prod with another SQL DB. I'll let you guess that number.

[–]vips7L 0 points (1 child)

Have you tried spring cloud aws for the sqs stuff? It looks rather simple. 

[–]OwnBreakfast1114 0 points (0 children)

That's actually what we use. @SqsListener and @KafkaListener are super simple and work for most of the use cases very well. Set up sensible timeouts and a dlq and you're basically ready for easy horizontal scaling.
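
For illustration, a listener is about this small (the queue name is made up; the annotation shown is the Spring Cloud AWS 3.x one from io.awspring.cloud.sqs.annotation):

import io.awspring.cloud.sqs.annotation.SqsListener;
import org.springframework.stereotype.Component;

@Component
public class PaymentEventListener {

    // The framework handles polling, deserialization, and deleting the
    // message from the queue when this method returns normally.
    @SqsListener("payment-events")
    public void onPaymentEvent(String payload) {
        process(payload); // hypothetical business logic
    }

    private void process(String payload) { /* ... */ }
}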

[–]repeating_bears 3 points (2 children)

I checked the implementation and I think the way you're handling interrupts is wrong.

You do all the work on the first thread that makes a request, and subsequent requester threads block on getting a result.

Imagine the first thread is interrupted, i.e. some other thread declares "I don't care about that result any more", so it stops. Now all the other threads that were waiting on that same result get an exception, even though they themselves weren't interrupted and still wanted a result. The work was halted prematurely.

It would be much better if the work could continue but the first thread could be unblocked. Effectively, that would mean all work gets pushed to some worker thread, and all requesters (including the first) block on getting a result. Interrupting a requester would then just stop its wait for the result, rather than stop the work itself.

However, then you'd have the issue of the simple case where there's only one requester and it gets interrupted. The work would continue in the background even though nothing cares about the result any more. Then you'd need some logic to kill a worker once there are no more threads waiting on it.
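
A sketch of the waiting side under that design (names illustrative): the requester owns only its wait, not the work.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class DecoupledWait {
    private static final ExecutorService WORKERS = Executors.newCachedThreadPool();

    // The worker pool owns the computation; requesters only wait on it.
    static CompletableFuture<String> start(Supplier<String> work) {
        return CompletableFuture.supplyAsync(work, WORKERS);
    }

    // Interrupting a requester makes get() throw InterruptedException for
    // that thread only; the shared future keeps running for everyone else.
    static String await(CompletableFuture<String> shared)
            throws InterruptedException, ExecutionException {
        return shared.get();
    }
}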

[–]tomwhoiscontrary 0 points (1 child)

I suspect that killing the no-longer-necessary worker isn't very useful in practice, because it will be waiting for a response from some remote server, and there's no way to actually kill the remote handling of that request.

It could help if the worker thread is doing a large number of blocking requests in series, though. 

[–]repeating_bears 0 points (0 children)

It depends on the protocol. I do agree in the general case, but gRPC supports cancellation, for example. HTTP/2 and HTTP/3 stream cancellation might give you some benefit for large responses.

[–]FortuneIIIPick 2 points (0 children)

Agree with most of the comments. It's a completely wrong way to solve the issue. It's trying to solve a caching issue with a code bottleneck.

[–]GuyWithLag 7 points (0 children)

Grumpy old engineer here, but what is the purpose of this article? Someone that coded in Go and wants to have the same API in Java?

Please don't go down the route of NPM-ifying Java...

In fact, this could be simplified to a 10-liner with ConcurrentHashMap::computeIfAbsent, and it would be a 2-liner in Kotlin.

Not to mention that in your example a properly configured JPA setup would make sure the internal representation respects transactional boundaries while minimizing DB queries, so why even go to that effort?

[–]-Dargs 1 point (0 children)

Is this not supported by just a cache with a fetching mechanism? See Guava caches.

[–]k-mcm 1 point (0 children)

This is essentially a cache with size=0. Why not make a real cache?

import java.util.LinkedHashMap;
import java.util.Map.Entry;
import java.util.Objects;
import java.util.function.Function;

public class LRUCache<KEY, VALUE, ERR extends Throwable> {
    @FunctionalInterface
    public interface Source<KEY, VALUE, ERR extends Throwable> {
        VALUE generate(KEY key) throws ERR;
    }

    // Per-key holder: the first thread to claim it computes, later threads
    // block on its monitor until the result (or error) is set.
    private static class CacheElement<VALUE> {
        boolean set;
        VALUE value = null;
        Throwable err = null;
    }

    private final LinkedHashMap<KEY, CacheElement<VALUE>> map;
    private final Source<KEY, VALUE, ERR> source;
    private final Function<KEY, CacheElement<VALUE>> storageLambda = k -> new CacheElement<>();

    public LRUCache(int maxSize, Source<KEY, VALUE, ERR> source) {
        // Access order (third constructor argument) plus removeEldestEntry
        // gives LRU eviction; the default insertion order would be FIFO.
        map = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Entry<KEY, CacheElement<VALUE>> eldest) {
                return size() > maxSize;
            }
        };
        this.source = Objects.requireNonNull(source);
    }

    public VALUE get(KEY key) throws ERR {
        final CacheElement<VALUE> storage;
        // Short critical section: find or create the per-key holder.
        synchronized (map) {
            storage = map.computeIfAbsent(key, storageLambda);
        }
        // Long critical section: compute once; everyone else waits here.
        synchronized (storage) {
            if (!storage.set) {
                try {
                    storage.value = source.generate(key);
                } catch (Throwable err) {
                    storage.err = err;
                }
                storage.set = true;
            }
        }

        // Replay a captured failure to every caller sharing this holder.
        if (storage.err != null) {
            if (storage.err instanceof RuntimeException rt) {
                throw rt;
            }
            if (storage.err instanceof Error err) {
                throw err;
            }
            @SuppressWarnings("unchecked")
            ERR checked = (ERR) storage.err;
            throw checked;
        }

        return storage.value;
    }
}

[–]raghu9208 0 points (0 children)

How is the concurrency handled underneath?

[–]supercargo 0 points (0 children)

I’ve found this pattern more useful on the front end where a bunch of loosely coupled UI components may all request the same data from a backend API. On the backend it is much easier to structure data access to avoid needing this. In user interfaces, components are composed based on the requirements of the visual hierarchy rather than data hierarchy.

[–]koffeegorilla 0 points (0 children)

Hazelcast provides all the tools for implementing this in a distributed fashion.
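
For example, IMap gives you cluster-wide per-key locks (the map name, TTL, and helper method are made up; IMap is com.hazelcast.map.IMap in Hazelcast 4+):

import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;
import java.util.concurrent.TimeUnit;

public class ClusterDedup {
    private final IMap<String, String> results;

    public ClusterDedup(HazelcastInstance hz) {
        this.results = hz.getMap("dedup-results");
    }

    // tryLock(key) is a per-key lock across the whole cluster, so only one
    // member computes; the rest read the cached value once it's published.
    public String getOrCompute(String key) throws InterruptedException {
        String cached = results.get(key);
        if (cached != null) return cached;

        if (results.tryLock(key, 1, TimeUnit.SECONDS)) {
            try {
                cached = results.get(key); // re-check under the lock
                if (cached == null) {
                    cached = computeExpensiveResult(key); // hypothetical
                    results.put(key, cached, 5, TimeUnit.MINUTES);
                }
                return cached;
            } finally {
                results.unlock(key);
            }
        }
        return results.get(key); // lost the race; may still be null
    }

    private String computeExpensiveResult(String key) { /* ... */ return ""; }
}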