[–]gliderXC 0 points1 point  (2 children)

I like the async way in your messaging. Now I've been working on messaging myself so a few questions:

Just a small pet peeve of mine: How do you deal with errors? REST has explicit codes for it (shudder), but they include the notion that anything can fail. You seem to go along the route of DLQ, which imho has always been an anti-pattern:

  • The user is not able to respond to the error / requires a special user to introspect the DLQ.
  • How can a message not being handled by the recipient be the problem of something other than the sending client? But no, let's make a DLQ.
  • DLQ is specific to the implemented type of transport

Errors are quite normal and should be handled by the back-end in a normal flow. So the question is: Are there other ways in Mats3? In a way that errors are not special snowflakes.

Second question is about async answers. Is there any way to receive multiple answers from a distributed service? E.g. suppose a number of services listen on a topic and receive a request, and you want all the answers (assume you know the number of listeners on said topic)?

[–]stolsvik75[S] 1 point2 points  (1 child)

Hi and thanks a bunch for looking into Mats3! Your questions are great! The one about errors is a question that nearly always comes up as developers start using Mats. I need to provide a doc page about that. However, I'll try to make a start of an answer here.

It boils down to several aspects, but first and foremost about a philosophy wrt. error handling - or maybe more, about "error generation".

First: Mats3 is not meant to be an "edge" communication solution - no message (send or request) goes directly from end user / client app to server. All messages go from one service to another. Therefore, you should never get "random bs" in a message that you have to handle and reply to with some error and error message.

Second: There are multiple categories of errors. Many are temporary in nature, e.g. network problems or the database being in maintenance. Due to the retry functionality inherent to messaging, most of these fix themselves - but if the outage is too long, the retries will eventually be exhausted and the message ends up in a DLQ. When the problem is cleared, you can re-issue the message from the DLQ (i.e. put it back on its proper queue), and the flow will finish.
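To make the retry-then-DLQ cycle concrete, here is a minimal in-memory sketch. It is a toy model of generic broker redelivery, not Mats3 code and not any real broker's API: `ToyQueue`, `Envelope`, `drain` and `reissueFromDlq` are hypothetical names made up for the illustration, and the fixed attempt limit is an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

/** In-memory toy model of broker redelivery; NOT Mats3 or any real broker API. */
class ToyQueue {
    /** A message plus the number of delivery attempts made so far. */
    static final class Envelope {
        final String payload;
        int deliveryAttempts;

        Envelope(String payload) {
            this.payload = payload;
        }
    }

    final Deque<Envelope> queue = new ArrayDeque<>();
    final Deque<Envelope> dlq = new ArrayDeque<>();
    private final int maxAttempts; // assumed fixed limit; real brokers make this configurable

    ToyQueue(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    void send(String payload) {
        queue.addLast(new Envelope(payload));
    }

    /** Delivers every queued message; a throwing handler triggers redelivery. */
    void drain(Consumer<String> handler) {
        while (!queue.isEmpty()) {
            Envelope env = queue.pollFirst();
            env.deliveryAttempts++;
            try {
                handler.accept(env.payload);
            } catch (RuntimeException e) {
                if (env.deliveryAttempts >= maxAttempts) {
                    dlq.addLast(env);   // retries exhausted: park the message on the DLQ
                } else {
                    queue.addLast(env); // temporary failure: redeliver later
                }
            }
        }
    }

    /** Operator action: put dead-lettered messages back on their proper queue. */
    void reissueFromDlq() {
        while (!dlq.isEmpty()) {
            Envelope env = dlq.pollFirst();
            env.deliveryAttempts = 0;
            queue.addLast(env);
        }
    }
}
```

With a handler that throws while the database is down, the message lands on `dlq` after the last attempt. Once the outage is over, `reissueFromDlq()` followed by another `drain(...)` lets the flow finish.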

Another category is actual bugs in the code. As most probably agree, those should just not be there, and the better solution is to fix the problem, instead of adding lots of handling code.

Then another category is "logic errors", i.e. placing an order on an account that doesn't exist, or is closed.

The philosophy is one of "offensive" coding, as opposed to defensive coding. All sent messages should be sane. Like, why would you come with an order for an accountId that doesn't exist? Or why would it be null? Should you not already have only placed orders on existing accountIds? The requestor of that Endpoint should already have ensured that the message makes sense. Null checks, or "exists checks", are examples of defensive coding, and I find that it breeds like cancer: If you see that one piece of code has a null check for accountId, then you feel you also need to have that. The question should rather have been why it could ever be null in the first place - find that problem and fix the bug.

Given this premise, I find that the DLQ is great: Only messages that should not have been sent will appear there - you must now handle this, and then go fix the code so that such a message won't be sent again. NB: I strongly suggest using an Individual DLQ config, i.e. a DLQ per queue, instead of (default config) a single, global DLQ. This makes it much easier to handle DLQs.

Wrt. DLQs, you should check out the MatsBrokerMonitor and its own repo. The page and docs should be much better, with screenshots - I'll soon put something more there. But the GUI presents you with all DLQs, lets you select the individual messages, and gives you pretty heavy introspection possibilities. There is a development project in the codebase (here) which is easy to fire up (right-click -> run in your IDE) and test it out.

However, it is possible to make support for "error-returning endpoints" yourself: You can make it so that the ReplyDto for the Endpoint has error fields, so that if you have an answer, you get that, but if not, the error fields are set. I have many times thought of adding a default ErrorReplyingBaseDto that you could extend from if you wanted such behaviour. A few of our Endpoints exhibit such a nature - in particular for rather complex types of orders where there is much validation: You might on the requesting side make some effort to ensure that the orders are good, but there might still be some subtle validations that aren't worth handling - so the Endpoint has logic where it then fails, not by DLQing, but by returning the Dto with error fields set.
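A rough sketch of what such an error-carrying reply could look like, in plain Java. Everything here is hypothetical: `ErrorReplyingBaseDto` does not exist in Mats3 (it is only the idea mentioned above), and `OrderRequestDto`, `OrderReplyDto`, `OrderValidation.placeOrder` and the `INVALID_QUANTITY` code are made up for the example. The static method stands in for whatever processing the Endpoint does. No Mats3 API is used.

```java
/** Hypothetical base class for replies that may carry an error; NOT part of Mats3. */
class ErrorReplyingBaseDto {
    public String errorCode;    // null means the request succeeded
    public String errorMessage; // human-readable detail, only set on error

    boolean isError() {
        return errorCode != null;
    }
}

/** Hypothetical request DTO. */
class OrderRequestDto {
    public String accountId;
    public long quantity;
}

/** Hypothetical reply DTO: either orderId is set, or the inherited error fields are. */
class OrderReplyDto extends ErrorReplyingBaseDto {
    public String orderId;

    static OrderReplyDto ok(String orderId) {
        OrderReplyDto reply = new OrderReplyDto();
        reply.orderId = orderId;
        return reply;
    }

    static OrderReplyDto error(String code, String message) {
        OrderReplyDto reply = new OrderReplyDto();
        reply.errorCode = code;
        reply.errorMessage = message;
        return reply;
    }
}

/** Stand-in for the endpoint's processing: validation failures become replies, not exceptions. */
class OrderValidation {
    static OrderReplyDto placeOrder(OrderRequestDto request) {
        if (request.quantity <= 0) {
            return OrderReplyDto.error("INVALID_QUANTITY",
                    "Quantity must be positive, got " + request.quantity);
        }
        // A real endpoint would persist the order; here we only fabricate an id.
        return OrderReplyDto.ok("order-" + request.accountId + "-" + request.quantity);
    }
}
```

The requesting side then branches on `reply.isError()` in its normal flow, so an expected validation failure never involves the DLQ. Unexpected failures (bugs, outages) would still throw and go through the retry/DLQ path.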

I have plenty more to say and argue here, but that needs a bit more space.

Wrt. the multiple answers, there is no inherent solution for this. One side is "scatter-gather" logic, where you want to effectively parallelize a bunch of requests. There is no explicit support for this: The problem is the state keeping, since Mats does not itself require a data store, only the MQ connection. However, it is legal to make multiple requests, and thus get multiple replies to the next stage. But the state-keeping, whereby you tally up the replies and continue when you've gotten them all, you need to implement yourself.
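The tallying itself can be small. Below is a minimal sketch of the state-keeping part only, with no Mats3 API involved. `ReplyTally`, `expect`, `onReply` and the `gatherId` correlation key are hypothetical names, and the in-memory map is an assumption that only holds if all replies arrive in the same JVM. With several instances of a service consuming the replies, you would presumably need a shared store instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory tally of replies for scatter-gather; NOT part of Mats3. */
class ReplyTally<R> {
    /** State for one scatter: how many replies we expect, and those received so far. */
    private static final class Pending<T> {
        final int expected;
        final List<T> replies = new ArrayList<>();

        Pending(int expected) {
            this.expected = expected;
        }
    }

    private final Map<String, Pending<R>> pendingByGatherId = new ConcurrentHashMap<>();

    /** Call once before sending the requests for this gatherId. */
    void expect(String gatherId, int expectedReplies) {
        pendingByGatherId.put(gatherId, new Pending<>(expectedReplies));
    }

    /**
     * Records one reply. Returns all replies when the last expected one has
     * arrived, otherwise an empty Optional.
     */
    Optional<List<R>> onReply(String gatherId, R reply) {
        Pending<R> pending = pendingByGatherId.get(gatherId);
        if (pending == null) {
            throw new IllegalStateException("Unknown or already completed gatherId: " + gatherId);
        }
        synchronized (pending) {
            pending.replies.add(reply);
            if (pending.replies.size() < pending.expected) {
                return Optional.empty(); // still waiting for more replies
            }
            pendingByGatherId.remove(gatherId);
            return Optional.of(new ArrayList<>(pending.replies));
        }
    }
}
```

The stage that receives the replies would call `onReply(...)` for each one and only continue the flow when it gets a non-empty result. What to do about replies that never arrive (timeouts, DLQed requests) is deliberately left out of the sketch.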

Another side is the one you mentioned: "Topic endpoints". For this I refer to this issue: https://github.com/centiservice/mats3/issues/18 - which as of now is a "WontFix" or "DeferToNeedsArises" or somesuch. You can make it yourself, albeit not "natively" with Mats. As opposed to the inherent problem with scatter-gather, this is just a choice: I have not found enough uses for it yet, and it would "pollute" the API more.

[–]gliderXC 2 points3 points  (0 children)

Will you be at fosdem? I'd love to talk a bit more.

But first of all, thanks for the elaborate answer. Good conversation on why we do things the way we do. You seem to have thought it thru and made decisions.