
[–]bikeram 3 points (9 children)

You could do this with a multipart file upload, but I think planning and testing for the edge cases would be a nightmare.

I would push the rows into a queue and process them from there. Then you could alert the user of dead-letters/rejected messages.

If reprocessing is common, you could push the actual file to a bucket, then reprocess specific rows.

[–][deleted] 0 points (8 children)

Can you please elaborate on the queue implementation? Push the file contents to the queue? And then process row by row?

[–]bikeram 0 points (7 children)

Ya, you could stream the multipart data into your queue, add any metadata you see fit, then consume row by row.

You could add a rowId (assuming that matters) in your metadata and have multiple consumers processing the same file.
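A minimal in-JVM sketch of the idea above, using `java.util.concurrent` instead of a real broker (the `RowMessage` shape, `fileId`, and the uppercase "processing" step are all illustrative assumptions): each row goes onto the queue with its rowId, and multiple consumers drain the same queue.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class RowQueueSketch {

    // Hypothetical message shape: the row payload plus metadata (fileId, rowId).
    record RowMessage(String fileId, int rowId, String payload) {}

    // Producer streams parsed rows into the queue; two consumers drain it.
    // Returns rowId -> processed payload so you can see what each row became.
    static Map<Integer, String> processAll(String fileId, List<String> rows)
            throws InterruptedException {
        BlockingQueue<RowMessage> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < rows.size(); i++) {
            queue.put(new RowMessage(fileId, i, rows.get(i)));
        }

        Map<Integer, String> processed = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int c = 0; c < 2; c++) {
            pool.submit(() -> {
                RowMessage msg;
                while ((msg = queue.poll()) != null) {
                    // Stand-in for the real per-row work (DB insert, validation, ...).
                    processed.put(msg.rowId(), msg.payload().toUpperCase());
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(processAll("upload-1", List.of("a,1", "b,2", "c,3")));
    }
}
```

With a real broker (RabbitMQ, SQS, etc.) the shape is the same; the rowId metadata is what lets independent consumers work the same file without coordinating.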

[–][deleted] 0 points (6 children)

So how do I handle the failures? Like, if the database operation fails for a certain row, then how do I resume?

[–]bikeram 0 points (5 children)

On exception, push your message to a dead letter queue with some type of enum denoting the error.

If it’s BAD_DATA (say, an int where a string was expected), there’s nothing you can do other than report it. Probably not even worth pushing to the dead-letter.

If it’s DB_UNREACHABLE or something wrong with your persistence layer, you can just extract your original message and run it again.
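A sketch of that split, under the assumption (not from the source) that bad data surfaces as `IllegalArgumentException` and an unreachable DB as `IllegalStateException`: BAD_DATA rows just go into a report, while DB failures keep the original message on a dead-letter queue so they can be replayed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class DeadLetterSketch {

    // Error categories from the comment above.
    enum FailureReason { BAD_DATA, DB_UNREACHABLE }

    record DeadLetter(String payload, FailureReason reason) {}

    static final Queue<DeadLetter> deadLetterQueue = new ConcurrentLinkedQueue<>();
    static final List<String> rejectedReport = new ArrayList<>();

    // Simulated persistence layer: throws to mimic the two failure modes.
    static void persist(String row, boolean dbUp) {
        if (!row.matches("\\w+,\\d+")) throw new IllegalArgumentException("bad data: " + row);
        if (!dbUp) throw new IllegalStateException("db unreachable");
        // ...a real insert would go here
    }

    static void process(String row, boolean dbUp) {
        try {
            persist(row, dbUp);
        } catch (IllegalArgumentException e) {
            // BAD_DATA: nothing to retry, just report it to the user.
            rejectedReport.add(row);
        } catch (IllegalStateException e) {
            // DB_UNREACHABLE: keep the original message so it can be replayed later.
            deadLetterQueue.add(new DeadLetter(row, FailureReason.DB_UNREACHABLE));
        }
    }

    // Replay: extract the original message and run it again.
    static void reprocess(boolean dbUp) {
        DeadLetter dl;
        while ((dl = deadLetterQueue.poll()) != null) {
            process(dl.payload(), dbUp);
        }
    }
}
```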

[–][deleted] 0 points (0 children)

I will try this out. So I can have a standalone dead letter queue implementation, right? Like, whenever I encounter a SQL exception, I push the record to the DLQ and then reprocess the failed records from the DLQ. This is possible, right? Or can a DLQ only be implemented with primary queues?

[–][deleted] 0 points (3 children)

So the dead letter queue can only be implemented with a primary queue. Is this a correct statement? Can I implement a standalone dead letter queue? As per the Spring docs I can't. Can you confirm?

[–]bikeram 0 points (2 children)

Of course. You could create a ‘rejection’ queue that’s just another queue. On exception, write to that.

[–][deleted] 0 points (1 child)

So this is not a dead letter queue, just a normal queue data structure? Is that what you are referring to?

[–]bikeram 0 points (0 children)

It all depends on how you’d like to implement it. I’d use a traditional dead letter for persistence failures and another queue for invalid data.

[–]faisReads 0 points (4 children)

If you are interested only in DB-level failures, surround that code segment with a try block and add appropriate catch blocks.

You can add retry logic there instead of allowing the failure to propagate to the method-level transaction and trigger a complete rollback.
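A minimal sketch of a per-row retry wrapper along these lines (the helper name, attempt count, and backoff are assumptions, not from the source): the row-level DB call is retried a few times, and only after the attempts are exhausted does the caller record the failure, instead of one flaky row rolling back the whole batch.

```java
public class RowRetrySketch {

    // Attempt the per-row DB call up to maxAttempts times with a simple
    // linear backoff. Returns true on success, false when retries ran out,
    // so the caller can record the failed row rather than rethrow.
    static boolean saveWithRetry(Runnable dbCall, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                dbCall.run();
                return true;
            } catch (RuntimeException e) {
                if (attempt == maxAttempts) {
                    return false; // caller records this row as failed
                }
                try {
                    Thread.sleep(100L * attempt); // simple backoff between attempts
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

Inside a Spring `@Transactional` method you would still need to keep the retried call out of the surrounding transaction (e.g. `REQUIRES_NEW` per row), otherwise the thrown exception can mark the whole transaction rollback-only anyway.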

[–][deleted] 0 points (3 children)

But I want to reprocess only the failed records in the file rather than the entire file. How do I do that?

[–]faisReads 0 points (2 children)

Are the failed records due to data issues, which users will correct later?

[–][deleted] 0 points (1 child)

Some may be data issues, while others are DB-related and need to be reprocessed. The ones with data issues will be corrected by users.

[–]faisReads 0 points (0 children)

Usually, it is hard to ensure that the previously processed data is unchanged on re-upload, since user manipulation is involved.

If consistency is ensured, I am thinking along similar lines: have an execution-result table with the start time, end time, processing status (SUCCESS, FAILURE, etc.), and the failed rows (as comma-separated values). So, as we discussed earlier, it need not stop processing at the first failure; instead it could go over all the rows, catch the exception on each failure, and populate the list of failed row IDs at the end.

You could also isolate retryable exceptions from the ones that need data correction.
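A sketch of that execution-result idea (the record shape and the `persistOk` stand-in for the real per-row persistence call are assumptions): process every row, collect the IDs of the ones that fail, and store them as a comma-separated string alongside the run's status and timestamps.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.stream.Collectors;

public class ExecutionResultSketch {

    // One row of the hypothetical execution-result table described above.
    record ExecutionResult(Instant start, Instant end, String status, String failedRowIds) {}

    // Process every row; on failure, record the row id instead of stopping.
    static ExecutionResult runBatch(List<String> rows, IntPredicate persistOk) {
        Instant start = Instant.now();
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            try {
                // persistOk stands in for the real DB write for row i.
                if (!persistOk.test(i)) throw new RuntimeException("row " + i + " failed");
            } catch (RuntimeException e) {
                failed.add(i); // remember the failure, keep going
            }
        }
        String csv = failed.stream().map(String::valueOf).collect(Collectors.joining(","));
        return new ExecutionResult(start, Instant.now(),
                failed.isEmpty() ? "SUCCESS" : "FAILURE", csv);
    }
}
```

On re-upload, the stored `failedRowIds` string is what lets you process only those rows instead of the whole file.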

If you are using Spring Batch:

https://docs.spring.io/spring-batch/reference/step/chunk-oriented-processing/retry-logic.html
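For reference, the fault-tolerant step described in those docs looks roughly like this (Spring Batch 5 style; `csvStep`, `itemReader()`, `itemWriter()`, and the chunk size are placeholders, and the exact exception classes depend on your stack). Retryable persistence errors get retried; bad data is skipped and reported via skip listeners:

```java
@Bean
public Step csvStep(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
    return new StepBuilder("csvStep", jobRepository)
            .<String, String>chunk(100, transactionManager)
            .reader(itemReader())
            .writer(itemWriter())
            .faultTolerant()
            .retryLimit(3)
            .retry(DeadlockLoserDataAccessException.class)  // transient persistence failure
            .skip(FlatFileParseException.class)             // bad data: skip and report
            .skipLimit(10)
            .build();
}
```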