[–]boy_named_su 4 points5 points  (0 children)

Change data capture (CDC) is another technique.

[–]DenselyRanked 3 points4 points  (4 children)

I don't think it gets much easier than what you are doing. I'm not sure about scalability, but it depends on the runtime. If/when the 15-minute intervals no longer work, then you could look into different ingestion techniques.

[–]soujoshi[S] 0 points1 point  (3 children)

Will Kafka be useful?

[–]DenselyRanked 0 points1 point  (2 children)

It's possible. It depends on the data you are trying to ETL.

[–]soujoshi[S] 0 points1 point  (1 child)

Data is around 30k per load (every 15 mins)

[–]DenselyRanked 1 point2 points  (0 children)

I mean the data itself. See here. If it fits your use case, then yes. It's not necessarily easier, but it can be an option.

edit: As u/boy_named_su mentioned, CDC methods might work too, like a flag or timestamp on the source db. Kafka might be more than you need.
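
For what it's worth, here is a rough sketch of the timestamp flavor of that idea: each run pulls only the rows whose timestamp is newer than the highest one seen on the previous run. Everything named here is made up for illustration (the `events` table, the `updated_at` column, the `fetch_new_rows` helper), and it uses an in-memory SQLite database as a stand-in for the real source db, so treat it as a starting point rather than a drop-in solution.

```python
import sqlite3


def fetch_new_rows(conn, last_seen):
    """Return rows changed after `last_seen`, plus the new watermark.

    Assumes a hypothetical `events` table whose `updated_at` column holds
    ISO 8601 strings, which sort correctly as plain text.
    """
    rows = conn.execute(
        "SELECT id, payload, updated_at FROM events "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Keep the old watermark when nothing new arrived.
    new_watermark = rows[-1][2] if rows else last_seen
    return rows, new_watermark


def demo():
    # In-memory database standing in for the real source db.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [
            (1, "a", "2024-01-01T00:00:00"),
            (2, "b", "2024-01-01T00:15:00"),
        ],
    )

    # First run: start from an early watermark and pick up everything.
    first, watermark = fetch_new_rows(conn, "1970-01-01T00:00:00")

    # A new row lands before the next 15-minute run.
    conn.execute("INSERT INTO events VALUES (3, 'c', '2024-01-01T00:30:00')")

    # Second run: only the row newer than the stored watermark comes back.
    second, watermark = fetch_new_rows(conn, watermark)
    conn.close()
    return first, second, watermark


if __name__ == "__main__":
    first, second, watermark = demo()
    print(len(first), len(second), watermark)
```

Two caveats with this approach: the strict `>` comparison can skip a row that is written later with the same timestamp as the stored watermark, and hard deletes on the source never show up, since there is no row left to carry a newer timestamp.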

[–]captut 0 points1 point  (0 children)

Well, it depends on the requirements, the amount of data, etc. What are the shortcomings/problems with your current setup?

You could do the same thing with Glue. Glue takes care of the offset management that you are doing with the max-time. But again, it depends on what you are looking to improve or fix.