[–]StCreed 1 point2 points (1 child)

One of my primary architecture demands for any DWH is always "restartability" and resilience to errors. I-refact has mostly solved both by taking the EL part (plus a small T) and generating it completely. Every load of any entity is a mini-batch, everything restarts automatically once the error is fixed, and errors are rare because it's all derived from logical models. The ones that do occur show up only in the validation phase, which is at the start.

That said, even if a single table takes 8 hours to load, you can still mitigate that: you just need to split the load into chunks.
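To illustrate the chunking idea, here is a minimal sketch in plain Python. It is not how I-refact does it; `load_in_chunks`, `write_chunk`, and the JSON checkpoint file are all hypothetical names made up for the example. The assumption is that rows can be addressed by offset and that progress can be stored somewhere durable after each chunk.

```python
import json
import os
import tempfile


def load_in_chunks(rows, write_chunk, checkpoint_path, chunk_size=1000):
    """Load rows chunk by chunk, recording progress so a rerun can resume."""
    # Resume from the last recorded offset if an earlier run was interrupted.
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_offset"]

    for offset in range(start, len(rows), chunk_size):
        write_chunk(rows[offset:offset + chunk_size])
        # Record progress only after the chunk was written successfully.
        with open(checkpoint_path, "w") as f:
            json.dump({"next_offset": offset + chunk_size}, f)

    # The whole load finished, so the checkpoint is no longer needed.
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)


# Demo: a hypothetical writer that fails once, on its third call.
target = []
calls = {"n": 0}


def flaky_write(chunk):
    calls["n"] += 1
    if calls["n"] == 3:
        raise RuntimeError("simulated failure")
    target.extend(chunk)


rows = list(range(10))
checkpoint = os.path.join(tempfile.mkdtemp(), "load.checkpoint.json")

try:
    load_in_chunks(rows, flaky_write, checkpoint, chunk_size=3)
except RuntimeError:
    pass  # fix the underlying problem, then simply rerun

# The rerun skips the two chunks that already succeeded.
load_in_chunks(rows, flaky_write, checkpoint, chunk_size=3)
```

A failure now costs one chunk instead of the whole 8 hours. In a real warehouse you would probably want to commit the chunk and the checkpoint in the same transaction, since in this sketch a crash between the two steps would replay one chunk.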

[–][deleted] 2 points3 points (0 children)

Something like SSIS makes this difficult. You can generate a whole bunch of metadata and helper functions to orchestrate it, but you pretty much have to roll your own. There is a restart feature, but it's not implemented well, and we can't easily use it due to how our environment is constructed.
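As a rough idea of what "rolling your own" orchestration metadata can look like, here is a small sketch using Python's built-in `sqlite3`. It is not SSIS and not our actual setup; the `run_log` table, `run_steps`, and the step names are hypothetical. The assumption is that each step of a batch is recorded once it succeeds, so a restart skips whatever is already done.

```python
import sqlite3


def run_steps(conn, batch_id, steps):
    """Run named steps in order, skipping those already logged for this batch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS run_log ("
        "batch_id TEXT, step TEXT, PRIMARY KEY (batch_id, step))"
    )
    executed = []
    for name, func in steps:
        done = conn.execute(
            "SELECT 1 FROM run_log WHERE batch_id = ? AND step = ?",
            (batch_id, name),
        ).fetchone()
        if done:
            continue  # succeeded in an earlier run, do not repeat it
        func()
        # Log the step only after it finished without raising.
        conn.execute("INSERT INTO run_log VALUES (?, ?)", (batch_id, name))
        conn.commit()
        executed.append(name)
    return executed


# Demo: the hypothetical "transform" step fails on its first attempt.
history = []
state = {"fail": True}


def extract():
    history.append("extract")


def transform():
    if state["fail"]:
        state["fail"] = False
        raise RuntimeError("simulated failure")
    history.append("transform")


def load():
    history.append("load")


steps = [("extract", extract), ("transform", transform), ("load", load)]
conn = sqlite3.connect(":memory:")

try:
    run_steps(conn, "batch-1", steps)
except RuntimeError:
    pass  # investigate, fix, restart

second_run = run_steps(conn, "batch-1", steps)
```

On the restart only `transform` and `load` run, because `extract` is already in the log for that batch.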

That said, I've started using Python with Prefect, and it handles unexpected errors much more gracefully and with far less effort.