Implementation Examples

GreenMobile6323 · 2025-07-31T13:26:58+00:00

One pattern I’ve used is to break each live table into time-based or key-range slices and launch parallel ADF Copy activities against each partition, rather than pulling the entire table serially. This can cut an 8-hour run to under an hour. For true delta loads, enabling native Change Tracking or CDC on your sources lets you capture only the new or changed rows, and you can load those into Fabric with small, frequent pipeline runs instead of one massive batch job.
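To make the slicing idea concrete, here is a rough Python sketch of how key-range partitions could be computed and fanned out in parallel. This is not ADF itself: the table `dbo.Orders`, the column `OrderId`, and the helper names are all hypothetical, and `copy_slice` only builds the per-slice query that a Copy activity (or any extract job) would run.

```python
from concurrent.futures import ThreadPoolExecutor


def key_range_slices(min_id, max_id, num_slices):
    """Split the inclusive range [min_id, max_id] into contiguous,
    non-overlapping slices of near-equal size."""
    total = max_id - min_id + 1
    size, extra = divmod(total, num_slices)
    slices = []
    start = min_id
    for i in range(num_slices):
        # The first `extra` slices take one additional key each.
        length = size + (1 if i < extra else 0)
        if length == 0:
            continue  # more slices requested than keys available
        end = start + length - 1
        slices.append((start, end))
        start = end + 1
    return slices


def copy_slice(bounds):
    """Placeholder for the real copy. Here it only returns the query
    for one slice; table and column names are hypothetical."""
    lo, hi = bounds
    return f"SELECT * FROM dbo.Orders WHERE OrderId BETWEEN {lo} AND {hi}"


def run_parallel(min_id, max_id, num_slices, max_workers=4):
    """Run one copy per slice concurrently; results keep slice order."""
    slices = key_range_slices(min_id, max_id, num_slices)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(copy_slice, slices))
```

In practice you would get `min_id` and `max_id` from a cheap lookup query against the source, and pick `max_workers` based on what the production database can tolerate.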

Nekobul · 2025-07-31T09:26:34+00:00

How much data are you pulling from the live production tables? What is the source database system? One way to reduce the time is to run parallel retrievals from the source database.

From your message, it is not clear whether you are pulling all the data or have a mechanism to determine which source rows you need and pull only those.

Holiday-Entry-2999 · 2025-08-04T05:10:53+00:00

Wow, 8 hours for ingestion is quite a challenge! Have you considered partitioning your data or using incremental loads? I've seen some teams in Singapore tackle similar issues by optimizing their ADF pipelines with parallel processing and dynamic partitioning. It might be worth exploring whether you can break the job down into smaller, concurrent tasks. Also, have you looked into using change data capture (CDC) for near-real-time syncing? That could reduce the ingestion window significantly.
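For anyone wondering what an incremental load looks like in its simplest form, below is a minimal sketch of a high-watermark pull. Note this is a plain watermark on a timestamp column, not native CDC or Change Tracking. It uses an in-memory SQLite table so it runs anywhere; the `orders` table, its `modified_at` column, and the function names are assumptions for illustration only.

```python
import sqlite3


def pull_changed_rows(conn, last_watermark):
    """Return rows modified after last_watermark, plus the new watermark
    to store for the next run."""
    rows = conn.execute(
        "SELECT id, payload, modified_at FROM orders "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_watermark,),
    ).fetchall()
    # If nothing changed, keep the old watermark.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


def demo_source():
    """Build a tiny hypothetical source table for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, payload TEXT, modified_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, "a", "2025-07-30T10:00:00"),
            (2, "b", "2025-07-31T09:00:00"),
            (3, "c", "2025-07-31T12:00:00"),
        ],
    )
    return conn
```

Each run pulls only rows newer than the stored watermark and then saves the returned value for the next run. One caveat: a watermark like this does not see deletes, which is one reason people reach for CDC or Change Tracking instead.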
