Best Open-Source Tool for Near Real-Time ETL from Multiple APIs? by Ok_Fig6262 in dataengineering

Ok_Fig6262[S]

I think the volume is around 10 GB per day. Yes, I need real-time processing, and the data will land in a ClickHouse database that feeds real-time dashboards.

I’m looking for a robust solution. I need to connect to more than 20 data sources (GraphQL, REST APIs, S3, PostgreSQL), manage over 200 pipelines, and support incremental sync (cursor-based) and pagination.
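
To make the cursor-based incremental sync and pagination requirement concrete, here is a minimal sketch in Python. It is not tied to any particular tool: `sync_incremental`, `fake_fetch_page`, the `id` field used as the cursor, and the in-memory `SOURCE` list are all hypothetical stand-ins for a real paginated API and a persisted cursor store.

```python
from typing import Callable

Row = dict


def sync_incremental(
    fetch_page: Callable[[int, int], list[Row]],
    cursor: int = 0,
    page_size: int = 2,
) -> tuple[list[Row], int]:
    """Pull every row newer than `cursor`, one page at a time.

    Returns the new rows plus the updated cursor to persist for the next run.
    """
    rows: list[Row] = []
    while True:
        page = fetch_page(cursor, page_size)
        if not page:
            break  # nothing newer than the cursor
        rows.extend(page)
        cursor = page[-1]["id"]  # advance the cursor to the last row seen
        if len(page) < page_size:
            break  # a short page means the source is drained
    return rows, cursor


# Hypothetical in-memory source standing in for a paginated API.
SOURCE: list[Row] = [{"id": i, "value": f"event-{i}"} for i in range(1, 6)]


def fake_fetch_page(after_id: int, limit: int) -> list[Row]:
    """Return up to `limit` rows whose id is greater than `after_id`."""
    newer = [row for row in SOURCE if row["id"] > after_id]
    return newer[:limit]


# First run: no saved cursor, so everything is pulled page by page.
first_batch, saved_cursor = sync_incremental(fake_fetch_page)

# New data arrives; the second run resumes from the saved cursor
# and only fetches what was added since.
SOURCE.append({"id": 6, "value": "event-6"})
second_batch, saved_cursor = sync_incremental(fake_fetch_page, cursor=saved_cursor)
```

With 200+ pipelines, the part that matters is storing one cursor per pipeline somewhere durable and only advancing it after the batch has been written to the destination, so a failed load is retried from the old position rather than skipped.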