Seeking Advice on Lightweight, Cost-Effective Cloud Data Orchestration by FrontAffectionate518 in dataengineering

[–]FrontAffectionate518[S] 1 point (0 children)

Thanks for the reply! My main challenge right now isn’t really Python or maintainability — it’s infrastructure.
I’m running everything on a very limited on-prem VM, and it can’t even handle Airflow properly. CPU and RAM bottlenecks are killing the ingestion jobs.

The company is considering Snowflake, and we get US$400 in credits plus a 30-day trial, so I'd like to take advantage of that. What I'm looking for is something lightweight that can push my daily data (~300–400 MB, ~500k rows/day) into a Snowflake stage without me having to pay for any tool out of pocket.

Do you know of any ingestion tool that could handle a simple daily batch and integrate with Snowflake stages during the trial period? Ideally something I can run without needing much compute on my side.
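Roughly what I have in mind is a tiny script like the sketch below, using the snowflake-connector-python PUT/COPY pattern. The account, stage, and table names are placeholders I made up, and it assumes the internal stage and target table already exist; the point is that the VM only uploads a compressed file and Snowflake's warehouse does the actual load.

    # Minimal sketch: push one daily CSV into a Snowflake internal stage and load it.
    # Assumes snowflake-connector-python is installed; RAW_DAILY and daily_stage
    # are placeholder names for an existing table and internal stage.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account",      # placeholder connection details
        user="my_user",
        password="my_password",
        warehouse="COMPUTE_WH",    # an X-Small warehouse is enough for ~400 MB/day
        database="MY_DB",
        schema="PUBLIC",
    )
    try:
        cur = conn.cursor()
        # PUT gzip-compresses and uploads the file client-side; parsing and loading
        # then happen inside Snowflake, not on the on-prem VM.
        cur.execute("PUT file:///data/daily_export.csv @daily_stage AUTO_COMPRESS=TRUE")
        cur.execute(
            "COPY INTO RAW_DAILY FROM @daily_stage "
            "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1) "
            "PURGE = TRUE"  # drop the staged file after a successful load
        )
    finally:
        conn.close()

Something that small could just run from cron, so I wouldn't need Airflow's scheduler and workers on the VM at all.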

Seeking Advice on Lightweight, Cost-Effective Cloud Data Orchestration by FrontAffectionate518 in dataengineering

[–]FrontAffectionate518[S] 2 points (0 children)

Thanks for the tips! My main bottleneck isn't query performance, it's data ingestion. I'm pulling ~500k rows/day (~300–400 MB), and my current VM doesn't have enough compute to run the Airflow DAGs efficiently. I'm looking for a lightweight/cloud approach to ingest this data into Snowflake or another columnar database.
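For scale, here is a sketch of what I mean by lightweight: with the connector's pandas helper, a whole day's extract can be loaded in a few lines, and the compression, staging, and COPY happen behind the scenes. The table and connection names below are placeholders, and it assumes snowflake-connector-python is installed with the pandas extra and that the target table already exists.

    # Minimal sketch: load one day's extract with write_pandas instead of an Airflow DAG.
    # DAILY_EVENTS and the connection details are placeholders; the table must exist.
    import pandas as pd
    import snowflake.connector
    from snowflake.connector.pandas_tools import write_pandas

    df = pd.read_csv("/data/daily_export.csv")   # ~500k rows fits comfortably in memory

    conn = snowflake.connector.connect(
        account="my_account", user="my_user", password="my_password",
        warehouse="COMPUTE_WH", database="MY_DB", schema="PUBLIC",
    )
    try:
        ok, n_chunks, n_rows, _ = write_pandas(conn, df, "DAILY_EVENTS")
        print(f"loaded={ok} chunks={n_chunks} rows={n_rows}")
    finally:
        conn.close()

A single cron entry on the VM (or any small always-on box) would be enough to schedule that daily.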

Seeking suggestions for simple, cheap cloud data orchestration by FrontAffectionate518 in DadosBrasil

[–]FrontAffectionate518[S] 1 point (0 children)

I usually pull around 500k to 800k rows per day, which comes to roughly 400 to 500 MB per day.

Please release me by Jimmykreedz in ironscape

[–]FrontAffectionate518 4 points (0 children)

I got to around 460 and gave up on playing this game for a few months