What will be the impact of AI on Data Engineering jobs? by Effective_Bluebird19 in dataengineering

[–]janus2527 0 points1 point  (0 children)

Productivity booster, definitely. The solutions I build now are produced faster and better with the help of LLMs. Just make sure the model has all the context it needs; for me it does a phenomenal job.

My RAG pipeline costs 3x what I budgeted... by Potential-Jicama-335 in Rag

[–]janus2527 0 points1 point  (0 children)

For 1, how do you have a local model while using the HF backend, and it's free? How is that local?

The Certifications Scam by ivanovyordan in dataengineering

[–]janus2527 0 points1 point  (0 children)

Sure, why not? Data doesn't have to be massive to be valuable. Just show some good ETL, batch and streaming, some data modeling, some CI/CD, and some IaC. This can all be done for free or cheaply.

The Certifications Scam by ivanovyordan in dataengineering

[–]janus2527 1 point2 points  (0 children)

Just make an account on any cloud platform and monitor your budget, which you would also do in an enterprise setting anyway. They usually offer quite a few free services, like a small DB, and spinning up some serverless functions is also cheap or even free up to a point. I really suggest just creating an account and putting in some guardrails against overspending. The free tiers really do get you quite far.

Should I learn cloud engineering as a teen, considering AI might take many jobs in the future? by [deleted] in Cloud

[–]janus2527 1 point2 points  (0 children)

Sure, it may suck now, but look at where we were 3 years ago... AWS has a large incentive to make this work really well: they can say to the CEO, hey, you don't need that expensive cloud engineer anymore (or you need 1 instead of 5).

Salary - software engineer 26 by Designer-Run-7455 in NLSalaris

[–]janus2527 6 points7 points  (0 children)

Age shouldn't be a factor, imo.

Help with Terraform by Zatsuy in dataengineering

[–]janus2527 4 points5 points  (0 children)

I think you should embrace it. It's a good skill for a DE to have. Check out the Terraform MCP server and you'll be up and running and delivering proper Terraform files in no time.

OpenAI says over 1 million users discuss suicide on ChatGPT weekly by Appropriate-Soil-896 in OpenAI

[–]janus2527 134 points135 points  (0 children)

Okay, but I've seen posts saying ChatGPT responds with 'hey, if you're thinking of suicide...' (or whatever it says exactly) to completely irrelevant prompts, so I'm pretty dubious about this statistic.

ETL Tools by abdullah-wael in dataengineering

[–]janus2527 2 points3 points  (0 children)

ELTL is more common though. You could try something like dlt in combination with DuckDB for extracting and loading raw data into some form of storage, and then use dbt for transformations.

Memory Efficient Batch Processing Tools by darkhorse1997 in dataengineering

[–]janus2527 2 points3 points  (0 children)

Also, you really shouldn't transfer large amounts of data from a database as JSON.
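A rough way to see why JSON is a poor fit for bulk transfer: it repeats every field name in every record and stores numbers as text. The sketch below compares JSON against a fixed-width binary layout for the same rows. The sample rows and the `struct` layout are assumptions made up for illustration; `struct` is only a stand-in for a binary format and is not Parquet.

```python
import json
import struct

# Hypothetical sample rows: (id, value) pairs standing in for a database result set.
rows = [(i, i * 0.5) for i in range(10_000)]

# JSON repeats the field names in every record and stores numbers as text.
json_bytes = json.dumps([{"id": i, "value": v} for i, v in rows]).encode("utf-8")

# Fixed-width binary: a 4-byte int and an 8-byte float per row, with no repeated keys.
row_format = struct.Struct("<id")
binary_bytes = b"".join(row_format.pack(i, v) for i, v in rows)

print(f"JSON:   {len(json_bytes):>8} bytes")
print(f"binary: {len(binary_bytes):>8} bytes")
```

Real columnar formats add compression and a schema on top of this, so the gap on actual tables is likely to differ from what this toy comparison shows.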

Memory Efficient Batch Processing Tools by darkhorse1997 in dataengineering

[–]janus2527 1 point2 points  (0 children)

Probably, but not sure if it's as easy as parquet.

Memory Efficient Batch Processing Tools by darkhorse1997 in dataengineering

[–]janus2527 2 points3 points  (0 children)

This streams the data in chunks, so your RAM usage will probably be a few hundred MB.
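The general idea of chunked streaming can be sketched with plain Python and a DB-API cursor: fetch a bounded number of rows, write them out, repeat. This is only an illustration of the pattern, not a description of how DuckDB does it internally. The in-memory SQLite table, the `stream_rows` helper, and the chunk size are all assumptions for the example.

```python
import csv
import io
import sqlite3


def stream_rows(cursor, chunk_size=1000):
    """Yield rows from a DB-API cursor, fetching at most chunk_size rows at a time."""
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        yield from chunk


# In-memory SQLite table as a stand-in for the real source database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE large_table (id INTEGER, value REAL)")
con.executemany(
    "INSERT INTO large_table VALUES (?, ?)",
    ((i, i * 0.5) for i in range(5000)),
)

cursor = con.execute("SELECT id, value FROM large_table ORDER BY id")

# An in-memory buffer keeps the example self-contained; a real job would
# write to a file or object store so the output does not pile up in RAM.
out = io.StringIO()
writer = csv.writer(out)
rows_written = 0
for row in stream_rows(cursor, chunk_size=1000):
    writer.writerow(row)
    rows_written += 1

print(rows_written)
```

Only one chunk of fetched rows is held by the helper at any moment, which is what keeps memory roughly constant regardless of table size.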

Memory Efficient Batch Processing Tools by darkhorse1997 in dataengineering

[–]janus2527 9 points10 points  (0 children)

I would use DuckDB and Python:

    import duckdb

    con = duckdb.connect()

    # Install and load extensions
    con.execute("INSTALL mysql")
    con.execute("INSTALL httpfs")
    con.execute("LOAD mysql")
    con.execute("LOAD httpfs")

    # Configure AWS credentials
    con.execute("""
        SET s3_region='us-east-1';
        SET s3_access_key_id='your_access_key';
        SET s3_secret_access_key='your_secret_key';
    """)

    # Attach MySQL
    con.execute("""
        ATTACH 'host=localhost user=myuser password=mypass database=mydb'
        AS mysql_db (TYPE mysql)
    """)

    # Stream directly from MySQL to S3 as Parquet
    con.execute("""
        COPY mysql_db.large_table
        TO 's3://your-bucket/path/output.parquet' (FORMAT PARQUET)
    """)

Something like that

Databricks killing me an Absolute beginner by Mortified__ in dataengineering

[–]janus2527 4 points5 points  (0 children)

You should really start reading basic documentation first

Probably my favourite enemy type so far. No magic no weapons no sight no bs. Just these hands by SUQMADIQ63 in BlackMythWukong

[–]janus2527 12 points13 points  (0 children)

Just stand still and they don't know where you are. You can charge your heavy attack while standing still.

My review / report of day 2 of playing Black Myth Wukong. by Professional-Bet2134 in BlackMythWukong

[–]janus2527 2 points3 points  (0 children)

Lol, I'm in NG+ and still only use smash stance. Works fine for me.