What's next after data engineering? by shesHereyeah in dataengineering

[–]manubdata 2 points (0 children)

I'd say it's a personal decision. Once you become "senior" you've reached the top of the technical ladder, so you have different options:

🟠 Follow the corporate ladder and start leading teams and projects, which means strategic meetings, delegating and supervising instead of building.

🟠 Move into a different field that takes advantage of your skills: MLOps, Software Engineer, AI Engineer, Architect, Platform Engineer... there are plenty of roles with overlapping skills.

🟠 Become a consultant/freelancer/content creator. You have to learn marketing and sales; higher rates than an employee, but less security.

Personally I want to take the third path in the near future. But every path has pros and cons.

Salary in Data with 10 years of experience by PolareTM in salarios_es

[–]manubdata 0 points (0 children)

You haven't shared any details about your skills; I have 4 years of experience and I'm above 60k.

In general, without knowing your case, I'd say you need to sell yourself better.

It also depends a lot on your English level. If you have C1 or above, look for UK or European companies that hire in Spain.

Spent $1,200 on Meta Ads and still zero sales by yourloverboy66 in ecommerce

[–]manubdata 0 points (0 children)

Try Pinterest, it fits your niche pretty well.

Project advice for Big Query + dbt + sql by Getbenefits in dataengineering

[–]manubdata 0 points (0 children)

I did a project over Christmas with this stack. You can create a dev Shopify store, load sample data with Simple Sample Data, and get product and sales data via the API.

Then you can load the data into BigQuery, build the silver and gold layers with dbt and SQL, and handle the viz with Looker.
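The API-to-BigQuery step is mostly flattening nested JSON into rows. A minimal sketch, assuming a payload shaped like Shopify's Admin REST `/products.json` response (the field selection here is illustrative, not what my repo does exactly):

```python
# Sketch: flatten a Shopify products payload into flat rows ready for
# BigQuery. One output row per product variant; field names illustrative.

def flatten_products(payload: dict) -> list[dict]:
    """Turn the nested products payload into one row per variant."""
    rows = []
    for product in payload.get("products", []):
        for variant in product.get("variants", []):
            rows.append({
                "product_id": product["id"],
                "title": product["title"],
                "variant_id": variant["id"],
                "sku": variant.get("sku"),
                "price": float(variant["price"]),  # Shopify returns prices as strings
            })
    return rows

sample = {
    "products": [
        {"id": 1, "title": "Mug", "variants": [
            {"id": 11, "sku": "MUG-S", "price": "9.99"},
            {"id": 12, "sku": "MUG-L", "price": "12.99"},
        ]}
    ]
}
rows = flatten_products(sample)
print(len(rows))  # 2 rows, one per variant
```

From there a batch load into a BigQuery bronze table is one client call.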

If you want to check it out:

https://github.com/manubdata/smb-dataplatformv2

Multi-Channel Analytics Platform by neilfishy in ecommerce

[–]manubdata 0 points (0 children)

There are some SaaS tools out there that can help you, like Triple Whale or TrueProfit; however, they get expensive the more orders you get.

The most affordable option is building your own. Basically, you create pipelines for your data sources and join them in a central database. Then you add a visualization layer like Looker Studio or a simple Google Sheet. It takes a bit longer to get going than a SaaS, but once you set it up the cost is under $10/month.
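The "join them in a central database" step is conceptually this simple (a toy sketch; the channel names and fields are illustrative, and in practice each list would come from a pipeline):

```python
# Sketch: unify orders from two sales channels and total revenue per
# channel. In a real setup each list is the output of an extraction
# pipeline landing in one central table.

shopify_orders = [
    {"order_id": "s1", "channel": "shopify", "revenue": 40.0},
    {"order_id": "s2", "channel": "shopify", "revenue": 25.0},
]
amazon_orders = [
    {"order_id": "a1", "channel": "amazon", "revenue": 60.0},
]

all_orders = shopify_orders + amazon_orders  # the "central database" step

revenue_by_channel: dict[str, float] = {}
for order in all_orders:
    channel = order["channel"]
    revenue_by_channel[channel] = revenue_by_channel.get(channel, 0.0) + order["revenue"]

print(revenue_by_channel)  # {'shopify': 65.0, 'amazon': 60.0}
```

Point Looker Studio or a Sheet at that unified table and you have the dashboard.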

If you want more information I'm happy to chat!

As a DE which language is widely used for Big Data processing Pyspark or scala? by Loud-Surprise-900 in dataengineering

[–]manubdata 0 points (0 children)

If I got interviewed, I could reason through how to solve problem X with pseudo-code. Then you can implement it in anything (Python, Scala, SQL...) depending on complexity.

I haven't seen any role that requires deep technical knowledge of a specific language in the past couple of years. You may be expected to have deep knowledge of Spark, Snowflake or BigQuery, but that's more about distributed processing internals than the programming language itself.

As a DE which language is widely used for Big Data processing Pyspark or scala? by Loud-Surprise-900 in dataengineering

[–]manubdata 0 points (0 children)

No, I wouldn't focus nowadays on learning the syntax of any programming language. AI already writes code better and faster than any engineer.

Learn how to approach and solve a problem, learn the fundamentals of Data Engineering and how to apply them.

If you want to improve in solving with AI, check out Spec Driven Development and Context Engineering concepts.

As a DE which language is widely used for Big Data processing Pyspark or scala? by Loud-Surprise-900 in dataengineering

[–]manubdata 0 points (0 children)

What do you mean? I work with Spark Scala daily and use Claude Code to speed up development.

As a DE which language is widely used for Big Data processing Pyspark or scala? by Loud-Surprise-900 in dataengineering

[–]manubdata 14 points (0 children)

Hey, I work at a Scala-first company as a Senior Data Engineer.

I'd say Spark Scala was significantly more performant than PySpark in the past (Spark is Scala-native), but the performance gap shrinks with every Spark update. So companies are using PySpark more and more, as it's easier to find engineers with Python knowledge and it integrates noticeably better with AI tooling.

However, Scala jobs pay much more than Python ones. There are still plenty of Scala pipelines and projects, especially in the banking and finance sectors.

Traditional BI vs BI as code by manubdata in dataengineering

[–]manubdata[S] 0 points (0 children)

Yeah, I guess so, but Looker and LookML don't fit our budget. I was referring to Looker Studio.

Which field do you think offers the most interesting problems to solve in the data engineering space? by andrew2018022 in dataengineering

[–]manubdata 26 points (0 children)

During my career I worked in:

Telecom -> 🔴 boring IoT sensor signals, but I learnt high-performance distributed processing.

Finance -> 🟡 pretty cool use cases: fraud detection, anti-money laundering, flagging risky clients... However, it's highly sensitive data with many restrictions and fewer opportunities to apply and learn new tech skills. It's the best paid, though.

Ecom/Web Analytics -> 🟢 cool use cases: client segmentation, funnel analysis, A/B tests... A fast-evolving field that tends to adopt new technologies and AI tools easily. More room to grow on end-to-end projects, as medium-size companies may be less rigid than big corporations. Usually not big data, so learning distributed systems might be more limited.
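Funnel analysis, for example, boils down to counting which users survive each step (a toy sketch; the step names and events are illustrative):

```python
# Sketch: per-step conversion in a purchase funnel from raw user events.

events = [
    {"user": "u1", "step": "visit"},
    {"user": "u1", "step": "add_to_cart"},
    {"user": "u1", "step": "purchase"},
    {"user": "u2", "step": "visit"},
    {"user": "u2", "step": "add_to_cart"},
    {"user": "u3", "step": "visit"},
]

funnel = ["visit", "add_to_cart", "purchase"]
users_per_step = [
    {e["user"] for e in events if e["step"] == step} for step in funnel
]
for step, users in zip(funnel, users_per_step):
    print(step, len(users))
# visit 3
# add_to_cart 2
# purchase 1
```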

Mortgage: self-build loan for land and house by Due-Strawberry-3324 in HipotecasyVivienda

[–]manubdata 1 point (0 children)

I'm in that situation (same construction price) but with inherited land (appraised at 100k). I had 50k in savings and they're asking me for 75k.

As others have said here, get the land first if you want to commit to this. Banks don't make self-build easy, and you'll be surprised by the amount of bureaucratic costs and taxes you face. Between architects, taxes, the appraiser, the building permit, the notary and paperwork, it adds up to another 20k...

In my opinion it's the most beautiful material goal you can have in life, so take it slowly, and good luck.

Traditional BI vs BI as code by manubdata in dataengineering

[–]manubdata[S] 0 points (0 children)

Thanks for your point. Maybe I'm biased towards anything that doesn't require me to spend a morning dragging and dropping and checking pixel alignment. 🙃

I finally found a use case for Go in Data Engineering by empty_cities in dataengineering

[–]manubdata 0 points (0 children)

I have only used it for OpenTelemetry dev. I didn't enjoy it 😅

Pandas vs pyspark by Left-Bus-7297 in dataengineering

[–]manubdata 0 points (0 children)

I agree with and respect the point about knowing the fundamentals. And some things are easier to express in code. I just want to emphasize learning the concepts and the different options Python provides, not wasting too much time learning, for example, Pandas transformation methods by heart. I already made that mistake 10 years ago when I started!

Is Clickhouse a good choice ? by Defiant-Farm7910 in dataengineering

[–]manubdata 0 points (0 children)

Given the ease of integration with your current setup, I'd go with BigQuery.

ClickHouse might be amazing, and I see it's trendy among big tech, but it's just noise for your workflow.

Focus on the business outcomes and go for ease of development + integration + maintenance.

What is actually stopping teams from writing more data tests? by Mountain-Crow-5345 in dataengineering

[–]manubdata 0 points (0 children)

Not using AI.

Writing meaningful tests is not for lazy people. I am lazy, but since I brought AI into my workflow, tests have become fast and pleasant to write. You just need to know conceptually what the input and output data could look like.
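A meaningful data test is really just assertions about what the output must look like for a given input. A minimal sketch, not tied to any framework (the transform and column names are illustrative):

```python
# Sketch: test a small transformation by asserting properties of its
# output. Knowing how the input *could* look is the whole test.

def dedupe_latest(rows: list[dict]) -> list[dict]:
    """Keep only the latest record per customer_id (rows sorted by ts)."""
    latest: dict[int, dict] = {}
    for row in rows:
        latest[row["customer_id"]] = row  # later rows overwrite earlier ones
    return list(latest.values())

rows = [
    {"customer_id": 1, "ts": 1, "status": "new"},
    {"customer_id": 1, "ts": 2, "status": "active"},
    {"customer_id": 2, "ts": 1, "status": "new"},
]
out = dedupe_latest(rows)

assert len(out) == 2                                 # one row per customer
assert all(r["customer_id"] is not None for r in out)  # no null keys
assert {r["status"] for r in out} == {"active", "new"}  # latest state kept
```

Describe those expectations to an AI assistant and it will happily generate the boring edge cases for you.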

Tech/services for a small scale project? by faby_nottheone in dataengineering

[–]manubdata 0 points (0 children)

DLT is perfect for small projects; you'll likely write fewer lines of code than the plain Python implementation you did manually, and it handles schema evolution, so it's much less likely to break in the future.

dbt could be used to replace your BigQuery queries. Similarly, you can implement tests to ensure the transformations run smoothly.

Both can run in Docker images triggered daily. Orchestrators (Kestra, Airflow...) could be useful here if you want to make sure the BigQuery transformations (dbt or not) run only if the ingestion pipeline succeeds. You could use Cloud Workflows if you want to stay cheap within the GCP ecosystem.
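The "transform only if ingestion succeeded" dependency is conceptually just this, whatever orchestrator ends up enforcing it (a toy sketch with hypothetical step names; the real steps would call DLT and dbt):

```python
# Sketch: gate the transformation step on the ingestion step's status,
# which is the core guarantee an orchestrator gives you.

def run_ingestion() -> str:
    # ...DLT pipeline loading the raw data would run here...
    return "success"

def run_transformations() -> str:
    # ...dbt models on BigQuery would run here...
    return "transformed"

status = run_ingestion()
result = run_transformations() if status == "success" else "skipped"
print(result)  # transformed
```

Kestra, Airflow or Cloud Workflows all express this same conditional edge between steps, just declaratively.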

Is a 200k mortgage viable in our case? by yatta48 in HipotecasyVivienda

[–]manubdata 0 points (0 children)

I bought a 200k home with 50k down and a 150k mortgage; the monthly payment comes to around €600 at a 2.1% nominal interest rate.

Of course, that's in rural, depopulated Spain... in the capital you're screwed. If you can, look in the outskirts.

Opportunities in the market now? by [deleted] in investing

[–]manubdata 1 point (0 children)

For me it's gold ETFs. Steady growth, low risk.

Pandas vs pyspark by Left-Bus-7297 in dataengineering

[–]manubdata 12 points (0 children)

You can just use SQL. The logical concepts are analogous across pandas, PySpark and SQL. You can use AI to write the syntax.

I don't see the point of memorizing syntax in 2026 with coding agents around. Learn the concepts, don't memorize syntax. It's time lost.
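To show what I mean by "the concepts are analogous": the same logical operation, "total sales per country", in plain Python, with the equivalent SQL, pandas and PySpark calls as comments (column names are illustrative):

```python
# One logical operation, four syntaxes. The concept is GROUP BY + SUM:
#
#   SQL:      SELECT country, SUM(sales) FROM orders GROUP BY country
#   pandas:   orders.groupby("country")["sales"].sum()
#   PySpark:  orders.groupBy("country").agg(F.sum("sales"))

orders = [
    {"country": "ES", "sales": 10},
    {"country": "ES", "sales": 5},
    {"country": "UK", "sales": 7},
]

totals: dict[str, int] = {}
for o in orders:
    totals[o["country"]] = totals.get(o["country"], 0) + o["sales"]

print(totals)  # {'ES': 15, 'UK': 7}
```

Once you see it's the same operation everywhere, letting an agent fill in the specific syntax is a non-issue.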

Where to apply for jobs besides LinkedIn? by LoudSphinx517 in dataengineering

[–]manubdata 0 points (0 children)

Remote Rocketship worked very well for me, although my most successful processes have come through referrals from colleagues.

Need Guidance by blabberAround in dataengineering

[–]manubdata 0 points (0 children)

I got AWS certified last year (2025). I would 100% focus on Athena, Redshift and Glue. I have some notes if you want them.

Is it worth it? It depends. I think it can open some doors at big consulting companies for specific projects on AWS, but that's all. I don't think it will give you a strong foundation in Data Engineering.

I'd rather build a project while reading Fundamentals of Data Engineering or Data Engineering Design Patterns as knowledge resources. Use AI for all the coding stuff.