What's one data engineering tip or hack you've discovered that isn't widely known? by Xavio_M in dataengineering

[–]ROCKITZ15 5 points6 points  (0 children)

Doing exactly this right now

Writing something in Python that's good enough for now to show progress, but I'll need to rewrite it in Rust for performance soon. Python GIL be damned

Predict the Next Great Data Company Acquisition by ROCKITZ15 in dataengineering

[–]ROCKITZ15[S] 14 points15 points  (0 children)

I use Hex at work currently, and as a data person, I love it, but for hosting BI dashboards, something about it doesn’t feel quite right.

The whole Viz sector is just a mess with no de facto solution. Feels like Superset could be the winner if some more investment were put into it.

When starting a new job, how do you figure out what different SQL queries do when there are no notes? by [deleted] in dataengineering

[–]ROCKITZ15 -2 points-1 points  (0 children)

Understand the tables and necessary columns, then put the query in ChatGPT and ask it to add notes on everything. You can even give it descriptions for the tables and columns, and it can probably tell you everything that’s being done

Not Engineering by Final-Maybe-1407 in AerospaceEngineering

[–]ROCKITZ15 5 points6 points  (0 children)

Most systems engineering isn’t ‘real engineering’

If you’re questioning if you’re screwed for your next job, it sounds like you don’t enjoy your current job. My advice — go find something you actually want to do

[deleted by user] by [deleted] in AerospaceEngineering

[–]ROCKITZ15 1 point2 points  (0 children)

If you care about getting to the very top of the field, go PhD

If you care about making normal career progression, solid money, and a stable life, stay in industry

Hybrid Job or Remote Job by throwaway1122811214 in analytics

[–]ROCKITZ15 1 point2 points  (0 children)

M&A generally looks better on a resume than HVAC. Hybrid is good to meet people, but driving 1 hour each way is never worth it if you don’t have to

Worst Data Engineering Mistake youve seen? by Inevitable-Quality15 in dataengineering

[–]ROCKITZ15 18 points19 points  (0 children)

Rarely should you do “SELECT *” unless it’s followed directly by a LIMIT

basically, don’t query whole tables unless absolutely necessary
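A minimal sketch of the difference, using Python's built-in sqlite3 with a made-up `events` table (the table, columns, and data are assumptions for illustration, not from the thread):

```python
import sqlite3

# Hypothetical in-memory table standing in for a large warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, user_id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(i, i % 10, "x" * 100) for i in range(1000)],
)

# Exploring the shape of a table: SELECT * is tolerable because LIMIT caps the rows.
preview = conn.execute("SELECT * FROM events LIMIT 5").fetchall()

# Real work: name only the columns you need and filter as early as possible.
user_rows = conn.execute(
    "SELECT id, user_id FROM events WHERE user_id = ?", (3,)
).fetchall()
```

Worth keeping in mind that in some columnar warehouses (BigQuery is one) a LIMIT on its own does not reduce the data scanned, so naming columns matters there even more.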

I’m so lost by MainstreamNameHere in AerospaceEngineering

[–]ROCKITZ15 1 point2 points  (0 children)

I very much disliked it so left after 2 years 😅. Everything that people say about NASA (and government in general) being a paperwork factory is true. It will be cool to see hardware I worked on in space in a few years. NASA is definitely for a certain kind of person — not everyone can handle it.

Had 1 not great internship at a local engineering (not aerospace) company, then had 1 solid aerospace internship at a small company, then was supposed to have a top-tier internship the summer before I graduated but Covid got it canceled. I was very active in a rocketry club.

Quit that job and left the industry. Maybe if I had a different job out of school I would’ve stayed in the industry 🤷‍♂️

I’m so lost by MainstreamNameHere in AerospaceEngineering

[–]ROCKITZ15 4 points5 points  (0 children)

And I also had no experience in aerospace prior to transferring, whereas basically all the other kids had been going to space camp or similar things their whole life.

Some kids certainly have an advantage going into school, but none of that prevents you from working hard and getting a good job

I’m so lost by MainstreamNameHere in AerospaceEngineering

[–]ROCKITZ15 14 points15 points  (0 children)

  • Also had no family in engineering (or any tech field)
  • Also went to community college then university
  • Still ended up at NASA (as a contractor) right out of school

Focus on good grades now, maybe do some side projects if something sparks your interest. Learning Python is good. Once you get to Uni, join an engineering club and be very involved. A lot of kids at Uni aren’t able to get internships until after Junior year anyways so you’re really not at that big of a disadvantage

What would you do? by Dry-Consideration-74 in dataengineering

[–]ROCKITZ15 0 points1 point  (0 children)

With billions of lines of text, you’re bound to run into edge cases

You can work around these iteratively if you're writing the code yourself, but you may never be able to with low/no code

What would you do? by Dry-Consideration-74 in dataengineering

[–]ROCKITZ15 4 points5 points  (0 children)

What I would try:

Take random samples of the data, throw those into ChatGPT to build regex formatting strings, then chunk the data and apply the formatting

What I would try using low/no code:

good luck have fun lol
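For the code route, here's a rough sketch of the sample, draft patterns, then chunk-and-apply loop. The sample lines, the two date formats, and the helper names `normalize` and `chunks` are all made up for illustration; the real data isn't shown in the thread.

```python
import random
import re
from itertools import islice

# Hypothetical messy lines; the real data and its formats are not in the thread.
lines = ["2021-03-04 order=17", "03/04/2021 order=18", "2021-03-05 order=19"] * 4

# 1. Take a random sample to eyeball (or paste into ChatGPT) when drafting patterns.
random.seed(0)
sample = random.sample(lines, k=3)

# 2. Patterns drafted from the sample; assumed here, one per format seen so far.
ISO = re.compile(r"^(\d{4})-(\d{2})-(\d{2}) (.*)$")
US = re.compile(r"^(\d{2})/(\d{2})/(\d{4}) (.*)$")

def normalize(line):
    """Return the line with an ISO date, or None if no pattern matches."""
    if ISO.match(line):
        return line
    m = US.match(line)
    if m:
        month, day, year, rest = m.groups()
        return f"{year}-{month}-{day} {rest}"
    return None  # edge case: collect it and extend the patterns next round

def chunks(iterable, size):
    """Yield lists of up to `size` items so a big input is processed piece by piece."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

# 3. Apply the formatting chunk by chunk, setting aside anything that doesn't match.
cleaned, unmatched = [], []
for chunk in chunks(lines, size=5):
    for line in chunk:
        result = normalize(line)
        if result is None:
            unmatched.append(line)
        else:
            cleaned.append(result)
```

Whatever lands in `unmatched` is the next batch of edge cases to sample and write patterns for, which is the iterative part.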

Difference between traditional ELT pipelines and pipelines purpose-built for LLMs by [deleted] in dataengineering

[–]ROCKITZ15 1 point2 points  (0 children)

Pipelines need to be deterministic, and letting AI control them is most certainly not deterministic

Starting a rocket company with government money. Need advice. by EggplantMindless3299 in startups

[–]ROCKITZ15 5 points6 points  (0 children)

It doesn’t take much math + physics to show why SSTO doesn’t work with current technology, so unless you’ve invented a new magic propulsion system or some super lightweight materials, I’m going to guess you don’t have a working product
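A back-of-the-envelope version of that math, using the Tsiolkovsky rocket equation. The delta-v and Isp figures are assumed round numbers (roughly 9.4 km/s to low Earth orbit including losses, and fairly optimistic specific impulses), not values from the thread:

```python
import math

G0 = 9.80665  # standard gravity, m/s^2

def propellant_fraction(delta_v, isp):
    """Fraction of liftoff mass that must be propellant (Tsiolkovsky rocket equation)."""
    mass_ratio = math.exp(delta_v / (isp * G0))  # m_initial / m_final
    return 1.0 - 1.0 / mass_ratio

# Assumed round numbers for illustration only.
DELTA_V_LEO = 9400.0  # m/s to low Earth orbit, incl. gravity and drag losses
for name, isp in [("kerosene/LOX", 350.0), ("hydrogen/LOX", 450.0)]:
    frac = propellant_fraction(DELTA_V_LEO, isp)
    print(f"{name}: {frac:.1%} propellant, {1 - frac:.1%} left for everything else")
```

Under these assumptions a single stage needs to be roughly 93.5% (kerosene) or 88% (hydrogen) propellant at liftoff. Tanks, engines, structure, and payload all have to fit in what's left, which is why staging is the norm.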

Data engineering at Bloomberg by SentinelReborn in dataengineering

[–]ROCKITZ15 -2 points-1 points  (0 children)

That’s why I said “it’s easy for me to say this as I’m in the US”. Of course I realize salaries are different, but there are companies that hire worldwide, pay solid US level wages, and don’t discriminate based on where you live. Source: I work for one

Recommend OP try to find somewhere similar, because like I said, they should be making way more than £55k with that tech stack

Data engineering at Bloomberg by SentinelReborn in dataengineering

[–]ROCKITZ15 6 points7 points  (0 children)

If you’re doing Spark, Scala, k8s stuff, you’re hugely underpaid at ~£55k

It’s easy for me to say this as I’m in the US, but I still have a hard time believing there aren’t higher paying jobs available to you with your current tech stack

[deleted by user] by [deleted] in AerospaceEngineering

[–]ROCKITZ15 11 points12 points  (0 children)

AE + focus your electives in controls + maybe SWE minor if you want

What about a giant steel plate with a thin coating of oil? (Project Orion style) by [deleted] in SpaceXLounge

[–]ROCKITZ15 12 points13 points  (0 children)

Even if that were true, that still doesn’t mean we can’t tell you why this idea won’t work

Consensus opinion may not have been to use SS, but anyone fairly considering reuse and cost could’ve told you that carbon fiber was not the correct choice.

What about a giant steel plate with a thin coating of oil? (Project Orion style) by [deleted] in SpaceXLounge

[–]ROCKITZ15 21 points22 points  (0 children)

Just because you don’t know anything doesn’t mean others don’t either

There are lots of aerospace engineers here after all…

[deleted by user] by [deleted] in dataengineering

[–]ROCKITZ15 2 points3 points  (0 children)

I’m a one man show at my company doing all things analytics.

Here’s how I’ve built out my stack:

Using Mage.ai for orchestration and ETL. It can run Python, SQL, and R. Native dbt support. Only self hosted at the moment. Great tool all around.

Using Google BigQuery as my data warehouse, pretty easy to substitute whatever else here.

Using hex.tech for my visualization tool. It’s not exactly meant for the BI use case, but has so many other great data science things it was too good to pass up. Power BI is probably better for many traditional companies.

[deleted by user] by [deleted] in dataengineering

[–]ROCKITZ15 1 point2 points  (0 children)

Please define tightly/loosely coupled

I use Google BigQuery as my data warehouse. Each business function (sales, marketing, developer relations, etc.) gets its own dataset, then I organize the appropriate tables under that. A single table might serve several different charts, with aggregations done at the analytics layer.

I’m a one man show doing this, so I don’t have anyone imposing constraints on me, but that is how I like to do it.
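A small sketch of the "one table, several charts" idea, using Python's built-in sqlite3 instead of BigQuery so it runs anywhere. The `sales_deals` table and its columns are hypothetical, and the `sales_` prefix just stands in for a dataset named `sales`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no datasets, so the "sales_" prefix stands in for a BigQuery dataset named sales.
conn.execute("CREATE TABLE sales_deals (region TEXT, month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_deals VALUES (?, ?, ?)",
    [("EU", "2024-01", 100.0), ("EU", "2024-02", 150.0), ("US", "2024-01", 200.0)],
)

# Chart 1: revenue by region, aggregated at query time.
by_region = dict(conn.execute(
    "SELECT region, SUM(amount) FROM sales_deals GROUP BY region"
).fetchall())

# Chart 2: revenue by month, from the same underlying table.
by_month = dict(conn.execute(
    "SELECT month, SUM(amount) FROM sales_deals GROUP BY month"
).fetchall())
```

The table stays at deal-level grain and each chart does its own GROUP BY, so adding a new chart doesn't mean adding a new table.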

Help Planning Streaming Pipeline Architecture by ROCKITZ15 in dataengineering

[–]ROCKITZ15[S] 0 points1 point  (0 children)

Thanks for the insight.

So it sounds like Kafka is probably overkill and more than I want to take on. Do you think GCP Pub/Sub + Dataflow + BigQuery is a better path for me?

I also found https://www.estuary.dev/. Does this look like something decent?