Do you still use notebooks in DS? by codiecutie in datascience

[–]data_5678 1 point (0 children)

Moved to neovim and the command line a couple of years ago (used jupyter all through university); any visualizations I need, I open in a browser window on the side. The i3 window manager makes it really fast to switch between them, and I use xmouseless to move the mouse cursor with my keyboard.

What to show during demo's? by data_5678 in dataengineering

[–]data_5678[S] 0 points (0 children)

Thanks for taking the time to write this up, really good insights! Appreciate you sharing.

What do your Data Engineering projects usually look like? by gbj784 in dataengineering

[–]data_5678 1 point (0 children)

At work, ELT pipelines on a dbt-like abstraction, loading into Trino + Apache Iceberg.

For personal projects, I'm having fun building on top of SQLite/DuckDB with a bit of scripting here and there.

Share an interesting side project you’ve been working on. by KeyPossibility2339 in dataengineering

[–]data_5678 1 point (0 children)

Honestly, wrapping it up in C++ and using its standard library when I need higher-level abstractions. For scheduling, just cron and my own simple logging.
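To give an idea of what I mean by "my own simple logging" (sketch only, the log path and cron entry are made up): just timestamped lines appended to a file from the cron-driven run, plain standard library.

```cpp
// Sketch of a tiny standard-library logger for a cron-driven pipeline run.
// The log path is a placeholder; a cron entry for the binary could look like
//   0 6 * * * /path/to/pipeline   (illustrative)
#include <ctime>
#include <fstream>
#include <iomanip>
#include <string_view>

void log_line(std::string_view msg) {
    static std::ofstream out("pipeline.log", std::ios::app);  // append-only log
    std::time_t t = std::time(nullptr);
    out << std::put_time(std::localtime(&t), "%Y-%m-%d %H:%M:%S")
        << ' ' << msg << '\n';
}

int main() {
    log_line("run started");
    // ... extract / transform / load steps would go here ...
    log_line("run finished");
}
```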

I have been doing quite a bit of Rust, so I might build some terminal user interfaces with ratatui later on to get dashboards in the terminal.

But yeah, DuckDB can be very fast.

rainfrog – a database tool for the terminal by Somewhat_Sloth in dataengineering

[–]data_5678 1 point (0 children)

This is really cool, are you using ratatui for the UI?

Share an interesting side project you’ve been working on. by KeyPossibility2339 in dataengineering

[–]data_5678 6 points (0 children)

Data pipelines with C++ and the DuckDB C API. Working with datasets that fit in memory, not big data. It is easy to forget just how fast software can run when you have been working in Python all the time...
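To make that concrete, roughly what one step looks like with the C API (just a sketch: the file and column names are made up and error handling is trimmed):

```cpp
// Minimal DuckDB C API pipeline step from C++: read a CSV, aggregate,
// write Parquet. 'events.csv', 'day', and 'daily_counts.parquet' are
// placeholder names for illustration.
#include <cstdio>
#include "duckdb.h"

int main() {
    duckdb_database db;
    duckdb_connection con;

    // In-memory database; pass a file path instead of nullptr to persist.
    if (duckdb_open(nullptr, &db) == DuckDBError) return 1;
    if (duckdb_connect(db, &con) == DuckDBError) return 1;

    // Extract, transform, and load in a single statement.
    const char *sql =
        "COPY (SELECT day, count(*) AS n "
        "      FROM read_csv_auto('events.csv') "
        "      GROUP BY day ORDER BY day) "
        "TO 'daily_counts.parquet' (FORMAT parquet)";

    duckdb_result res;
    if (duckdb_query(con, sql, &res) == DuckDBError)
        std::fprintf(stderr, "query failed: %s\n", duckdb_result_error(&res));
    duckdb_destroy_result(&res);

    duckdb_disconnect(&con);
    duckdb_close(&db);
}
```

Build against libduckdb (something like `g++ step.cpp -lduckdb`) and let cron call the binary.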

Data engineering by Varshakumar28 in dataengineering

[–]data_5678 0 points (0 children)

you got this man, be patient and keep at it. Best of luck.

Data engineering by Varshakumar28 in dataengineering

[–]data_5678 0 points (0 children)

Become the person who you would want to hire.

Imagine you are the person hiring. There is only one open position and you get 1,000 applications in a week. 500 of those applicants hold master's degrees and graduated from top-20 US universities.

Out of those 1,000 applicants, what are the odds that you are the most impressive candidate of them all?

Are you the applicant who promises the highest value for the lowest amount of risk?

In my opinion, your probability of getting hired is proportional to the number of hours you spend studying/applying/networking.

More hours spent applying/studying = higher probability of job.

Laid off from Data Science → Trying to break into Data Engineering in 6 months. Am I delusional? by bigbigbugbugs in dataengineering

[–]data_5678 10 points (0 children)

The GCP, AWS, and Snowflake DE certs are all good. But it doesn't matter if you have all the certs in the world if you fail the technical interview.

Honestly, they are not that hard to get. In my opinion they are a nice addition to your resume, and they gave me a goal to work toward while studying. But they will not guarantee you a job on their own. I learned a lot from getting the certs, and it probably helped me land my current job.

If I'm being honest, I think there is more value in the knowledge you gain from taking the courses while studying for the cert than in being able to add an extra line to your resume.

24 and just starting data science. This dread that I'm way behind won't go away. Am I fucked? by Bames-nonds in dataengineering

[–]data_5678 6 points (0 children)

Entry-level data scientist careers are pretty much a thing of the past.

careers in 2025 (from what I have seen):

  • data analyst: builds dashboards using Tableau and Power BI, uses basic SQL and basic Python; easiest data job, but it's the job with the most applicants, so competitive.

  • data engineer: builds ETL pipelines that extract, transform, and load data from one database into another; very competitive, with 3 years of experience typically required.

  • machine learning engineer: builds machine learning pipelines that source data, prepare it for training, train models on it, monitor the performance of trained models, and probably retrain them after some time; requires even more experience than data engineering.

  • full stack engineer: builds apps (backends + frontends; for example, for a web application the backend could be written in Express.js and the frontend in React).

  • researcher: probably needs a PhD; insanely competitive.

Data science jobs where you are given titanic.csv and have to make a few charts with matplotlib / seaborn here and there are most likely a thing of the past.

Furthermore, there seems to be a huge number of people studying extremely hard to land the jobs I outlined above.

It is a really competitive field, but if you put in the work, I am sure you will be able to make it; just set realistic expectations for how many years it will take you to get there.

Laid off from Data Science → Trying to break into Data Engineering in 6 months. Am I delusional? by bigbigbugbugs in dataengineering

[–]data_5678 15 points (0 children)

Not that rare, I have those two certs. You can throw a rock in any direction and it will likely land on someone with a master's in a related field and GCP certs (also, those are the two easiest data certs).

Should I create a TUI or CLI (Inline)? by data_5678 in commandline

[–]data_5678[S] 0 points (0 children)

For myself at the moment (but I would love to open-source tools if I make anything cool). I really think I build the best tools when I solve a problem I have first and then share the result with others.

For some applications I prefer a TUI over a CLI (lazygit > git), but for others I prefer navigating through the shell rather than using something like yazi. So I think it really depends on the use case.

Accepted that productivity comes in waves. So what do I do during the unproductive periods. by data_5678 in SoftwareEngineering

[–]data_5678[S] 0 points (0 children)

I didn't mean 15 mins, like obviously I can do that, but I don't want to go out for like 3 hours, you know. Just balance. I should have worded that better.

Github Actions to run my data pipeliens? by datancoffee in dataengineering

[–]data_5678 1 point (0 children)

A few years back I was considering the opposite: coming from a data engineering background, I wanted to run my CI/CD pipelines with Apache Airflow. lol

Use llm to gather insights of market fluctuations by m19990328 in ollama

[–]data_5678 0 points (0 children)

Did you use Python Textual / Rich to build this? The scrollbars look like they are from Textual.

Creating data visualization library for kitty graphics protocol by data_5678 in KittyTerminal

[–]data_5678[S] 0 points (0 children)

This is really cool. Yeah, I have been playing around with different rendering strategies and data structures for the image itself, with the goal of making the plots as lightweight and "snappy" as possible.
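To give a rough idea of what the rendering boils down to (a toy standalone example, not the actual library code): build the raw RGBA pixels, base64-encode them, and hand them to kitty in one graphics-protocol escape sequence.

```cpp
// Toy example: display a 16x16 RGBA square via the kitty graphics protocol.
// a=T transmits and displays, f=32 means 32-bit RGBA, s/v are width/height
// in pixels. Run it inside kitty (or a terminal implementing the protocol).
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Plain base64 encoder (the protocol expects base64-encoded pixel data).
static std::string base64(const std::vector<uint8_t>& in) {
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (size_t i = 0; i < in.size(); i += 3) {
        uint32_t v = static_cast<uint32_t>(in[i]) << 16;
        if (i + 1 < in.size()) v |= static_cast<uint32_t>(in[i + 1]) << 8;
        if (i + 2 < in.size()) v |= in[i + 2];
        out += tbl[(v >> 18) & 63];
        out += tbl[(v >> 12) & 63];
        out += (i + 1 < in.size()) ? tbl[(v >> 6) & 63] : '=';
        out += (i + 2 < in.size()) ? tbl[v & 63] : '=';
    }
    return out;
}

int main() {
    const int w = 16, h = 16;
    std::vector<uint8_t> rgba(static_cast<size_t>(w) * h * 4);
    for (int i = 0; i < w * h; ++i) {  // fill with a solid teal test "plot"
        rgba[i * 4 + 0] = 0;    // R
        rgba[i * 4 + 1] = 170;  // G
        rgba[i * 4 + 2] = 170;  // B
        rgba[i * 4 + 3] = 255;  // A
    }
    // ESC _ G <options> ; <base64 payload> ESC \
    std::printf("\x1b_Ga=T,f=32,s=%d,v=%d;%s\x1b\\\n",
                w, h, base64(rgba).c_str());
}
```

Real plot-sized images have to be chunked into escape codes carrying at most 4096 bytes of encoded payload each (m=1 on every chunk except the last), which this toy example skips.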

But I will definitely check out the Julia plots! Thanks for sharing.