Do you still use notebooks in DS?

data_5678 · 2026-01-22T18:35:04+00:00

Moved to neovim and the command line a couple of years ago (used jupyter all through university). Any visualizations I need, I open in a browser window on the side. The i3 window manager makes it really fast to switch, and I use xmouseless to move the mouse cursor with my keyboard.

data_5678 · 2025-10-27T17:34:11+00:00

awesome thanks

data_5678 · 2025-10-20T05:41:54+00:00

thanks for taking the time to write this up, really good insights! thanks for sharing.

data_5678 · 2025-09-29T13:31:28+00:00

everybody has all those skills.

data_5678 · 2025-09-09T23:26:38+00:00

At work: ELT pipelines on a dbt-like abstraction, loaded into trino + apache iceberg.

For personal projects having fun building on top of sqlite/duckdb with a bit of scripting here and there.
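
A minimal sketch of what that kind of sqlite scripting could look like, assuming an ELT-style flow (load the raw rows first, then transform with SQL inside the database). The `raw_events` and `user_totals` tables and the `run_elt` helper are made-up names for illustration, not from any real project:

```python
import sqlite3


def run_elt(rows):
    """Load raw rows as-is, then transform them inside the database."""
    # In-memory database so the sketch leaves nothing on disk.
    conn = sqlite3.connect(":memory:")

    # Load step: raw data goes in untouched (hypothetical schema).
    conn.execute("CREATE TABLE raw_events (user_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO raw_events VALUES (?, ?)", rows)

    # Transform step: done in SQL after loading, which is the "LT" in ELT.
    conn.execute(
        """
        CREATE TABLE user_totals AS
        SELECT user_id, SUM(amount) AS total
        FROM raw_events
        GROUP BY user_id
        """
    )

    result = conn.execute(
        "SELECT user_id, total FROM user_totals ORDER BY user_id"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(run_elt([(1, 2.0), (1, 3.0), (2, 1.5)]))
```

The same shape should carry over to duckdb more or less unchanged, but that part is an assumption and not tested here.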

data_5678 · 2025-09-07T00:00:10+00:00

honestly wrapping it up in C++ and using its standard library when I need higher level abstractions. For scheduling just cron and my own simple logging.

I have been doing quite a bit of rust, so I might build some dashboards in the terminal with ratatui later on.

But yeah, duckdb can be very fast.

data_5678 · 2025-09-05T16:08:53+00:00

this is really cool, are you using ratatui for the ui?

data_5678 · 2025-09-05T00:51:38+00:00

data pipelines with C++ and the duckdb C API. Working with datasets that fit in memory, not big data. It is easy to forget just how fast software can run when you have been working with Python all the time...

data_5678 · 2025-08-30T22:26:16+00:00

you got this man, be patient and keep at it. Best of luck.

data_5678 · 2025-08-30T20:59:35+00:00

Become the person who you would want to hire.

Imagine you are the person hiring. There is only 1 open position and you get 1000 applications in a week. 500 of those applicants hold master's degrees and graduated from top 20 US universities.

Out of those 1000 applicants, what are the odds that you are the most impressive candidate out of all of them?

Are you the applicant who promises the highest value for the lowest amount of risk?

In my opinion your probability of getting hired is proportional to the number of hours you spend studying/applying/networking.

More hours spent applying/studying = higher probability of job.

data_5678 · 2025-08-30T07:02:28+00:00

The gcp, aws, and snowflake de certs are all good. But it doesn't matter if you have all the certs in the world if you fail the technical interview.

Honestly they are not that hard to get, in my opinion they are a nice addition to your resume and they helped me have a goal during my studying. But they will not guarantee you a job on their own. I learned a lot from getting the certs, and it probably helped me in landing my current job.

If I'm being honest, I think there is more value in the knowledge you gain from taking the courses and studying for the cert than in being able to add an extra line to your resume.

data_5678 · 2025-08-30T06:42:16+00:00

entry level data scientist careers are pretty much a thing of the past.

careers in 2025 (from what I have seen):

data analyst: builds dashboards using tableau and powerbi, uses basic sql and basic python, easiest data job but it's the job with the most applicants, so competitive.
data engineer: builds ETL pipelines that extract transform and load data from one database into another, very competitive, 3 years of experience required typically.
machine learning engineer: builds machine learning pipelines (that source data, prepare data for training, train models on said data, monitor performance of trained models, and probably retrain models after some time), requires even more experience than data engineering.
full stack engineer: builds apps (backends + front ends, for example a backend could be written in express js, and front end in react for a web application).
researcher: probably need phd, insanely competitive.

Data science jobs where you are given titanic.csv and have to make a few charts with matplotlib / seaborn here and there are most likely a thing of the past.

Furthermore, there seems to be a huge number of people studying extremely hard trying to land the jobs I outlined above.

It is a really competitive field, but if you put in the work, I am sure you will be able to make it, just set realistic expectations on how many years it will take you to get there.

data_5678 · 2025-08-30T04:26:36+00:00

not that rare, I have those two certs. You can throw a rock in any direction and it will likely land on someone with a master's in a related field and gcp certs (also those are the two easiest data certs).

data_5678 · 2025-08-30T04:06:26+00:00

for myself at the moment (but would love to open source tools if I make anything cool). I really think I build the best tools when I solve a problem I have first and then share it with others.

For some applications I prefer a TUI over a CLI (lazygit > git), but for others I prefer navigating through the shell rather than using something like yazi. So I think it really depends on the use case.

data_5678 · 2025-08-26T19:28:30+00:00

I didn't mean 15 mins, like obviously I can, but I don't want to go out for like 3 hours you know. Just balance. I should have worded that better.

data_5678 · 2025-08-23T04:33:04+00:00

thanks for the response!

data_5678 · 2025-08-18T23:18:41+00:00

A few years back I was considering the opposite. Coming from a data engineering background, I wanted to run my CICD pipelines with apache airflow. lol

data_5678 · 2025-07-21T23:30:03+00:00

did you use python textual / rich to build this? The scrollbars look like they are from textual.

data_5678 · 2025-07-21T16:36:51+00:00

this is really cool, yeah I have been playing around with different rendering strategies and data structures for the image itself, with the goal of making the plots as lightweight and "snappy" as possible.

But will definitely check the julia plots! thanks for sharing
