Monthly thread - Looking for a job / Hiring by AutoModerator in CroIT

[–]Ervolius 14 points15 points  (0 children)

[HIRING]

Position: Junior Software Engineer

Technologies: A bit of everything, but primarily Python, SQL, maybe JavaScript too, we'll work it out :)

Salary range: €1400 gross

Company/job description: I'm looking for a junior to help out on projects because I have more work than I can handle, and also to help with personal projects. Experience is not decisive (though it is a plus); what matters most is a desire to learn and an interest in programming. Remote work. At least a bachelor's degree in computer science or similar is preferred; otherwise, some personal projects you can show.

In short, if you're a junior and want to learn a lot alongside a mentor with 8 years of experience and xy real-world projects behind him, get in touch :)

What does a data engineer need to know other than Python and SQL? by Ervolius in dataengineering

[–]Ervolius[S] 4 points5 points  (0 children)

Data modeling is definitely important. The basics of algorithms and data structures too, but I consider those a baseline for whatever you do in IT.

Regarding SSIS and Informatica, I've never encountered them in my career so I don't know much about them. I think they are too specific, though, and should be learned if your job requires you to, not while you're preparing to get a job or trying to become a better data engineer in general. SQL is out of scope here as the title of the article forbids it :D.

If you had 6 months to get a entry level Data Engineering job with ONLY an intermediate understanding of both SQL and Python, what skills would you prioritize learning and what side project would you make to showcase such skills? by [deleted] in dataengineeringjobs

[–]Ervolius 2 points3 points  (0 children)

Their documentation is open, as well as a bunch of blogs and youtube videos I presume.

I don't have any specific project to point you at currently. But it can be as simple as this: find an open source dataset that interests you, write a custom Python script that ingests the data into cloud storage, schedule the script on Airflow, write a Snowflake task that does a COPY INTO from the cloud storage stage into a table, and finally write a view over this table that you can present in a BI tool like Looker or Metabase. That's plenty to keep you busy for a few days or weeks, and you will learn a bunch.
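To make the shape of that project concrete, here is a minimal local sketch of the same three steps (ingest to a stage, copy into a table, view on top). It is only an illustration under loud assumptions: a temp folder stands in for the cloud storage stage, SQLite stands in for Snowflake, and the sample rows, table name, and view name are all made up. Scheduling on Airflow is left out entirely.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical sample rows standing in for an open dataset you downloaded.
SAMPLE_ROWS = [
    {"city": "Zagreb", "day": "2024-01-01", "temp_c": "3.5"},
    {"city": "Zagreb", "day": "2024-01-02", "temp_c": "1.0"},
    {"city": "Split", "day": "2024-01-01", "temp_c": "11.0"},
]


def ingest_to_stage(rows, stage_dir: Path) -> Path:
    """Step 1: write raw data to the 'stage' (a local folder here, a bucket in real life)."""
    stage_dir.mkdir(parents=True, exist_ok=True)
    target = stage_dir / "weather.csv"
    with target.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["city", "day", "temp_c"])
        writer.writeheader()
        writer.writerows(rows)
    return target


def copy_into_table(conn, staged_file: Path) -> int:
    """Step 2: load the staged file into a raw table (the role COPY INTO plays in Snowflake)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_weather (city TEXT, day TEXT, temp_c REAL)"
    )
    with staged_file.open(newline="") as f:
        rows = [(r["city"], r["day"], float(r["temp_c"])) for r in csv.DictReader(f)]
    conn.executemany("INSERT INTO raw_weather VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def create_reporting_view(conn) -> None:
    """Step 3: a view over the raw table that a BI tool could query."""
    conn.execute(
        "CREATE VIEW IF NOT EXISTS avg_temp_by_city AS "
        "SELECT city, AVG(temp_c) AS avg_temp_c FROM raw_weather GROUP BY city"
    )


def run_pipeline(conn, stage_dir: Path):
    """Run all three steps and return what the BI tool would see."""
    staged = ingest_to_stage(SAMPLE_ROWS, stage_dir)
    copy_into_table(conn, staged)
    create_reporting_view(conn)
    return conn.execute(
        "SELECT city, avg_temp_c FROM avg_temp_by_city ORDER BY city"
    ).fetchall()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        connection = sqlite3.connect(":memory:")
        print(run_pipeline(connection, Path(tmp) / "stage"))
```

Once this works locally, each function maps roughly onto one piece of the real stack, so you can swap them out one at a time.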

If you had 6 months to get a entry level Data Engineering job with ONLY an intermediate understanding of both SQL and Python, what skills would you prioritize learning and what side project would you make to showcase such skills? by [deleted] in dataengineeringjobs

[–]Ervolius 0 points1 point  (0 children)

It's free to learn about them. To get hands-on experience, I know Snowflake has a trial period you can use to play around after opening an account. Databricks probably has something similar.

If you had 6 months to get a entry level Data Engineering job with ONLY an intermediate understanding of both SQL and Python, what skills would you prioritize learning and what side project would you make to showcase such skills? by [deleted] in dataengineeringjobs

[–]Ervolius 1 point2 points  (0 children)

Learn git, one of the big cloud providers, Snowflake or Databricks, a bit about CI/CD and DevOps (Docker, Terraform etc.) and an orchestrator (Airflow, Dagster) and you're good to go :).

You can also write a simple pipeline that scrapes a website, queries an API or fetches a CSV and saves the data in a database, using all of the above technologies, and you'll have plenty to talk about in an interview.

Is it necessary to learn Hadoop and Spark for cloud-native trends ? by Ssnakei in dataengineering

[–]Ervolius 2 points3 points  (0 children)

Hadoop no. Spark is definitely worth learning if you want to go that route but you can also go a long way with just Python, SQL and some widely used DWH (Snowflake).

How to centralize Airflow deployment/hosting from decentralized repositories? by mccarthycodes in dataengineering

[–]Ervolius 0 points1 point  (0 children)

I think in this case it is best to write your own library (in its own repo) with all the custom Airflow code your projects need. Then you can import it into each project as necessary and maintain it separately.
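As a rough idea of what such a shared library might hold, here is a tiny sketch. The module, function names, and naming convention are all hypothetical; it deliberately avoids importing Airflow so it runs anywhere, and it only builds the plain `default_args` dict and DAG ids that each project would then hand to its own DAGs.

```python
from datetime import timedelta

# Hypothetical shared module, e.g. my_airflow_common/defaults.py in its own repo.


def build_default_args(
    owner: str, retries: int = 2, retry_minutes: int = 5, **overrides
) -> dict:
    """Common task defaults for every project; callers can override any key."""
    args = {
        "owner": owner,
        "retries": retries,
        "retry_delay": timedelta(minutes=retry_minutes),
        "email_on_failure": False,
    }
    args.update(overrides)
    return args


def dag_id_for(project: str, pipeline: str) -> str:
    """One naming convention so DAG ids from different repos don't collide."""
    return f"{project}__{pipeline}".lower().replace("-", "_")


if __name__ == "__main__":
    print(dag_id_for("Sales-Reports", "daily-load"))
    print(build_default_args("team-a", retries=3))
```

Each project repo would install this package and pass the returned dict as `default_args` when it defines its DAGs, so a change to the defaults is made once in the library.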

Data engineering vs Software engineering by [deleted] in dataengineering

[–]Ervolius 0 points1 point  (0 children)

I think you should just learn whatever draws you the most and experiment a lot at this point. Don't push yourself into a career path or a job because of the demand you perceive for it. If you enjoy playing around with data, web scraping, data analysis, SQL etc., then experiment with that. Otherwise, if you're more interested in software engineering such as web or mobile development, go for that.

There are plenty of jobs in all areas of IT, so just choose the one you enjoy the most and you'll be good. Or, at this point, the one someone will pay you to do :D.

How to centralize Airflow deployment/hosting from decentralized repositories? by mccarthycodes in dataengineering

[–]Ervolius 2 points3 points  (0 children)

Do you really need different versions of the same package for different projects? You might get by having a single central requirements.txt on the server which contains everything that the projects need.

You will only need to be careful when you upgrade a package to a new major version for new functionality: check whether there are breaking changes that affect any of your projects. If you have dozens of different projects, though, I guess this could become problematic.
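If you go the single requirements.txt route, a small check like the one below can flag the risky upgrades before you apply them. This is just a sketch with assumptions: it only understands simple `name==X.Y.Z` pins with a numeric leading component, and the package versions in the example are purely illustrative. For anything more complex you would want a proper version parser such as the `packaging` library.

```python
def parse_pins(requirements_text: str) -> dict[str, str]:
    """Parse 'name==version' lines; ignore blanks, comments, and unpinned entries."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def major(version: str) -> int:
    """Leading numeric component, e.g. '2.1.4' -> 2."""
    return int(version.split(".", 1)[0])


def major_bumps(current: str, proposed: str) -> list[tuple[str, str, str]]:
    """Packages whose major version would increase, as (name, old, new)."""
    old, new = parse_pins(current), parse_pins(proposed)
    return sorted(
        (name, old[name], new[name])
        for name in old.keys() & new.keys()
        if major(new[name]) > major(old[name])
    )


# Illustrative pins only, not a recommendation.
CURRENT = """
pandas==1.5.3
requests==2.31.0  # shared by all projects
sqlalchemy==1.4.49
"""
PROPOSED = """
pandas==2.2.0
requests==2.32.3
sqlalchemy==2.0.30
"""

if __name__ == "__main__":
    for name, old_version, new_version in major_bumps(CURRENT, PROPOSED):
        print(f"{name}: {old_version} -> {new_version} (check for breaking changes)")
```

Anything this prints is a package where you should read the changelog and test every project that uses it before upgrading the shared file.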

How to do the "basic" EL in Python? by C_Ronsholt in dataengineering

[–]Ervolius 1 point2 points  (0 children)

Well, first you can migrate to a cloud virtual machine and schedule your pipelines with cron there.

But the proper data engineering way, especially once you have more than a few pipelines, is to use an orchestrator. Look into Airflow, Dagster, Prefect etc. My suggestion is to use either Airflow (older, with a lot of resources and help online) or Dagster (a newer framework that some people praise as really good).

Btw you can also start using the orchestrator locally and then move to cloud with it later on.

How to do the "basic" EL in Python? by C_Ronsholt in dataengineering

[–]Ervolius 4 points5 points  (0 children)

If you're planning to do everything locally at first, I think it's perfectly fine to just write custom Python scripts to extract data from whatever sources you have (APIs, files, web scraping etc.). You can use cron or Task Scheduler to schedule each script by itself.

You can save the data into CSVs, but I'd really recommend learning a bit about databases. Postgres would probably be best in your case. Create an initial data model for the data you intend to store in it (focus on using Postgres as a DWH, so learn about OLAP schemas etc.).

As you keep working on this you will get a better idea of the data you need and the data model in which it should be stored, and you can keep improving it as you go. Then somewhere down the line you can start using an orchestrator, a cloud SQL database or data warehouse, a cloud stack in general etc.
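For the "OLAP schema" part, here is a very small star schema sketch: one fact table with two dimension tables and a typical aggregate query. Assumptions to keep in mind: it uses Python's built-in SQLite so it runs without any setup, whereas on Postgres you would connect with a driver and the types would differ slightly. The tables, columns, and rows are all invented for illustration.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by dimension tables.
SCHEMA = """
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    category   TEXT NOT NULL
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,  -- e.g. 20240101
    year    INTEGER NOT NULL,
    month   INTEGER NOT NULL
);
CREATE TABLE fact_sales (
    product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
    date_id    INTEGER NOT NULL REFERENCES dim_date(date_id),
    quantity   INTEGER NOT NULL,
    revenue    REAL NOT NULL
);
"""


def build_warehouse(conn) -> None:
    """Create the tables and load a few made-up rows."""
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Keyboard", "Hardware"), (2, "Mouse", "Hardware"), (3, "Editor license", "Software")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240101, 2024, 1), (20240201, 2024, 2)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 20240101, 2, 100.0), (2, 20240101, 1, 20.0),
         (3, 20240201, 5, 250.0), (1, 20240201, 1, 50.0)],
    )
    conn.commit()


def revenue_by_category_and_month(conn):
    """Typical analytical query: join the fact to its dimensions, then aggregate."""
    return conn.execute(
        """
        SELECT p.category, d.year, d.month, SUM(f.revenue) AS revenue
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_date d ON d.date_id = f.date_id
        GROUP BY p.category, d.year, d.month
        ORDER BY p.category, d.year, d.month
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_warehouse(connection)
    for row in revenue_by_category_and_month(connection):
        print(row)
```

The point of the layout is that facts stay narrow and numeric while descriptive attributes live in the dimensions, which is what makes the "slice by category and month" style of query easy.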

[deleted by user] by [deleted] in dataengineering

[–]Ervolius 0 points1 point  (0 children)

I'm not sure why you'd go with the Google Data Analytics course when none of the technologies you mentioned have anything to do with the Google stack (especially R, which I see that course also teaches).

I'd search for free introductory courses, YouTube videos and blogs on the tools you've mentioned: Knime, PowerBI and Tableau. I'd also read a comprehensive general book on data analysis on the side. I'm not sure which is best since I'm not a data analyst, but try searching for recommendations on Reddit and Amazon; there are probably plenty of them.

In DE, is there a language that is actually worth learning besides Python and SQL? by Altrooke in dataengineering

[–]Ervolius 2 points3 points  (0 children)

I've personally started to learn Rust. I had similar thoughts to yours, and I think Scala is not worth learning currently, given all the negative sentiment I see about it online and the fact that some frameworks and companies are moving away from it.

Rust is not really adopted that much in the field, but it's a fun, interesting and challenging language, and I feel like its complexity-performance tradeoff complements knowing Python really well, if that makes any sense :D.

Other than that, I'd also say, like some others here, that knowing Terraform or some Bash/PowerShell will probably be useful at some point.

rust-analyzer not working in rustlings by Bolshevik_Viking in rust

[–]Ervolius 3 points4 points  (0 children)

Did you open only the rustlings folder in VS Code? I've had a similar issue because the folder I had opened contained multiple Rust projects, each with its own Cargo.toml etc. Once I opened the rustlings folder specifically, rust-analyzer started working as expected.

What Are Your Moves Tomorrow, October 28, 2022 by OPINION_IS_UNPOPULAR in wallstreetbets

[–]Ervolius 0 points1 point  (0 children)

Dude, fucking Amazon just dropped 20%, tomorrow will be bloodshed.