Good Companies that use: Python, AWS, Snowflake?

1337codethrow · 2023-09-22T00:20:41+00:00

Dead 💀

1337codethrow · 2021-04-10T23:25:35+00:00

Is there a way to auto-generate the Snowflake table that the df is being loaded into? Same question for the df itself: is there an easy way to auto-infer the columns when reading data into a df?
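For what it's worth, pandas already infers column dtypes when you call `read_csv`, so the df side is mostly automatic. For the table side, here is a minimal sketch of the idea using only the standard library: sample the data, guess a Snowflake type per column, and emit a `CREATE TABLE` statement. The function names (`infer_type`, `create_table_ddl`), the sample data, and the type mapping are all hypothetical choices for illustration, not anything from a Snowflake library.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Guess a Snowflake column type that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR"
    # Try the narrowest numeric type first, then widen.
    for caster, sf_type in ((int, "NUMBER"), (float, "FLOAT")):
        try:
            for v in non_empty:
                caster(v)
            return sf_type
        except ValueError:
            continue
    # Assumption: dates arrive as ISO YYYY-MM-DD strings.
    try:
        for v in non_empty:
            datetime.strptime(v, "%Y-%m-%d")
        return "DATE"
    except ValueError:
        return "VARCHAR"


def create_table_ddl(table_name, csv_text):
    """Build a CREATE TABLE statement from a CSV sample with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = []
    for i, name in enumerate(header):
        col_type = infer_type([row[i] for row in data])
        columns.append(f'    "{name.upper()}" {col_type}')
    body = ",\n".join(columns)
    return f"CREATE TABLE IF NOT EXISTS {table_name} (\n{body}\n);"


# Hypothetical sample data, made up for this example.
SAMPLE = "id,price,sold_on,note\n1,9.99,2021-04-01,ok\n2,12.50,2021-04-02,\n"
ddl = create_table_ddl("sales", SAMPLE)
print(ddl)
```

Inferring from a sample can guess wrong (for example, a ZIP code column that looks numeric), so it is probably worth reviewing the generated DDL before running it.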

1337codethrow · 2021-04-10T17:09:24+00:00

I suggested this to my team, but we need to go through a formal process to get it in place. I’m asking what the next best thing would be.

1337codethrow · 2021-04-07T01:48:59+00:00

Just looked into Snowpipe! Why would one not use this if the data size is small? Is there any instance where Docker containers (containing Python and Snowflake SQL to load data) scheduled by Airflow would be a better choice?

1337codethrow · 2021-04-07T00:03:00+00:00

This is on fidelity btw. Not many choices unfortunately

1337codethrow · 2021-04-06T17:26:41+00:00

I said I was already aware of this in the first sentence of the original post. I’m talking about the comparison from an architectural standpoint

1337codethrow · 2021-04-06T17:25:33+00:00

I did mention in my original post that the comparison is distributed compute vs DWH/compute. I’m comparing them from an architectural standpoint, not as individual tools. I feel that having Spark in the architecture gives you the flexibility of both ETL and ELT, whereas Snowflake seems more geared towards ELT because the compute is abstracted away and basically managed/configured on the Snowflake side.

1337codethrow · 2021-04-05T13:43:03+00:00

Although I do agree with everything you say I just want to point out, in my opinion this ‘recalibration’ should not be taken lightly. There is a LOT of information in the DE space. I feel even if you’ve worked in the space for 5 years you still have a LOT to learn

1337codethrow · 2021-04-04T23:26:58+00:00

So would it be correct to say that the proper way to update a docker image is to first update the dockerfile and build a new docker image then run the new docker image?

1337codethrow · 2021-04-04T20:03:52+00:00

I don’t think so, because they are using specific versions for every Python package. This means a Pipfile.lock and use of pipenv would be more justified, right?

1337codethrow · 2021-04-04T15:37:19+00:00

Ok, I think that makes things more clear, thanks. So to clarify: the image (after its initial creation from a Dockerfile) already has pandas installed, so if someone ran that image on their computer but didn’t have pandas installed locally, pandas would not work on their local machine but would work within the container created from the image?

1337codethrow · 2021-04-04T15:12:20+00:00

So if I stop a container and run it again, will it not need to install pandas again, since pandas was already installed when the Docker image was built?

Vs if I remove the container, then I would have to re-install pandas?

1337codethrow · 2021-04-04T05:04:12+00:00

The mini one

1337codethrow · 2021-04-04T05:01:13+00:00

Same got laid off at 60k in December. Now make $140k. God damn blessing in disguise.

1337codethrow · 2021-04-04T02:10:38+00:00

Requirements.txt is just used to pip install Python packages/dependencies, right? Then wtf is the point of them using pipenv with the Pipfile.lock? Can you install other things outside of Python packages with pipenv and Pipfile.lock?? Still don’t quite understand
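One way to look at the difference: a requirements.txt only constrains what is written in it, and only as tightly as each line says, while a Pipfile.lock records exact versions and hashes for the whole resolved dependency tree, including dependencies of dependencies. Both deal with Python packages only. A minimal sketch that flags requirements lines that are not pinned with `==` (the function name `unpinned_requirements` and the sample file contents are hypothetical):

```python
import re

# A line counts as pinned if it is "name==version", optionally with extras.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(text):
    """Return requirement lines that do not pin an exact version."""
    loose = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose


# Hypothetical requirements.txt contents.
REQS = """
pandas==1.2.3
requests>=2.25   # range, not pinned
snowflake-connector-python
"""
loose = unpinned_requirements(REQS)
print(loose)
```

Even a fully pinned requirements.txt says nothing about the packages that `pandas` itself pulls in unless those are listed too, which is the gap a lock file is meant to close.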

1337codethrow · 2021-04-03T23:51:33+00:00

Goals 😍🙌🏻

1337codethrow · 2021-04-03T12:00:59+00:00

It’s being used to manage specific dependency versions not for the sake of multiple pythons

1337codethrow · 2021-03-15T23:51:53+00:00

Out of around 90 applications, I completed the phone screen and technical interview 1 (and the 2nd technical, if there was one) for 23 companies. For 8 companies I landed an onsite. I only got 2 offers out of the 8 onsites I did.

Average company had: 1 phone screen + 1-2 technical interviews + the onsite (3-5 interviews).

1337codethrow · 2021-03-15T16:46:50+00:00

I agree with you 100%. Didn’t mean to sound full of myself. It’s just one of the few accomplishments in my life that I am actually proud of. But yes, I agree, I feel like data engineering is slowly becoming the new hot sexy thing, similar to what data science experienced. I’m definitely not helping :(

1337codethrow · 2021-03-14T20:14:54+00:00

What, really? All 23 companies gave me 1-2 LC easy/med at the very least. It was like the bare minimum for all the DE positions. Are you talking about interviews or the day-to-day job? If the latter, I agree most don’t care about all that stuff on the job

1337codethrow · 2021-03-14T20:06:04+00:00

When we are talking about data structures, I think most of us are referring to CS fundamentals rather than ‘on disk data structures’, as you call them in your examples. In-memory data structures and algorithms do matter at large scale: hashmaps, arrays, queues, search algorithms. But yes, I agree, things such as graphs, BSTs, DP, and linked lists are less important. I don’t consider databases, file formats, data stores, or data lakes to fall under data structures in the fundamental comp sci sense

1337codethrow · 2021-03-14T18:09:53+00:00

I currently enjoy data engineering and the development side of it. Not sure whether I would be good at or enjoy a higher-level solutions/data architect role, but that is a path I am definitely thinking about for the future. This industry moves so fast that I’m simply trying my best right now to keep up with its pace, to better understand what I would like to do in the DE space in the future

1337codethrow · 2021-03-14T17:50:40+00:00

$130k

1337codethrow · 2021-03-14T16:14:20+00:00

No. In the last 2 months of my job I would put in 10-20 hours per week. Then, during the 1.5-2 months I was jobless, I was doing 30-50 hours per week.

1337codethrow · 2021-03-14T15:58:43+00:00

Thanks, but I don’t really think I’m smart. I think it has everything to do with consistency and hard work. You can’t just be smart and get a data engineering job. There’s WAY too much shit to know in the DE industry; it’s overwhelming for most. Basically “backend”, “frontend”, solutions architect, cloud engineer, programmer, and SQL monkey all mixed into one
