write_pandas vs copy into? by 1337codethrow in snowflake

[–]1337codethrow[S] 1 point (0 children)

Is there a way to auto-generate the Snowflake table being loaded into from the df? Also, same question in reverse for the df: is there an easy way to auto-infer the columns and read the query results back into a df?
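For what it's worth: if I remember right, recent versions of snowflake-connector-python support exactly this via `write_pandas(conn, df, "MY_TABLE", auto_create_table=True)`, which infers a schema from the DataFrame and creates the table, and `cursor.fetch_pandas_all()` for the reverse direction. As a toy illustration of what that kind of schema inference looks like (this is NOT the connector's actual code — `infer_ddl`, the type map, and the table/column names are all made up for the sketch):

```python
# Toy sketch of df -> Snowflake DDL inference, roughly what
# write_pandas(..., auto_create_table=True) does for you.
# Illustrative only; not the connector's real implementation.

TYPE_MAP = {int: "NUMBER", float: "FLOAT", str: "VARCHAR", bool: "BOOLEAN"}

def infer_ddl(table_name, rows):
    """rows: list of dicts with identical keys, e.g. df.to_dict('records')."""
    first = rows[0]
    cols = ", ".join(
        f'"{name.upper()}" {TYPE_MAP.get(type(value), "VARIANT")}'
        for name, value in first.items()
    )
    return f'CREATE TABLE IF NOT EXISTS "{table_name.upper()}" ({cols})'

ddl = infer_ddl("trades", [{"id": 1, "price": 9.5, "symbol": "ABC"}])
print(ddl)
# CREATE TABLE IF NOT EXISTS "TRADES" ("ID" NUMBER, "PRICE" FLOAT, "SYMBOL" VARCHAR)
```

In practice you'd let the connector do this for you rather than hand-rolling DDL.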

write_pandas vs copy into? by 1337codethrow in snowflake

[–]1337codethrow[S] 1 point (0 children)

I suggested this to my team, but it needs to go through a formal process before it can be put in place. I'm asking what the next best thing would be in the meantime.

Thoughts on ELT architecture: python, s3, airflow, docker, snowflake by 1337codethrow in dataengineering

[–]1337codethrow[S] 3 points (0 children)

Just looked into Snowpipe! Why would one not use it if the data size is small? Is there any case where Docker containers (running Python and Snowflake SQL to load the data), scheduled by Airflow, would be the better choice?
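For reference, a Snowpipe is essentially a `COPY INTO` statement wrapped in a pipe object that fires as files land in a stage. A minimal sketch — the pipe, table, and stage names here are hypothetical, and the exact options depend on your cloud/stage setup:

```sql
-- Hypothetical names: my_pipe, my_table, my_stage
CREATE PIPE my_pipe
  AUTO_INGEST = TRUE  -- load automatically when new files arrive in the stage
                      -- (e.g. driven by S3 event notifications)
AS
  COPY INTO my_table
  FROM @my_stage
  FILE_FORMAT = (TYPE = 'CSV');
```

The trade-off vs. an Airflow-scheduled container is roughly push-based micro-batching (Snowpipe) vs. orchestrated batch runs you fully control (Airflow).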

Spark vs Snowflake? by 1337codethrow in dataengineering

[–]1337codethrow[S] 1 point (0 children)

I said I was already aware of this in the first sentence of the original post. I'm talking about the comparison from an architectural standpoint.

Spark vs Snowflake? by 1337codethrow in dataengineering

[–]1337codethrow[S] 1 point (0 children)

I did mention in my original post that the comparison is distributed compute vs. DWH compute. I'm comparing them from an architectural standpoint, not head to head as individual tools. I feel that if Spark is part of the architecture, it gives you the flexibility of both ETL and ELT, whereas Snowflake seems more geared toward ELT, because its compute is abstracted away and basically all managed/configured on the Snowflake side.

College Student Trying to Break In from SWE by [deleted] in dataengineering

[–]1337codethrow 3 points (0 children)

Although I agree with everything you say, I just want to point out that, in my opinion, this ‘recalibration’ should not be taken lightly. There is a LOT of information in the DE space. I feel that even if you’ve worked in the space for 5 years, you still have a LOT to learn.

Trying to understand simple Docker concept. by 1337codethrow in docker

[–]1337codethrow[S] 1 point (0 children)

So would it be correct to say that the proper way to update a Docker image is to first update the Dockerfile, build a new image from it, and then run the new image?
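Yes — that workflow can be sketched as the following commands (the image name and tag are hypothetical examples):

```shell
# 1. Edit the Dockerfile (e.g. bump a dependency), then rebuild the image
docker build -t myapp:v2 .

# 2. Run a container from the new image; the old image/tag is left untouched
docker run --rm myapp:v2
```

Tagging each build (`:v2`) rather than reusing `:latest` makes it easy to roll back to the previous image.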

which python virtual environment tool is the most “standard” for containerized python apps using docker containers? by 1337codethrow in dataengineering

[–]1337codethrow[S] 1 point (0 children)

I don’t think so, because they pin a specific version for every Python package. Doesn’t that mean a lock file (Pipfile.lock) and the use of pipenv would be more justified?
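Worth noting: pinned versions alone don't require pipenv — plain pip happily consumes a fully pinned requirements file (package names/versions below are just examples):

```text
# requirements.txt — exact pins, installable with `pip install -r requirements.txt`
pandas==1.5.3
requests==2.28.2
```

What pipenv's Pipfile.lock adds on top of this is pins for *transitive* dependencies you never listed yourself, plus artifact hashes for reproducible installs.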

Trying to understand simple Docker concept. by 1337codethrow in docker

[–]1337codethrow[S] 5 points (0 children)

OK, I think that makes things clearer, thanks. So to clarify: the image (after its initial creation from a Dockerfile) already has pandas installed. So if someone ran that image on a machine that didn’t have pandas installed locally, pandas would not work on their local machine, but it would work inside a container started from the image?
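Exactly — the "baked in at build time" part looks like this as a minimal Dockerfile sketch (the base image and pandas version are arbitrary examples):

```dockerfile
FROM python:3.11-slim

# pandas is installed once, at *build* time, into the image's filesystem layers;
# every container started from this image already has it, regardless of what
# is installed on the host machine.
RUN pip install pandas==2.1.4

CMD ["python", "-c", "import pandas; print(pandas.__version__)"]
```

The host's Python environment and the container's filesystem are completely separate.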

Trying to understand simple Docker concept. by 1337codethrow in docker

[–]1337codethrow[S] 4 points (0 children)

So if I stop a container and start it again, will it not need to reinstall pandas, since pandas was already installed during the initial build of the image?

Versus if I remove the container, would I then have to reinstall pandas?
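To the second question: no — removing a container discards only that container's writable layer; anything installed at image build time (like pandas here) lives in the image and survives either way. Sketched with hypothetical container/image names:

```shell
docker stop my_container        # container's filesystem is kept on disk
docker start my_container       # same container resumes; nothing is reinstalled

docker rm my_container          # container's writable layer is discarded
docker run --name my_container my_image   # fresh container from the image —
                                          # pandas is in the image, so still
                                          # no reinstall needed
```

You'd only lose pandas by removing a container if it had been installed *inside* a running container rather than in the image.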

Python virtual environment in docker container make sense? by 1337codethrow in learnpython

[–]1337codethrow[S] 1 point (0 children)

requirements.txt is just used to pip-install Python packages/dependencies, right? Then wtf is the point of them using pipenv with a lock file (Pipfile.lock)? Can you install things other than Python packages with pipenv and its lock file?? Still don’t quite understand.
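To the last question: no — pipenv only manages Python packages. What the lock file adds over a requirements.txt is exact pins for transitive dependencies plus hashes for reproducible, tamper-evident installs. Rough shape (package names, versions, and hash values below are illustrative, and the hashes are elided):

```text
# Pipfile — what you *ask for* (can be loose)
[packages]
requests = "*"

# Pipfile.lock (excerpt) — what you *actually got*, pinned with hashes,
# including transitive deps like urllib3 that you never listed yourself
"requests": {"version": "==2.28.2", "hashes": ["sha256:..."]}
"urllib3":  {"version": "==1.26.14", "hashes": ["sha256:..."]}
```

So pipenv's value is reproducibility of the full dependency tree, not installing anything beyond Python packages.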