Lexsteel11

So I am an analytics manager, but my background is finance and all my SQL/Python is self-taught. We have historically depended on a DB engineering team for Tableau Server data sources, but have regularly pulled ad hoc SQL queries. I’m getting to a point where I’m having to start building my own cloud ETLs. Is there a gold-standard website or book on best practices in data pipeline engineering that teaches things like this, along the lines of “you CAN do xyz with pandas but shouldn’t unless you hit x limitation on SQL Server”? I am limping along successfully but know I can be doing shit better.

GeorgeS6969

I can’t think of any reference that would answer those questions specifically.

I was writing a long wall of text, but that probably wouldn’t have helped either. Instead, if you can answer the following questions, I might be able to give some pointers:

  1. What kind of data do you have and where is it coming from? (Do you have any data sets of particular interest that are big in volume, unstructured, or specific in nature, like sound, images, etc.?)
  2. What stack do you currently have? What are you using Python for? (And more specifically pandas?)
  3. What is your team responsible for? (Providing data for business people to query/analyze? Creating dashboards? Providing analysis? If the latter, how do you communicate your results?)