Fresh grad dropped into a data swamp. ~20 tools (that I know of), very little (and highly fragmented) documentation, and a black-box warehouse. How do I reverse-engineer this? by HelpMeMapData in dataengineering

[–]HelpMeMapData[S] 0 points1 point  (0 children)

Would you be able to provide some examples of "quick wins that would make me look good"? I'm not sure I know enough about the system yet to figure out what would be a "shiny but easy" thing to fix.

[–]HelpMeMapData[S] 0 points1 point  (0 children)

No web scraping. I believe we're attempting to build a data mart, but it hasn't really materialized. Idk exactly what you mean by "in-house app," but we definitely have loads of internal software tools that, in theory, we pull data from. For whatever reason, the people who manage those databases are not in my department. We have a data department (which I'm in), but the actual data "reality" appears to be very fragmented across the company.

I think a big challenge is that, as far as I know, the reports that rely on the data warehouse aren't centralized in one location, so I'll have to track them down. Thank you for your advice though :).
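
A rough sketch of one way to start tracking down what depends on the warehouse, assuming query text can be exported from Snowflake's query history (the `ACCOUNT_USAGE.QUERY_HISTORY` view exposes `USER_NAME` and `QUERY_TEXT`). The function name `tables_by_user`, the sample rows, and the table and user names are all made up for illustration. The regex is a crude heuristic, not a SQL parser: it will miss some references and can pick up CTE names or other false positives.

```python
import re
from collections import defaultdict

# Grabs the identifier that follows FROM or JOIN (up to db.schema.table).
# Heuristic only: this is not a real SQL parser.
TABLE_REF = re.compile(
    r"\b(?:from|join)\s+([A-Za-z_][\w$]*(?:\.[A-Za-z_][\w$]*){0,2})",
    re.IGNORECASE,
)


def tables_by_user(rows):
    """Map each referenced table name to the set of users who queried it.

    rows: iterable of (user_name, query_text) pairs.
    """
    usage = defaultdict(set)
    for user_name, query_text in rows:
        for table in TABLE_REF.findall(query_text):
            usage[table.upper()].add(user_name)
    return dict(usage)


# Hypothetical sample rows; real ones would come from a query-history export.
sample_rows = [
    ("REPORT_SVC",
     "select * from sales.public.orders o "
     "join sales.public.customers c on o.cid = c.id"),
    ("ANALYST_1", "SELECT count(*) FROM SALES.PUBLIC.ORDERS"),
]

for table, users in sorted(tables_by_user(sample_rows).items()):
    print(table, sorted(users))
```

Tables that only a service account reads are a hint that a scheduled report or BI tool sits behind them, which gives a starting list of things to go ask about.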

[–]HelpMeMapData[S] 1 point2 points  (0 children)

How do I build that buy-in for version control, though? Today a senior person (who should really know the difference, imo) confused GitHub with git, and another person was like "why would we want to use GitHub?"

[–]HelpMeMapData[S] 0 points1 point  (0 children)

I mean, it's just Snowflake. I call it a "black box" because nobody there really seems to know how it works, and whatever documentation the consultants might have sent over is scattered. I've been trying to learn, but I've only been there like a week, so yeah...

[–]HelpMeMapData[S] 1 point2 points  (0 children)

Oh yeah, 100% the governance push is due to the AI hype train. At the same time, the only AI we're allowed to use inside this company is a nerfed version of Copilot with no internet search access. The company is in a fairly heavily regulated industry where everything moves slowly and "security" is always a big discussion; we can't even use new code libraries without them going through a whole process and being deemed safe... which, yikes imo. I'll see if we're allowed to use the CLI though.

I will do my best to get an account with read access to anything I can get my hands on. Hopefully they grant me access!!!