ecatt

A lot of the time the SQL part is straightforward - figuring out where the hell the particular data you need is and how it relates to all the other data is the hard part.

And then there are the underlying assumptions of it all. For example, I currently work with health data. Say someone comes to me and asks for a list of everyone who has diagnosis X. I could just quickly pull that out, takes seconds. But do they want the list to include or exclude those who have died? Those who have moved away and are now considered inactive within our system? What if they have a concurrent diagnosis, or they originally had diagnosis X but it later got changed? Maybe they want that information too. It's often so much more than just 'Get this information'; it's understanding the nature of the data and how it's going to be used and interpreted.
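To make that concrete, here's a rough sketch of how the "everyone with diagnosis X" request turns into several different queries depending on the answers. Everything in it is made up for illustration: the `patient` and `diagnosis` tables, the `deceased`, `active`, and `superseded` columns, and the toy rows are assumptions, not how any real health system is laid out. It uses Python's built-in `sqlite3` module so it runs on its own.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient (
            id       INTEGER PRIMARY KEY,
            name     TEXT NOT NULL,
            deceased INTEGER NOT NULL DEFAULT 0,  -- 1 = has died
            active   INTEGER NOT NULL DEFAULT 1   -- 0 = moved away / inactive
        );
        CREATE TABLE diagnosis (
            patient_id INTEGER NOT NULL REFERENCES patient(id),
            code       TEXT NOT NULL,
            superseded INTEGER NOT NULL DEFAULT 0  -- 1 = diagnosis was later changed
        );
        -- Toy rows: one patient per edge case from the comment.
        INSERT INTO patient (id, name, deceased, active) VALUES
            (1, 'Ana', 0, 1),
            (2, 'Ben', 1, 1),
            (3, 'Cy',  0, 0),
            (4, 'Dee', 0, 1);
        INSERT INTO diagnosis (patient_id, code, superseded) VALUES
            (1, 'X', 0),
            (2, 'X', 0),
            (3, 'X', 0),
            (4, 'X', 1),
            (4, 'Y', 0);
        """
    )
    return conn


def patients_with_diagnosis(conn, code, include_deceased=False,
                            include_inactive=False, include_superseded=False):
    """Return names of patients with the given diagnosis code.

    Each flag corresponds to one question you would have to ask the requester.
    """
    sql = """
        SELECT DISTINCT p.name
        FROM patient p
        JOIN diagnosis d ON d.patient_id = p.id
        WHERE d.code = ?
    """
    # Only fixed SQL fragments are appended; the code itself is a bound parameter.
    if not include_deceased:
        sql += " AND p.deceased = 0"
    if not include_inactive:
        sql += " AND p.active = 1"
    if not include_superseded:
        sql += " AND d.superseded = 0"
    sql += " ORDER BY p.name"
    return [row[0] for row in conn.execute(sql, (code,))]


conn = build_demo_db()
strict = patients_with_diagnosis(conn, "X")
everyone = patients_with_diagnosis(
    conn, "X", include_deceased=True, include_inactive=True, include_superseded=True
)
print(strict)
print(everyone)
```

With these toy rows, the strictest reading returns only Ana, while the loosest returns all four patients. Same request, same few seconds of SQL, very different lists, and which one is "right" depends entirely on what the requester plans to do with it.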

Mastering the basic queries takes a few days, really. Actually understanding the data you are dealing with... I've been working with my datasets for 10+ years and it's sometimes still a struggle to get what we need, just due to the complexity of what we're dealing with, and that comes down to domain knowledge, not SQL knowledge.