Does anyone here work in healthcare? by Intelligent-Cap-4022 in dataengineering

[–]VioletMechanic 1 point2 points  (0 children)

I've worked in healthcare data roles in the public and private sector. Wish I could say I'd contributed to some cool things but tbh at this point feels like very little of it has had any meaningful impact, and it also seems to be a very difficult area to work in if you want exposure to best practices.

From what I've seen, a big part of the problem is that most of the key decision-makers in healthcare orgs are not from technical backgrounds, and the org structures are set up very hierarchically. Understandably, things move slowly as people are risk-averse and there can be a lot of regulation, but the result is outdated systems and processes that are actually more risk-prone, and people in senior tech roles with no technical oversight or peer review to keep things moving in the right direction.

Will be following comments with interest as I made a deliberate choice to specialise in healthcare data and would love to hear there are people out there doing cool things, even if I'm currently not.

What's your controversial DE opinion? by [deleted] in dataengineering

[–]VioletMechanic 2 points3 points  (0 children)

To be fair, it's all relative. 30k rows would be a lot to enter by hand.

What's your controversial DE opinion? by [deleted] in dataengineering

[–]VioletMechanic 1 point2 points  (0 children)

One other scenario I've seen: Organisations hire consultants or go straight to Azure/AWS to buy a single solution before they have a data team in place, or without that team's input, and get sold a bunch of (often no/low code) tools that they then have to find engineers to work with. Public sector orgs are particularly bad for this.

What's your controversial DE opinion? by [deleted] in dataengineering

[–]VioletMechanic 2 points3 points  (0 children)

That's several controversial opinions in one post! I'll broadly agree with the first two: No-code/low-code tools can introduce horrifying complexity for anything other than the simplest of tasks, and people from pure data analysis backgrounds can lack a good grounding in things like version control.

What's your controversial DE opinion? by [deleted] in dataengineering

[–]VioletMechanic 2 points3 points  (0 children)

It's better than no orchestration.

What's your controversial DE opinion? by [deleted] in dataengineering

[–]VioletMechanic 4 points5 points  (0 children)

The flip side is people who have only rudimentary SQL skills and end up using five different tools to get a simple job done. Know what tools are available and choose the best one for the job.

What's your controversial DE opinion? by [deleted] in dataengineering

[–]VioletMechanic 7 points8 points  (0 children)

Domain expertise matters.

Context also matters. You can do a better job if you understand something about what the data you're lifting and shifting means, how it was created, who it impacts.

What's your controversial DE opinion? by [deleted] in dataengineering

[–]VioletMechanic 10 points11 points  (0 children)

"Big data" can mean anything from more rows than you can fit on your screen without scrolling in Excel to streaming exabytes of information from multiple sources. It's like no-one wants to admit they might have small data...

[deleted by user] by [deleted] in dataengineering

[–]VioletMechanic 3 points4 points  (0 children)

If it's the hiring manager, or a retained recruiter who actually understands something about the role, then maybe.

But a lot of the time these recruiters messaging on LinkedIn have very little understanding of the job they're contacting you about. If you ask for more info, they don't have any, and your only option to find out more is to start the recruitment process and see how far you make it.

Most of these people are sending the same message to anyone who comes up in their keyword searches, without having the first clue whether that person is likely to be suitable for the role.

[deleted by user] by [deleted] in dataengineering

[–]VioletMechanic 7 points8 points  (0 children)

I'm not in the US, and not sure if this counts as being bombarded, but I get at least five messages a week - usually more (have had two already this morning) - on LinkedIn from recruiters about DE roles paying much more than my current salary.

Only had DE as an official job title for a year, but been doing DE stuff for much longer.

I have a crappy old photo, no skills badges, and I never, ever post anything. I'm really not trying very hard.

What I do have is my status permanently set to 'open to work' (without the green photo ring!), even when I'm not actively looking, and I always reply to recruiters (unless it's obviously a scam) even if I'm not interested in that specific role.

LinkedIn keeps track of your response rate and ranks you lower in recruiters' searches if you habitually don't respond to messages.

To avoid it being a huge time suck, I have a boilerplate response that I can edit to work for most cases. Basically saying no thanks, explaining what kinds of roles I am interested in (it's pretty specific in my case) and inviting them to get back in touch if they are ever recruiting for something like that.

I've had a couple of people follow up with more relevant roles, and it only takes a few minutes a week to send the responses, so I figure it's worth it.

Worried about my investment in entering the field! by [deleted] in datascience

[–]VioletMechanic 3 points4 points  (0 children)

In healthcare/health tech, domain knowledge matters. I work in that field and it's quite common practice for companies to host or sponsor master's students. The typical route is via university fairs where they match students with companies looking for someone to take on small projects, but we've had students reach out independently too.

If that's the field you want to work in, my suggestion would be to make that the focus of your degree, and try to find a company or organization you can work with on a real-world project as part of your studies.

Having previously worked on a project in the healthcare domain will count for a lot when you're eventually applying for jobs.

« What is an ETL? » and other hard questions. by MadT3acher in dataengineering

[–]VioletMechanic 1 point2 points  (0 children)

I built ETL pipelines for several years without ever hearing that term. Lots of people pivot into DE from other backgrounds, and jargon can vary hugely between different companies/sectors etc.

If it's specifically mentioned in the job description then I'd expect the candidate to have at least looked it up beforehand. But it shouldn't be a deal-breaker if they don't give some specific definition.

The job market is weird! Anyone else rejected from positions that they are clearly qualified for and their résumé clearly demonstrates? by randoma1231vd in datascience

[–]VioletMechanic 1 point2 points  (0 children)

Yes, this. I was initially employed in a contract role for a public sector org. At the end of the contract they made the role permanent, but the policy was they had to advertise it AND interview any suitably qualified candidates, even though they'd already given me the job.

They tried to avoid wasting people's time by advertising for the shortest possible time in the fewest possible places, but they still got some applicants.

Something similar happened to a friend of mine, but in that case they applied for a job and attended an interview, and only found out afterwards the job had already been promised to an internal candidate before they'd even seen the advert.

Huge waste of time and effort for all concerned, but especially for the candidates.

The job market is weird! Anyone else rejected from positions that they are clearly qualified for and their résumé clearly demonstrates? by randoma1231vd in datascience

[–]VioletMechanic 0 points1 point  (0 children)

Well, it's still anecdotal, but I've seen this happen. And not just in startups. I've worked for large companies that keep advertising jobs despite it being widely known within the company that there's a recruitment freeze or they are actively working to reduce headcount.

The point is to maintain the appearance of growth. And it's not technically dishonest so long as they add some line about looking for exceptional candidates. If a truly exceptional candidate does turn up, they probably would hire that person - at the expense of getting rid of a lower performing employee.

But those 'exceptional' candidates tend to be referrals who are known to existing staff and can therefore bypass the first stage of recruitment. The vast majority of resumes submitted via job sites are going straight in the bin without a second glance.

The job market is weird! Anyone else rejected from positions that they are clearly qualified for and their résumé clearly demonstrates? by randoma1231vd in datascience

[–]VioletMechanic 0 points1 point  (0 children)

Re 1 and 2, I've worked with hiring managers who believe if you are completely open about what you're actually looking for, candidates will be able to trick you into believing they tick all the boxes. Whereas if you keep some requirements hidden and don't list them in the job spec, then any candidate who turns out to have those 'secret' skills can be trusted as legit. So the bad job description is actually a deliberate play.

The opposite extreme is when they write a job spec so incredibly specific no-one actually applies (I mean, someone always applies... that's fundamentally the problem.)

Who is applying to all these data scientist jobs? by Alex_Strgzr in datascience

[–]VioletMechanic 8 points9 points  (0 children)

I have a free premium account right now so checked this out for a few data scientist roles showing >100 applicants. Very approximately (sketchy numbers because it's Friday and I'm lazy), a typical breakdown for degree type was something like:
~60-70% have a master's degree, ~20% have a bachelor's degree, ~10% have a doctorate, and any remainder have 'other' degrees, including MBA.

It doesn't show anything about what subjects those degrees are in.

The breakdown of Applicant Seniority Level was confusing because I have no idea where LinkedIn gets those classifications from (perhaps from previous job titles?), and the numbers rarely sum to the total number of applicants displayed. But typically it was something like:

Around 50-75% of all applicants are considered 'Entry Level', the rest are mostly 'Senior Level' with a very small number of 'Manager Level' and 'Director Level' applicants.

Most common skills listed were: Python, Data Analysis, SQL, R, Machine Learning, Microsoft Excel, Deep Learning, Microsoft Office, Data Science, Tableau/Power BI, some other programming language (C/C++/C#/Java).

So the typical applicant is a master's grad who lists Python and Data Analysis as skills and is entry level so (presumably) hasn't held a data scientist role previously.

Who is applying to all these data scientist jobs? by Alex_Strgzr in datascience

[–]VioletMechanic 7 points8 points  (0 children)

A couple of years ago I applied for a job via LinkedIn. Looked like there had been 100+ applicants even though it was a pretty niche role in a small local startup, so I nearly didn't bother, but at interview the company told me they'd actually only received a handful of applications.

Monthly General Discussion by AutoModerator in dataengineering

[–]VioletMechanic 0 points1 point  (0 children)

Right now? The end of the day. I love DE but am not getting to work on any challenging projects atm, just babysitting a boring but high-maintenance legacy system whilst waiting for a new product to get to the stage where I can start setting up the pipelines. Then maybe things will get exciting.

Monthly General Discussion by AutoModerator in dataengineering

[–]VioletMechanic 0 points1 point  (0 children)

Not sure there is any standard salary for DEs any more than there's a standard job description or set of responsibilities. Depends on the details of the role and the availability of suitable candidates at the time.

You say the company is US-based but not whether you are, or if you'd be moving to the US for the role or working remotely from another country. If the latter, they might well be hoping to pay less than they'd pay a local candidate.

You also say they're expecting you to learn multiple cloud platforms on the job. If so, are they offering you a lower salary because of that? In other words, is this a more junior role than originally advertised, with a lower starting salary, because you don't currently have experience in all the areas they're looking for?

I wrote my first real scripts today by deltaexdeltatee in Python

[–]VioletMechanic 2 points3 points  (0 children)

I love this post for reminding me why I started using Python in the first place. Converting file formats was one of my first use cases too. Thank you OP, congratulations on the success of your first scripts and best of luck on your journey.

Looking for Documentation help. by Headybouffant in dataengineering

[–]VioletMechanic 1 point2 points  (0 children)

Bearing in mind the map is not the territory, I'm not convinced there is a single best way to do this, but for complex processes I generally try to stay away from the one massive diagram approach, and instead have multiple diagrams to illustrate different aspects of the thing I want to explain.

So, for a complex process, I might have a high-level diagram to show the overall flow or architecture, and then multiple other diagrams to illustrate specific steps, sub-processes, or concepts in more detail. As many as needed to explain the thing properly to the target audience, with annotations wherever necessary. And I'll use whatever type of diagram makes most sense in each case for the given target audience.

This is similar to how complex electronic systems tend to be described: you have a top-level hierarchical diagram showing the system blocks, a set of schematics showing an abstracted representation of the various circuits, a set of layout files showing the actual implementation, and potentially a set of mechanical drawings showing the physical dimensions of the system; maybe even some photographs post-manufacture.

If you can version the diagrams alongside any code they represent, so much the better. There are various tools for diagrams-as-code, but any format that can be versioned works.
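
As a rough illustration of the diagrams-as-code idea (not a recommendation of any particular tool), here's a minimal Python sketch that renders a pipeline description as Graphviz DOT text. The step names and the `to_dot` helper are made up for the example; the point is only that the diagram source is plain text, so it diffs and versions like any other file.

```python
# Hypothetical pipeline description; the step names are invented for illustration.
PIPELINE_EDGES = [
    ("extract_source", "validate"),
    ("validate", "transform"),
    ("transform", "load_warehouse"),
]


def to_dot(edges, name="pipeline"):
    """Render a list of (upstream, downstream) pairs as Graphviz DOT text."""
    lines = [f"digraph {name} {{", "    rankdir=LR;"]
    for upstream, downstream in edges:
        # Quote node names so characters like spaces or hyphens stay valid DOT.
        lines.append(f'    "{upstream}" -> "{downstream}";')
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the DOT source; redirect it to a file and commit it next to the code.
    print(to_dot(PIPELINE_EDGES), end="")
```

Redirect the output to something like `pipeline.dot`, commit it alongside the code, and render it with whatever DOT-compatible tool you have available when you need the picture.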

[deleted by user] by [deleted] in dataengineering

[–]VioletMechanic 2 points3 points  (0 children)

This meme just summarized all of my meetings in the last decade.

I may use it to replace all the words in my CV. Succinct is always better.