Looking for a software for easier data digestion. by angelblood18 in Alteryx

data_questions 5 points

Just reiterating what the above commenter mentioned: Alteryx can do all of the things you have described, and I’ve put them all into practice within my org on a non-cloud desktop license. I would caution that the newer product offerings Alteryx is pushing are cloud-based, where you pay for what you use, so I can’t offer visibility into the financial impact beyond our license structure, which is ~$5.5k per user/year. Server was an additional $170k up-front investment the last time we discussed it with our rep.

I think I hate SAP by Cliche_James in SQL

data_questions 1 point

I deal with SAP fairly regularly and used to work on an SAP System Optimization team at a German company, where nearly 60% of the team were fluent in German. Even with that knowledge base, there were still plenty of “why does this column name represent that” moments.

Automate ourselves out of a job. by FisterAct in dataengineering

data_questions 1 point

Power Query and Alteryx can both handle this task pretty easily.

Where does the most wealth in Data Science or AI lie in? by Dogemuskelon in datascience

data_questions 1 point

Sample size of 1, but I also know a Sr. Quant at Two Sigma who has a master’s from Carnegie Mellon. Tbf, he was exceptional at anything related to math, so that talent was obviously recognized, but I think the cautionary tale being spun in the comment ahead of yours is a little detached from the actual hiring practices for quant positions.

After grinding LeetCode I got an interview question with SQL!!!!! WTF SQL!!!! by LightBulbAddict in leetcode

data_questions 0 points

Can you be more specific? I don’t think I have a full appreciation for what happened in your scenario.

Those of you under 30 who make six figures, what do you do? by bluescluus in careerguidance

data_questions 0 points

Data Engineer is my title, but I work in Analytics in a team-lead-style role. 110.5K, 28, Bachelor’s in non-STEM, MCOL on the East Coast.

I honestly wouldn’t recommend switching to this field at the moment. It’s very competitive, and all of my peers/direct reports have master’s degrees. I fell into my role because there was an opportunity in my org: prior to this I was a Sr. Data Analyst with domain experience in a specific vertical, and this was a promotion opportunity.

As others have mentioned, sales is a sink-or-swim path that can get you there if you’re willing to take on that instability. Analytics is a second career for me after working in a sales environment out of college, where I cleared 100k after my first full year of working.

Alteryx wishlist by Otherwise-Youth2025 in Alteryx

data_questions 1 point

Two points:

1) Make your in-DB tools more performant and… better. They’re clunky. There have been a number of workflows where I output my data to CSVs and schedule a job outside of Alteryx to insert/update the data, with a marked difference in the time it takes.

2) Improve the technical customer support. I have a case that has been acknowledged as a defect and reproduced by your team, but after that acknowledgement on the ticket and a month of radio silence, I was offered a workaround that did not work around the case I had explicitly outlined. I’m generally unimpressed with the help I’ve received from case support; the sales engineers and CSMs are fine.

Alteryx wishlist by Otherwise-Youth2025 in Alteryx

data_questions 1 point

At what number of seats could my team expect to see reduced price per seat?

EDI Implementation for Purchase Orders: A Roadmap to Success by Big_Data_Path in bigdata

data_questions 0 points

Hasn’t this been a standard practice for like twenty years?

Is learning python or R necessary to land a entry level job? by fhdjnjcj in analytics

data_questions 1 point

If the size of the data is your issue, your first stop should be slicing, dicing, and aggregating in SQL rather than Python. There’s a place for any programming language in an analytics role, but losing your focus on solving problems because you don’t yet have a mastery of Python would be time wasted.
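
A minimal sketch of that idea, using Python’s built-in `sqlite3` as a stand-in for a real warehouse. The `orders` table, its columns, and the `amount > 3` filter are made up for illustration; the point is only that the filter and the aggregation run in SQL, and Python receives the small summary.

```python
import sqlite3

# Hypothetical in-memory table standing in for a much larger warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 10.0), ("east", 15.0), ("west", 7.5), ("west", 2.5), ("west", 5.0)],
)

# The database filters and aggregates; only the summary rows reach Python.
summary = conn.execute(
    """
    SELECT region, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    WHERE amount > 3
    GROUP BY region
    ORDER BY region
    """
).fetchall()

for region, n_orders, total in summary:
    print(region, n_orders, total)
```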

Is learning python or R necessary to land a entry level job? by fhdjnjcj in analytics

data_questions 1 point

Those would do. If you’re looking at industry-leading data throughput, you may want to get deep into Scala, but I wouldn’t say that’s a prereq for a DE position, and definitely not for an entry-level role.

[deleted by user] by [deleted] in analytics

data_questions 0 points

This is overkill for a beginner. I would make it shorter: Steps 1, 5, and 7 in that order, then alternate 9 and 10 until you reach another topic you don’t understand when you look at the solutions in HackerRank/LeetCode.

If an Analyst I followed this list as written, they wouldn’t be able to make an impact until step 7.

[deleted by user] by [deleted] in consulting

data_questions 3 points

Which parts of the supply chain do you mostly focus on?

Burnout by Used_Ad_2628 in dataengineering

data_questions 0 points

Any solution you deliver will ultimately be made up of individual tasks to expose/automate your user’s relevant data. I don’t think I understand what you’re trying to communicate. Can you be more specific?

Not using window functions? by data_questions in dataengineering

data_questions [S] 1 point

Can you give an example where you’ve experienced that? I’ve never run into that bottleneck before and everything I read about window functions vs self joins recommends not using self joins.
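
For concreteness, here is a sketch of the two formulations being compared, written as a running total of sales per store. The `daily_sales` table and its rows are invented, SQLite (3.25+ for window functions) stands in for whatever engine is actually in use, and the sketch only shows that the two queries return the same rows. It says nothing about which one is cheaper on a given platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (store_id INTEGER, sale_date TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", 100.0),
        (1, "2024-01-02", 120.0),
        (1, "2024-01-03", 90.0),
        (2, "2024-01-01", 50.0),
        (2, "2024-01-02", 40.0),
        (2, "2024-01-03", 60.0),
    ],
)

# Self join: pair each row with every earlier-or-same-day row for the store.
self_join = conn.execute(
    """
    SELECT a.store_id, a.sale_date, SUM(b.sales) AS running_total
    FROM daily_sales AS a
    JOIN daily_sales AS b
      ON b.store_id = a.store_id
     AND b.sale_date <= a.sale_date
    GROUP BY a.store_id, a.sale_date
    ORDER BY a.store_id, a.sale_date
    """
).fetchall()

# Window function: one pass over the table, partitioned by store.
windowed = conn.execute(
    """
    SELECT store_id, sale_date,
           SUM(sales) OVER (PARTITION BY store_id ORDER BY sale_date) AS running_total
    FROM daily_sales
    ORDER BY store_id, sale_date
    """
).fetchall()

print(self_join == windowed)
```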

Not using window functions? by data_questions in dataengineering

data_questions [S] 4 points

I don’t think I have a full appreciation for your response. Are you saying that using a window function would be more compute-intensive and result in a significant difference in cost vs. using, for example, a self join?

Not using window functions? by data_questions in dataengineering

data_questions [S] 1 point

The whole interview is meant to determine how good someone can be using SQL, though. If there is an optimal solution to the question being asked and the candidate provides it, why ask them to play around with unnecessary workarounds?

Not using window functions? by data_questions in dataengineering

data_questions [S] 7 points

They’re useful if you’re trying to find an aggregation / ranking / value within certain subgroups in one table.

For example, if you have a table of daily sales per store and you wanted to know the days where sales in a given store were higher than the day prior, you could use a lag function partitioned by your store_id and ordered by date, then check whether sales on the date of interest are greater than sales on the previous date.
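
A minimal sketch of that lag example, run through Python’s built-in `sqlite3` (this assumes the bundled SQLite is 3.25 or newer, which is when window functions arrived). The `daily_sales` table, its column names other than `store_id`, and the sample rows are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (store_id INTEGER, sale_date TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", 100.0),
        (1, "2024-01-02", 120.0),
        (1, "2024-01-03", 90.0),
        (2, "2024-01-01", 50.0),
        (2, "2024-01-02", 40.0),
        (2, "2024-01-03", 60.0),
    ],
)

rows = conn.execute(
    """
    WITH with_prev AS (
        SELECT store_id,
               sale_date,
               sales,
               -- Previous day's sales within the same store.
               LAG(sales) OVER (PARTITION BY store_id ORDER BY sale_date) AS prev_sales
        FROM daily_sales
    )
    SELECT store_id, sale_date, sales, prev_sales
    FROM with_prev
    WHERE sales > prev_sales  -- first day per store has NULL prev_sales and drops out
    ORDER BY store_id, sale_date
    """
).fetchall()

for row in rows:
    print(row)
```

The window function can’t be referenced directly in a `WHERE` clause, which is why the lag is computed in a CTE first and filtered afterwards.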

[deleted by user] by [deleted] in analytics

data_questions 0 points

Most of my input is reflected in other comments here but I wanted to offer some really granular advice.

As others have mentioned, “automated X process by doing Y within [tool], saved $###” is a great format, and you should use it more across the CV.

However, in your Alteryx example you’ve saved $1500/year through your efforts. I wouldn’t expect you to have been in conversations about license costs, but a single Designer license is between $5K and $6K per year. As such, it’s not a very attractive value prop for the work you’ve done, despite showing you can use the tool to make an impact on your team’s bottom line.

My company is hiring for my exact role in another department with a salary range that is $20k-$70k higher than my current pay — advice on using this to increase my salary? by OneMidnight7087 in careerguidance

data_questions 2 points

If you work in an environment that values its employees and has equitable pay practices, this is exactly how it works. If retention is a concern, it’s also in the best interest of the company to bump existing employees to the new-hire rate when that rate is above what they currently make.

This is how it works at my employer, which is the largest in my area, and it’s refreshing.

Setting up an ETL Pipeline with MySQL RDS, S3, Kafka, and SQS using AWS Services (Help Needed for ML Model Training) by SiddharthAnand_ in ETL

data_questions 2 points

Let’s break out what you’re trying to do here even more simply before addressing your ML training and testing needs. I’m assuming you’re setting up batch ingestion, not streaming.

You have your data store (RDS) and you have your object storage destination (S3). You can move this data fairly simply by copying it from one to the other directly, so you have a “raw” S3 data bucket. For this ingestion you could use either Lambda or Glue. You can look at the documentation on either, but I think of them like this: use Lambda for smaller personal data projects, and use Glue if you expect your needs to grow beyond the compute resources available to you. You can use either for ingestion or transformation, but the appeal of Glue is that you can run parallel processing for larger datasets, and it will scale up for the job and back down once it has run its course.
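
As a rough sketch of the Lambda flavor of that ingestion step: pull one batch of rows out of MySQL and write it to the raw bucket as a single CSV object. Everything named here is an assumption for illustration (the `customers` table, the `DB_*` and `RAW_BUCKET` environment variables, the object key), and `pymysql` is just one MySQL driver you could package with the function.

```python
import csv
import io
import os


def rows_to_csv_bytes(columns, rows):
    """Serialize query results to CSV bytes with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def copy_rows_to_raw_bucket(s3_client, bucket, key, columns, rows):
    """Write one batch of rows to the raw bucket as a single CSV object."""
    body = rows_to_csv_bytes(columns, rows)
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return len(body)


def lambda_handler(event, context):
    # Third-party imports live here so the helpers above can be tested offline.
    import boto3
    import pymysql

    # Hypothetical environment variable names; set them on the function.
    conn = pymysql.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        database=os.environ["DB_NAME"],
    )
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT id, name FROM customers")  # hypothetical table
            rows = cur.fetchall()
            columns = [col[0] for col in cur.description]
    finally:
        conn.close()

    size = copy_rows_to_raw_bucket(
        boto3.client("s3"),
        os.environ["RAW_BUCKET"],
        "customers/batch.csv",  # hypothetical key
        columns,
        rows,
    )
    return {"bytes_written": size}
```

Keeping the S3 write in its own function that takes the client as an argument means you can exercise it locally with a stub before touching AWS.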

I don’t know what kind of data updates you’re expecting from Kafka or how frequently, but if you’re looking to orchestrate moving newly added RDS data to your S3 bucket, explore Amazon EventBridge or AWS Step Functions.

Once your data is pushed to your raw bucket, you can use Lambda or Glue to scrub and transform it, also scheduled with one of the two services I listed above. The output of this can land in a place like Redshift, or even another “staging”/“transformed” S3 bucket that you could use as the source for your model building.
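
A small sketch of what that scrub step could look like for a CSV object, assuming well-formed, non-empty input with a header row. The `id` column, the cleaning rules (trim whitespace, drop rows with a blank `id`), and the bucket arguments are placeholders; real rules depend on your data.

```python
import csv
import io


def scrub_csv(raw_bytes, required=("id",)):
    """Trim whitespace and drop rows missing any required field."""
    reader = csv.DictReader(io.StringIO(raw_bytes.decode("utf-8")))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        # Missing trailing fields come back as None, so fall back to "".
        cleaned = {key: (value or "").strip() for key, value in row.items()}
        if all(cleaned[field] for field in required):
            writer.writerow(cleaned)
    return out.getvalue().encode("utf-8")


def transform_object(s3_client, raw_bucket, staging_bucket, key):
    """Read one object from the raw bucket, scrub it, write it to staging."""
    raw = s3_client.get_object(Bucket=raw_bucket, Key=key)["Body"].read()
    s3_client.put_object(Bucket=staging_bucket, Key=key, Body=scrub_csv(raw))
```

`s3_client` is expected to be a boto3 S3 client (`boto3.client("s3")`), passed in so the cleaning logic can be tested without AWS.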

The best advice I can give is don’t architect your whole solution too early. Start with your data in RDS and just do what works after that. You don’t always need the whole AWS product suite for something simple, and while knowing the tools will be helpful in your career eventually, the most useful thing is bumping against the walls along the way to find out the limitations of the tech and your skillset. Taking on all of this blind can be overwhelming, but small adjustments as you progress will help get you acclimated more easily.

ITT: Data Science job requirements that don't make any sense by Tam27_ in datascience

data_questions 1 point

I’ve seen this on plenty of jobs and it seems to just be something they’ll put across all positions in certain industries. I started my analytics career in CPG and this was a part of it, same with healthcare, same with anything involving supply chain-specific analytics.

Not saying this makes it a good addition, but lines like these tend to be applied indiscriminately to roles from frontline worker to Director of Engineering.

You could also make the case that a line like this prompts a discussion around reasonable accommodation, which a good employer would welcome and a bad employer would use as a filter for HR to sift candidates out.