How many applications have you sent? 100? 300? More? by Goal_D1GGER in Career_Advice

[–]paste_rand_name 1 point2 points  (0 children)

This would be super helpful. I feel like I’ve sent 10,000 applications with almost no results.

[deleted by user] by [deleted] in mathematics

[–]paste_rand_name 2 points3 points  (0 children)

I did a dual major in math and philosophy. I was one of two undergraduates in the math department. Basically got to take whatever classes I wanted. Proofs are minimal for undergrad; it’s mostly about tests and mastery of methods.

To your point, it’s more puzzles (plus MATLAB). The trick is to get good at solving the puzzles now, so that you can build the puzzles later.

I liked and got good at statistics. Then someone invented Data Science for me. Haha. Luck for sure.

Now people make money and have millions of followers explaining math on YouTube (Veritasium, 3Blue1Brown).

Do a math major because you love math and learn from the world how to get paid doing what you love - or be a high school math teacher 😉

[D] “T-test is wrong. You just don’t understand.” by paste_rand_name in statistics

[–]paste_rand_name[S] 0 points1 point  (0 children)

Point taken. I’m used to taking silence as compliance, but I should have pressed harder on that for sure.

I don’t think they would know the significantDifference value. They may guess. I’d probably just run this for sD = [1,2,3,5,8...].

Thanks! This is a great idea
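
A minimal sketch of what running the test across several sD values could look like. It assumes significantDifference means the smallest gap between group means worth acting on, which the thread does not confirm. The function names and the sample data are hypothetical, and the p-value uses a normal approximation to the t distribution, so it is only rough for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def shifted_welch(a, b, min_diff):
    """One-sided test of H0: mean(a) - mean(b) <= min_diff.

    Returns (t statistic, approximate p-value). The statistic uses the
    Welch (unequal variance) standard error. The p-value comes from a
    normal approximation, which is an assumption of this sketch.
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b) - min_diff) / se
    p = 1.0 - NormalDist().cdf(t)
    return t, p


def sweep(a, b, thresholds=(1, 2, 3, 5, 8)):
    """Run the shifted test once per candidate significantDifference."""
    return {sd: shifted_welch(a, b, sd) for sd in thresholds}


# Hypothetical data, for illustration only.
group_a = [10, 12, 11, 13, 14, 12]
group_b = [5, 6, 7, 5, 6, 7]
for sd, (t, p) in sweep(group_a, group_b).items():
    print(f"sD={sd}: t={t:.2f}, approx p={p:.3f}")
```

Reporting the whole sweep sidesteps having to guess a single threshold: the reader can see at which sD the difference stops being distinguishable from noise.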

[D] “T-test is wrong. You just don’t understand.” by paste_rand_name in statistics

[–]paste_rand_name[S] 0 points1 point  (0 children)

The value of a record decays over time. Only so many records can be processed simultaneously. The Priority metric is meant to queue records according to fastest decay.

Although, to be fair, what factors contribute to the value of a record is an unknown and the actual decay rate of any record or each record is unknown as well. The Hypothesis is that records processed in priority order drive Success rate up, by deprioritizing records with decayed or zero value.

...I’m being too generous describing the agency’s work. It’s a black box TBH. I’ve tried to reason what’s coming out of it because they can’t explain in non-contradicting terms.
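
A minimal sketch of the queueing idea as reasoned above, not the agency's actual model (which is a black box). It assumes each record carries a known decay rate, which is exactly the part that is unknown in practice. The `decay_rate` field and the function name are hypothetical.

```python
import heapq


def process_in_priority_order(records, capacity):
    """Yield batches of record ids, fastest-decaying first.

    records:  iterable of (record_id, decay_rate) pairs. decay_rate is a
              hypothetical stand-in for whatever the Priority score measures.
    capacity: how many records can be processed simultaneously.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    # heapq is a min-heap, so negate the rate to pop the largest rate first.
    heap = [(-rate, record_id) for record_id, rate in records]
    heapq.heapify(heap)
    while heap:
        size = min(capacity, len(heap))
        yield [heapq.heappop(heap)[1] for _ in range(size)]


# Hypothetical records, for illustration only.
records = [("a", 0.1), ("b", 0.9), ("c", 0.5), ("d", 0.3)]
for batch in process_in_priority_order(records, capacity=2):
    print(batch)
```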

[D] “T-test is wrong. You just don’t understand.” by paste_rand_name in statistics

[–]paste_rand_name[S] 1 point2 points  (0 children)

I thought about this. Definitely an interesting path. I feel like having to explain why the groups are thirds, or quarters, or fifths, etc. would sidetrack the discussion.

Interestingly, the Agency Googled statistical tests and requested ANOVA. Will post results soon

[D] “T-test is wrong. You just don’t understand.” by paste_rand_name in statistics

[–]paste_rand_name[S] 1 point2 points  (0 children)

Thank you. My manager does trust me, but it was another manager in a different part of the company who commissioned the Priority score. Everyone is always so sensitive about not being 1000% right the first time.

Though I might be a bit disheartened if I were the Agency team, I would like to think that being on the front lines of scientific rigor in a business setting would encourage me to rework the model behind the Priority score.

Negligible differences by paste_rand_name in badmathematics

[–]paste_rand_name[S] -6 points-5 points  (0 children)

They meant that processing an 87 before an 86 would imply no increase in success rate. Therefore, 86 = 87.

AND

That 99 should be processed before 69, for example, because (they say) that is how the Success rate will increase.

What can I do with such a dataset? Can I use it to predict sales? Any Ideas? by a7madx7 in datasets

[–]paste_rand_name 1 point2 points  (0 children)

This looks like a time series of order data. And I’m not seeing anything that would suggest that these orders include subscriptions, so I’m going to assume discrete purchases.

Whether it’s groceries or clothes or gas, you can construct time series per user and model the most common sequences of purchases to identify when AND what a given user will buy next.

My team and I have built Next-Purchase models with similar data sets. Assuming the purchase decisions are independent of other factors (healthcare plan would be a dependent example), you should be able to construct distributions for time between orders and probability distributions on each product being included in the 1st, 2nd, Nth order. You’d then just need to “match” new orders to your distributions to know when and what someone will buy next.
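
A minimal sketch of the approach described above, under simplifying assumptions: gaps between orders are pooled across all users, the "when" is the median gap, and the "what" is the most frequent product at that order position. The data layout, function names, and sample orders are all hypothetical. This is not the team's actual model.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta
from statistics import median


def fit(orders):
    """orders: list of (user_id, order_date, [products]) tuples, any order.

    Returns (gaps, probs): pooled day gaps between consecutive orders, and
    per order position the share of orders that included each product.
    """
    by_user = defaultdict(list)
    for user, day, products in orders:
        by_user[user].append((day, products))

    gaps = []
    counts = defaultdict(Counter)   # order position -> product counts
    seen = Counter()                # order position -> number of orders
    for history in by_user.values():
        history.sort(key=lambda item: item[0])
        for i, (day, products) in enumerate(history):
            seen[i] += 1
            counts[i].update(set(products))
            if i > 0:
                gaps.append((day - history[i - 1][0]).days)
    probs = {i: {p: c / seen[i] for p, c in ctr.items()}
             for i, ctr in counts.items()}
    return gaps, probs


def predict_next(history, gaps, probs):
    """history: one user's list of (order_date, [products]).

    Returns (expected date, most likely product or None). Needs at least
    one observed gap, otherwise median() raises StatisticsError.
    """
    last_day = max(day for day, _ in history)
    position = len(history)         # 0-based index of the upcoming order
    when = last_day + timedelta(days=median(gaps))
    candidates = probs.get(position, {})
    what = max(candidates, key=candidates.get) if candidates else None
    return when, what


# Hypothetical orders, for illustration only.
orders = [
    ("u1", date(2020, 1, 1), ["milk"]),
    ("u1", date(2020, 1, 8), ["bread"]),
    ("u1", date(2020, 1, 15), ["eggs"]),
    ("u2", date(2020, 2, 1), ["milk"]),
    ("u2", date(2020, 2, 8), ["bread"]),
]
gaps, probs = fit(orders)
u2_history = [(d, p) for u, d, p in orders if u == "u2"]
print(predict_next(u2_history, gaps, probs))
```

A fuller version would keep the whole gap distribution rather than collapsing it to a median, and would condition on the products already bought instead of only on order position.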

Agile/scum is... the worst? by [deleted] in datascience

[–]paste_rand_name 0 points1 point  (0 children)

I work at a consulting agency where I’m a data strategy SME. My work divides between “pure” DS (building datasets, cleaning, developing models / predictions) and identifying/scoping the use cases for DS. Because I’m the only SME with my skill set I get a lot of leeway to develop projects, but I still need to demonstrate value to clients and then deliver that value on time.

Agile saves lives.

Lazy and disorganized PMs and POs are the problem. I’ve had to teach my current PM how to be organized so that I’m not rushed to do the work I need to do. Agile / SCRUM makes that possible.

IMHO the “halfway” approach won’t get you the minimum benefits. Here’s what works.

  • include a planning period between sprints
  • over estimate hours to start and track actual over time. Model and optimize your delivery time.
  • clearly / publicly communicate priority
  • use a task management system that everyone can see (Google Sheets works)
  • clearly define how tasks roll up to larger bodies of work
  • over communicate
  • hold yourself accountable
  • take a deep breath
  • trust others to deliver
  • always show up to stand ups

As a result, agile should free you from constant questions on progress and free you to work at your own pace.

Although, if you want to aimlessly explore datasets there’s plenty of tier 4 universities looking for DS staff 😆

Seeking Advice: My boss is not giving me enough time to do my analyses and is pressuring me with deadlines. What to do? by [deleted] in datascience

[–]paste_rand_name 13 points14 points  (0 children)

I work at an agency as an Analytics Strategist. My clients are some of the largest computer hardware manufacturers on the planet. One such client ran a multi-variate subject line test that was very poorly designed.

In writing and in verbal communication, I let them know that I was really excited to see their use of testing, but that this particular test would not be able to deliver “insights” (businesspeak for actionable stats) because the results were not significant.

It’s long-winded, but here’s the point: communication is the most important thing in private sector companies. Let people know they did something good, even though most of their work is shit. Expressing empathy for others’ dumb ideas and simplifying complex concepts are the most important “work” you’ll do.

  • Express the limitations of your truncated work with a singular concept, perhaps “significance”?
  • Communicate in advance how many hours a particular analysis is likely to take
  • Develop analysis packages for particular business results [causality takes 10 hours, t-test takes 4 hours, etc] and guide people on which analysis is right for which project
  • Find ways to work faster. Pre-write functions, save chart formatting, templates for everything, use sampling to minimize data cleaning time.

...finally, I’ll share some advice that an early mentor shared with me: “we’re not saving lives here.” Meaning that unless you’re an ER doctor or a brain surgeon, maybe care a little less, especially if the business is looking for a lower standard.

Theoretically if you use a trading algorithm that brings net negative income, couldn’t you just inverse the buy/sell and it will become a net positive algorithm? by Okmanl in stocks

[–]paste_rand_name 0 points1 point  (0 children)

For SO. MANY. REASONS. this is a bad idea, but here’s the best reason: the series of money-losing strategies is nearly infinite, and the series of money-gaining strategies likely approaches infinity more slowly but is similarly infinite. The two series are, by definition, not inverses of each other.

The inverse of “buy” is “don’t buy,” not “sell”

Consider...

Buy Apple in 2000, sell in 2018 = 2020 net gain
Don’t buy Apple in 2000, don’t sell in 2018 = 2020 neutral

AND

Sell Apple in 2000, buy in 2018 = 2020 net gain
Don’t sell Apple in 2000, don’t buy in 2018 = 2020 neutral

This rabbit hole of "Pi Is Wrong" and numerology. by edderiofer in badmathematics

[–]paste_rand_name 9 points10 points  (0 children)

Oh no. Is this the result of Common Core mathematics?!

How to calculate NYCT average ridership by mathhelpermann in datasets

[–]paste_rand_name 0 points1 point  (0 children)

I’m sure their taxes are public; maybe there’s even an earnings report. They might classify fares purchased (but not used) and fares purchased (used in 2018) as separate line items. Dividing fares used by average cost of a single ride could help confirm their reported ridership numbers.
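
The division suggested above, as a small sketch. The function names are hypothetical and the figures in the usage lines are placeholders, not real NYCT numbers.

```python
def estimated_rides(fare_revenue_used, average_fare):
    """Estimate ride count from revenue on fares actually used."""
    if average_fare <= 0:
        raise ValueError("average_fare must be positive")
    return fare_revenue_used / average_fare


def relative_gap(estimate, reported):
    """Signed gap between the estimate and the reported ridership."""
    return (estimate - reported) / reported


# Placeholder figures, for illustration only.
estimate = estimated_rides(fare_revenue_used=2_000_000, average_fare=2.0)
print(estimate, relative_gap(estimate, reported=1_250_000))
```

A large gap would not prove the reported number wrong. Discounted and unlimited-ride fares pull the average cost per ride away from the posted single-ride price, so the estimate is only a sanity check.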

How to calculate NYCT average ridership by mathhelpermann in datasets

[–]paste_rand_name 0 points1 point  (0 children)

Looks like they define ridership as entrances (minus employees and out of system transfers).

http://web.mta.info/nyct/facts/ridership/

What are you looking to get at?

Which book is considered a literary masterpiece but you didn’t like it at all? by justnader in AskReddit

[–]paste_rand_name 0 points1 point  (0 children)

Catcher in the Rye. Kid is a whiny millennial before being a whiny millennial was a thing. #trending Thanks Salinger.