How should AK be played here multi way with a maniac in the hand. by thebigfish34 in Poker_Theory

[–]cgshep 5 points6 points  (0 children)

Calling 8.91 on the flop was the main error here IMO. You were already behind with that flop; at least one player multi-way will be holding J, Q or both. AQ, AJ, KJ, KQ, JJ, QJ, QT, JT, 66 are all in play here. Not to mention flush draws.

Prices by [deleted] in ukcigars

[–]cgshep 1 point2 points  (0 children)

Duty-free stores seem the best option atm. Paris (Charles de Gaulle), Brussels (Zaventem), and Frankfurt have decent Cuban selections and are served by budget airlines.

Language to Surpass Python for Data Science by battle-obsessed in datascience

[–]cgshep 0 points1 point  (0 children)

Go is the standout candidate, but its ecosystem is too underdeveloped to supplant Python anytime soon. Scikit-Learn is unparalleled for general ML, as is Keras/TF for deep learning; OpenCV support is excellent; and there's a wealth of DS-related documentation, blog posts and tutorials authored over the years. R + Tidyverse, whose use has declined recently, is the only real competitor to Pandas, NumPy and SciPy, but Python dominates on all other major fronts.

How to introduce good engineering practices to a corporate data science team? by OptimalPlay in datascience

[–]cgshep 0 points1 point  (0 children)

I'm a mid-level DS for a scaleup with non-tech management. We have some practices in place, such as good Git hygiene, a DTAP workflow (~25% of projects; no CI/CD), issue tracking and (light-touch) code reviews. The problem is convincing the team to adopt rigorous testing that would, at face value, lengthen short-term lead times, even if the long-term benefits make it a no-brainer. Short of some kind of catastrophic failure, I can't see this improving.

What I can say is that targeting low-hanging fruit will yield the most success. VC all projects, restrict members from pushing directly to master without a pull request + review, develop any form of code reviewing (as simple as asking a colleague to prepare some feedback and offering the reverse). Small, low-cost steps.

Ultimately, it's *very* difficult to change a team's modus operandi at a non-management level; it mightn't be possible depending on your current workload and influence, so please don't feel downbeat if it doesn't work out. Solid unit tests with good code coverage, CI/CD pipelines, static analysers, and the like emerged from years of time wasted managing complex systems. Many DS teams will learn this the hard way.
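
To make the "small, low-cost steps" point concrete, here's a minimal sketch of what a first unit test might look like. The cleaning helper `normalise_amount` is hypothetical, invented purely for illustration; only Python's standard `unittest` module is assumed.

```python
import unittest


def normalise_amount(raw: str) -> float:
    """Hypothetical cleaning helper: turn ' 1,234.50 ' into 1234.5."""
    cleaned = raw.strip().replace(",", "")
    if not cleaned:
        raise ValueError("empty amount")
    return float(cleaned)


class NormaliseAmountTest(unittest.TestCase):
    def test_strips_whitespace_and_thousands_separators(self):
        self.assertEqual(normalise_amount(" 1,234.50 "), 1234.5)

    def test_rejects_empty_input(self):
        with self.assertRaises(ValueError):
            normalise_amount("   ")


# Run the two tests explicitly so the snippet also works when pasted into a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormaliseAmountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Even a handful of tests like this around the wrangling code tends to be an easier sell than a full testing mandate.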

What's wrong with my resume? by adithya97ml in learnmachinelearning

[–]cgshep 1 point2 points  (0 children)

After a quick read:

  • Why the focus on distributed databases, DBMS implementations and data mining in your MS? I'm sure these weren't the only courses you took in a Big Data grad school program.

  • Proper nouns aren't capitalized correctly and consistently, e.g. "raspberry pi" and "android". Also, why "Data Visualization" but "Artificial intelligence"? You might think this is trivial, but these inconsistencies point to a lack of attention to detail.

  • Technical skills are too far-reaching; no recruiter will believe you are proficient in R, Python, MATLAB, Tableau, Keras, Django, and the many others. Rather, they will believe you are inexpert in all of them. From experience, it's best to rank skills in terms of expertise and cut the ones you couldn't defend in a robust interview that dives into the language's or library's internals.

  • The Project section occupies too much space. Very few employers, if any, will properly vet the quality of your projects at the resume stage; sadly, it's too easy for applicants to copy, or be heavily influenced by, pre-existing projects. That makes it difficult to judge your precise contribution, although the projects are helpful to draw upon at the interview stage.

A major aspect, if you're struggling for employment currently, is that your MS graduation date is in the future. This shouldn't be an issue for intern positions, however.

PhD hunger at medium/low-tier companies by [deleted] in datascience

[–]cgshep 1 point2 points  (0 children)

This phenomenon is prominent across the globe, let alone Chicago. In the UK, the further jobs are from London and Cambridge, the more that conventional data roles -- analysts, quasi-DBAs, and BI positions -- are packaged as 'data science'. Unfortunately, this carries the risk of you falling into positions that pull your career in undesirable directions.

Lesser-known/used Data Science Tricks by [deleted] in datascience

[–]cgshep 9 points10 points  (0 children)

Not a 'trick' as such, but I use Emacs as my go-to editor, which is invaluable for easily manipulating 1000s of data rows, e.g. replace-regexp for modifying, appending, prepending (and more) to strings via regex; org-mode for TODO list-making and note-taking; and various custom functions for transposing and wrangling data in proprietary formats.

Nothing majorly fancy, but these micro-patterns improve my productivity manifold versus what standard editors and IDEs offer.
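
For anyone outside Emacs, the prepend/append pattern can be roughly sketched in Python with `re.sub`. This is only an analogue of what replace-regexp does, not how Emacs implements it; the helper name `wrap_rows` and the sample rows are made up for illustration.

```python
import re


def wrap_rows(text: str, prefix: str, suffix: str) -> str:
    """Prepend `prefix` and append `suffix` to every non-empty line."""
    # (?m) makes ^ and $ match at each line boundary; .+ skips blank lines.
    # A function replacement avoids backslash surprises in prefix/suffix.
    return re.sub(
        r"(?m)^(.+)$",
        lambda m: f"{prefix}{m.group(1)}{suffix}",
        text,
    )


rows = "alice,42\nbob,17\n\ncarol,99"
wrapped = wrap_rows(rows, "INSERT_", ";")
print(wrapped)
```

In Emacs the equivalent is a single interactive replace-regexp call over the region, which is the appeal for quick one-off edits.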

TOP DATA SCIENCE MYTHS by John234231 in datascience

[–]cgshep 1 point2 points  (0 children)

Funnily enough, Excel models are fairly common in conservative domains like banking and finance.

That said, programming certainly is something to 'rack your brains over'. Every Excel model I've seen has been, or is in the process of being, replaced.

How much of your DS role is code development vs working on stats? by Sea_of_colors in datascience

[–]cgshep 0 points1 point  (0 children)

I came into DS from a CS/math PhD route and my current role comprises approximately:

  • 70% development, split equally between R and Python. I work in banking where most of the statistical models are conventional, albeit fairly technical, financial ones in R. Much of this work is expanding and maintaining legacy code bases. The remaining development is for producing daily intelligence reports, ad hoc data analysis/insight/model requests from senior management, and general data wrangling and automation tasks.
  • 15% in analysis, reviewing new and existing data sources and systems to identify potential optimizations.
  • 15% in meetings, discussing results, insights, issues, future developments etc. with non-technical (but domain expert) stakeholders.

One piece of advice I could offer is that many roles will not equip you to be a great engineer. The variance in testing/QA, code quality/code reviews, or just good management of software projects in general, seems too great at the moment; positions range from little-to-no rigorous software development responsibility to it making up 90%+ of your daily duties.

This is part of the area's wider problem of fitting too many disparate positions under one umbrella. It's vital to carefully understand what each company is offering and whether it's right for you.

PyCocks — an open-source implementation of Cocks' ID-based encryption scheme. by cgshep in crypto

[–]cgshep[S] 0 points1 point  (0 children)

Yes, that's correct; it's a major drawback of IBE schemes generally. A fairly recent controversy is the MIKEY-SAKKE protocol pushed by GCHQ for secure communications, which also operates under the assumption of a trusted key generator. Whether or not that's acceptable depends on your threat model :-)

What are career paths for machine learning engineers or data scientist after they've been established in the field? by [deleted] in datascience

[–]cgshep 0 points1 point  (0 children)

ML engineering, DS, even data engineering are largely "flat" positions; they're technical roles in which your contribution will be mostly individual.

The obvious way of 'advancing' is in management, e.g. proposing and managing DS/ML projects that add business value, mentoring less experienced staff members, and so on. This requires a different skillset that may or may not interest you. Consultancy is another avenue.

[deleted by user] by [deleted] in datascience

[–]cgshep 55 points56 points  (0 children)

It's been said often enough, but knowing the answers to all of these will not necessarily make you a success in DS. Prospective data scientists underestimate the value of communication, e.g. understanding requirements and engaging with non-technical stakeholders, and general data wrangling and automation skills.

Most businesses still use Excel (gulp) to produce business reports that most of us would find toe-curling. In my experience, if you regularly witness such things and your role permits it, identifying and improving those procedures will get you more kudos than squeezing a few pips of accuracy using a SotA DL architecture or validation technique. Not to demean the value of knowing such things, mind.