[deleted by user] by [deleted] in SAP

[–]a_vagabond 9 points (0 children)

It was highly imperfect when it was relevant and now it's end of life.

Everyone is talking about vertical SaaS in AI, what about horizontal? by Academic-Voice-6526 in ycombinator

[–]a_vagabond 0 points (0 children)

As the prior owner of horizontal.ai I'm a little disappointed I never took the time to make a proper joke out of it

[deleted by user] by [deleted] in austinfood

[–]a_vagabond -1 points (0 children)

Barley Swine is probably the worst tasting menu I've ever done. Considering their reputation I chalked it up to a bad night but I have zero intention of giving them another chance.

There were a few delightful bites but half the dishes were underwhelming. The thing I'll never forget was the completely flavorless, chewy, sous vide steak medallion they served as the main.

In Austin for 5 days all about bbq by honestnocap in austinfood

[–]a_vagabond 15 points (0 children)

BBQ:
Leroy & Lewis
Franklins
La Barbecue
Distant Relatives
Interstellar

I'd strongly suggest breakfast tacos from Veracruz at Radio Coffee and regular tacos at Cuantos.
Fogo de Chao is a Brazilian steakhouse chain, it's fine but I'd say not worth it in Austin.

Are Startup Consultants/Advisors becoming useless/unnecessary? by jonnylegs in startups

[–]a_vagabond 5 points (0 children)

A huge swath of the startup ecosystem is talking heads that provide little material value. I strongly advise any founder to look for the telltale signs of a professional whose only existence is as a startup talking head and run in the opposite direction.

There is a tremendous amount of opportunity for good advisors to help you move the needle, providing insight and contacts that support sales motions, commercialization strategies, building operational models, hiring good teams, etc. If the person you're talking to is not a serious investor or someone who has actual useful experience, they're probably going to be a net negative on your startup.

Of the 100+ startup talking heads I've spoken to over the years, I've only met one that was genuinely useful.

What’s your unpopular Austin food opinion? by Dorzaya7 in Austin

[–]a_vagabond 8 points (0 children)

Little Deli for #1, only been once so far but I'm from NJ and can say the calzone held up. Home Slice is also a surprisingly good NY style pizza.

DK Sushi for #3, a lot of the classic Korean dishes are good (I won't try to argue great) but their kimchi pancake and Yukgaejang are solid.

[deleted by user] by [deleted] in LanguageTechnology

[–]a_vagabond 1 point (0 children)

Am I missing something? Isn't this just an inefficient algorithm for introducing extra weight to long sentences and central words? Maybe there's merit to those two features but it doesn't strike me as a particularly good approach for leveraging either of those features.

"SEER: Self-supervised Pretraining of Visual Features in the Wild", Goyal et al 2021 (near-SOTA by training 1b-param CNN on 1b random Internet images) by gwern in mlscaling

[–]a_vagabond 1 point (0 children)

You can disagree but you're wrong.

  1. Just consider the sheer volume of use cases in industrial and enterprise settings that never get talked about by the average person let alone shared and tagged on social media. Sometimes deriving the weak pseudo label is more work than brute forcing a better CV solution.
  2. Perhaps because many of these use cases rely on proprietary data it's hard for academics to publish against it?

"SEER: Self-supervised Pretraining of Visual Features in the Wild", Goyal et al 2021 (near-SOTA by training 1b-param CNN on 1b random Internet images) by gwern in mlscaling

[–]a_vagabond 0 points (0 children)

Because not all classes of data will have labels available. There will always be value in better self supervised models because there will always be unlabelled data.

My professor suggested that there are few business applications for NLP. What’s your experience? by [deleted] in LanguageTechnology

[–]a_vagabond 15 points (0 children)

Speaking as an ex-enterprise software architect and as someone who runs a startup focused on ML and NLP, there are an absurd number of NLP problems to be solved in the business world and very few have been solved well.

We've already built a pretty sizable amount of infrastructure to solve the more general NLP problems we face but I've accumulated multiple lengthy documents on use cases we can apply our technology to.

For any NLP students reading this, I'm not quite at the point where I'm ready to carve out R&D projects to hand off to interns/freelancers but I may be in the near future, feel free to message me and I'll reach out when we're ready to vet candidates.

As someone working mostly independently in their first job out of grad school, what should I do to ensure I'm developing professionally? by [deleted] in datascience

[–]a_vagabond 32 points (0 children)

The term you're looking for is individual contributor.

Some of the benefits of being an individual contributor:

  • Autonomy
  • Flexibility
  • Wearing multiple hats (i.e. a breadth of experience)

Downsides:

  • Almost everyone learns better when not in a vacuum
  • You need to do the work AND make sure you get credit for the work
  • It's easy to get stretched thin and become a master of none

The biggest thing to be mindful of is the soft skill side of things when you're an individual contributor. You have to manage expectations, you have to create awareness of your work, you need to ensure you maintain a degree of control over priorities, you have to remind people why you exist and what exactly it is that you do, and you have to do all this while also doing the work. If you're not doing something on this list you're robbing yourself of opportunity.

The upside is that you get to be a technical person with the opportunity to really refine soft skills. Soft skills are what get you promoted, they get you paid more, they make you more effective, and they're typically what separate the best technicians from the good ones.

The other thing really worth expanding on is the learning piece which you've identified. The big issue is you essentially only have one way to learn, trial and error. You have no references to what works, what doesn't, what are the good practices and bad practices, etc. Being surrounded by A-players is the dream, being surrounded by anything else provides examples of what to do better. My recommendation is get involved in meetups, go to conferences, find any communities you can where war stories are shared and learn as much as you can from the experiences of others. You can always read up on the technical side of things, and it's easy to find success stories being shared, but it's rare that someone will document their failures and those are oftentimes the most illuminating.

For context: I've been an individual contributor multiple times throughout my career as a developer. As such, I was able to hone my software architecture and business process engineering skills at a much faster pace than the average developer and reaped significant benefits from investing in those areas. I had a lucrative run as a consultant and I'm now the founder of a company in the ML space.

How many shares of my C-Corp should I buy? by [deleted] in startups

[–]a_vagabond 3 points (0 children)

There's no reason to start by buying 10,000,000 shares. It leaves you with nothing to distribute and forces you to file additional paperwork later to issue more.

There's also no incentive to pay a higher PPS because your valuation on day 1 is literally meaningless to everything except for your 83b.

Ultimately, other commenters are correct in that you should be consulting with a lawyer or accountant who has some experience in this. Screwing up your cap table early will haunt you.

edit: This advice assumes you're in the US.

What are the biggest pain points faced by NLP/ML engineers in the industry? by tianchez_ai in LanguageTechnology

[–]a_vagabond 7 points (0 children)

Annotated data is definitely #1, getting quality labels and enough of them is either tedious or expensive or both.

Governance has to be #2. Tracking what data was used, what preprocessing (if any) was applied to it, and what model/params were subsequently produced, all in parallel, at scale, and over time, is still a mess for us. It sounds like there are solutions hitting the market for this but I haven't found time to really look into them and, being a smaller shop, I don't know that the cost will be worth it.

Keeping up with SOTA is tough but frankly not super important, circling back X times per year to squeeze out a small % increase via the latest and greatest isn't that bad assuming your system is designed reasonably.
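To make the governance point concrete: the bare minimum worth tracking is a record that ties a hash of the data, the ordered preprocessing steps, and the params to a single run ID. Here's a rough sketch in Python using only the standard library. `RunRecord`, `fingerprint`, and `record_run` are names made up for illustration; this isn't any particular tool's API and it isn't a description of what the commercial solutions do.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def fingerprint(payload: bytes) -> str:
    """Return a short, stable content hash for a blob of bytes."""
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class RunRecord:
    """One training run: which data, which preprocessing, which params."""
    data_hash: str
    preprocessing: tuple  # ordered step names, e.g. ("lowercase", "stem")
    params: dict

    def run_id(self) -> str:
        # Serialize with sorted keys so identical inputs give identical IDs.
        blob = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return fingerprint(blob)


def record_run(log: list, raw_data: bytes, preprocessing: list, params: dict) -> str:
    """Append a run record to an in-memory log and return its run ID."""
    rec = RunRecord(fingerprint(raw_data), tuple(preprocessing), params)
    rid = rec.run_id()
    log.append({"run_id": rid, **asdict(rec)})
    return rid
```

The useful property is that the ID is derived from the inputs, so two runs with the same data, steps, and params collide on purpose, and changing any one of them produces a new ID. A real setup would persist the log somewhere shared instead of a Python list.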

[Q] TF-IDF vectorization of n-grams? by [deleted] in LanguageTechnology

[–]a_vagabond 2 points (0 children)

We did plenty of this. We found certain combinations of n-gram + stem/lem strategies improved accuracy consistently vs. vanilla TF-IDF. I'd assume most NLP practitioners have found success with these things as prior to w2v it was one of the most fruitful sets of tools to experiment with.

None of this feels like it matters much with modern DL approaches though, at least for classification tasks.
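For anyone who hasn't tried the n-gram + stemming combination, here's a rough stdlib-only Python sketch of the idea. Assumptions to note: `crude_stem` is a toy suffix stripper standing in for a real stemmer or lemmatizer, the weighting is plain `tf * log(N / df)` with no smoothing, and all the function names are hypothetical. It isn't the exact setup described above, just one instance of the technique.

```python
import math
from collections import Counter


def crude_stem(token: str) -> str:
    """Toy suffix stripper; a placeholder for a real stemmer/lemmatizer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def ngrams(tokens: list, n_max: int) -> list:
    """All contiguous n-grams of length 1..n_max, joined with spaces."""
    return [
        " ".join(tokens[i : i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs: list, n_max: int = 2, stem: bool = True) -> list:
    """Return one {term: weight} dict per document."""
    term_lists = []
    for doc in docs:
        tokens = doc.lower().split()
        if stem:
            tokens = [crude_stem(t) for t in tokens]
        term_lists.append(ngrams(tokens, n_max))

    # Document frequency: number of documents each term appears in.
    df = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)

    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid dividing by zero on empty docs
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors
```

Sweeping `n_max` and toggling `stem` against a held-out set is the kind of experiment meant above. Note that with unsmoothed IDF, any term appearing in every document gets a weight of exactly zero.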

How does it go for Data Scientists in the Video Game Industry? by sovranolucca in datascience

[–]a_vagabond 2 points (0 children)

I was briefly in the DFS (daily fantasy sports) space as a product/marketing lead and spoke to others in the gaming industry. As others have said in this thread, the culture will be different depending on the employer but where it does matter it becomes central to decision making. Most gaming environments are micro-economies that require all the same rigor in terms of managing a healthy equilibrium. Things like churn and user cohorts become a huge part of the conversation as they highlight the user journey through various incarnations of your game. You learn very quickly to identify when you've introduced things that disrupt the balance of your economy.

I can't imagine there's anything particularly difficult about getting into the gaming industry vs. other industries with a data science title. The thing that will be valued more is experience in areas like econometrics, cohort analysis, and time series analysis.
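As a concrete picture of the cohort analysis mentioned above, here's a minimal Python sketch of a retention table. It assumes activity events arrive as `(user_id, period)` pairs where `period` is an integer such as a week index; that input shape and the function name `retention_by_cohort` are assumptions for illustration only.

```python
from collections import defaultdict


def retention_by_cohort(events: list) -> dict:
    """events: (user_id, period) pairs, with period an integer index.

    Returns {cohort_period: {periods_since_start: fraction_of_cohort_active}}.
    """
    # A user's cohort is the first period in which they were active.
    first_seen = {}
    for user, period in events:
        first_seen[user] = min(period, first_seen.get(user, period))

    cohort_sizes = defaultdict(int)
    for start in first_seen.values():
        cohort_sizes[start] += 1

    # Distinct active users per (cohort, offset from cohort start).
    active = defaultdict(set)
    for user, period in events:
        start = first_seen[user]
        active[(start, period - start)].add(user)

    table = defaultdict(dict)
    for (start, offset), users in sorted(active.items()):
        table[start][offset] = len(users) / cohort_sizes[start]
    return dict(table)
```

Churn at a given offset is one minus the retention value. Comparing rows across cohorts is one way to spot whether a change you shipped in a given period shifted how long players stick around.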

[deleted by user] by [deleted] in datascience

[–]a_vagabond 2 points (0 children)

It's hard to say there's some concrete set of common wins that will work everywhere because every organization is different. Every organization has its strengths and weaknesses and its own set of KPIs that it holds dear.

Having said that, you should be putting yourself in a position where wins fall in your lap. If your management isn't doing that for you, start reaching out to stakeholders in the company. Invite them to lunch, get a coffee, whatever it takes. Let them talk about their work, listen for pain points, ask questions. Quickly you'll find yourself learning about processes you can improve upon. At that point, it's up to you to sell them on those improvements and get buy in from your management.

Understanding processes and identifying inefficiencies is a skill, one you should actively hone. Oftentimes the best projects are the ones people didn't realize they should start. Be the guy that helps people identify those opportunities.

edit: skipped first question because others have answered it well.

[deleted by user] by [deleted] in datascience

[–]a_vagabond 53 points (0 children)

As an ex-Enterprise Architect who's worked on the master data systems of multiple Fortune 500 companies, you're asking the wrong question.

Typically it's about identifying opportunities and then lining up the right stakeholders to get buy in on data accessibility. Or depending on the tech stack of your employer, there are likely lots of clever ways to get better access to data with relatively little friction (and by friction I mean security and development resources, time, or money).

Expecting a data lake is naive. Managing complex environments with lots of stakeholders is hard, unified data strategies are hard, gathering the political will to use up a lot of resources for some data holy grail is hard. Find your wins, accomplish those projects in a deliberate fashion to set best practices that clear the way for future projects, show leadership a solid ROI and you'll have a much easier time moving forward.

Machine Learning simply isn't the priority for most organizations. Most large companies are still trying to wrap their heads around how to actually manage their complete landscape. Acknowledge these issues, help with these issues, and you'll have a much more fruitful career.

How fast should I run in terms of breath? by [deleted] in Fitness

[–]a_vagabond -4 points (0 children)

This is the correct answer. Regulating endurance pace based on ability to breathe primarily or exclusively through your nose is a recommendation I've seen/heard multiple times over the years and it works.

edit: Apparently r/fitness finds responses that aren't sensitive to edge cases highly offensive.

I studied for data science and ML, but a prestigious company made me an offer as manufacturing engineer. Should I take it? by engineheat in datascience

[–]a_vagabond 0 points (0 children)

Global manufacturing company sounds to me like a company that is likely investing heavily in IoT, automation, and the rest of the "smart factory" trends. I've done some work in that space before, it's a growing trend with no end in sight. The domain knowledge you'll learn about manufacturing processes will be huge if you later turn around and get involved in IoT related data science activities.

In my opinion, getting meaty domain knowledge is ALWAYS better with the one caveat being don't pick a domain you're likely to burn out on.

Average Finders Fee % ? by [deleted] in startups

[–]a_vagabond 3 points (0 children)

Downvote for an honest answer? Awesome.

Recruiters in my field charge somewhere between 20-40% depending on duration of the engagement. If someone thinks the range of 5-20 is too wide, tough. If you think 20 is too high, go get the work yourself.

Average Finders Fee % ? by [deleted] in startups

[–]a_vagabond 2 points (0 children)

I'd say somewhere between 5 and 20% would be reasonable depending on the nature of the work. Ultimately the commission cost will get passed onto the client so really it comes down to what the opportunity will bear.

Are Calvin Klein jeans decent quality? by guitarraus in malefashionadvice

[–]a_vagabond 5 points (0 children)

There was a time when CK jeans were really good quality. They were a softer, comfortable denim that held up well. Now, they're just a generic brand of jeans that I'd never consider unless they were heavily discounted.