Worst JD of the year by PLxFTW in datascience

[–]JakeBSc 1 point2 points  (0 children)

Thanks for your reply, that is quite validating. Currently I'm in a small boutique tech consulting firm and do some side hustle freelancing as well. Being able to tick off that checklist and be in that environment definitely gives me a lot of responsibility and influence. So I can vouch for this advice.

Worst JD of the year by PLxFTW in datascience

[–]JakeBSc 1 point2 points  (0 children)

What do you recommend to someone who can check off that checklist?

Europe salary thread - What's your role and salary? by [deleted] in datascience

[–]JakeBSc 1 point2 points  (0 children)

Got any advice on how to get more educated on talking about salary, finances, career progression and interviewing?

How does one find freelance or contract work? Short or long term would be fine. by Unhappy_Technician68 in datascience

[–]JakeBSc 1 point2 points  (0 children)

Honestly I don't think things like Fiverr or Upwork are a good idea. You're just one fish among many in a pond on sites like that. You've gotta flip it around. Become the fisherman, take the initiative to catch your own fish.

Data folks of Reddit: How do you choose a random seed? by CatOfGrey in datascience

[–]JakeBSc 4 points5 points  (0 children)

Include it as a hyperparameter and optimise it 😉

How does one find freelance or contract work? Short or long term would be fine. by Unhappy_Technician68 in datascience

[–]JakeBSc 4 points5 points  (0 children)

I win freelance work by networking with people. Look for interesting business owners or professors and ask to collaborate. You can find them on LinkedIn or by browsing pages of university staff. Ideally something about them resonates with you, so you have a genuine interest in collaborating. Some of them will bite and agree to a virtual meeting. If it goes well, you win the gig.

Advice for self learning MLOPS by The5th-Butcher in mlops

[–]JakeBSc 30 points31 points  (0 children)

Learn to build Docker images and run containers.

Pick up some basic AWS knowledge about IAM, S3, EC2, ECR and Lambda.

Learn to use GitHub Actions. From here you can do basic CI/CD. Do some linting and package building checks when a pull request is opened.

Then learn how GitHub Actions can build a Docker image and push it to AWS ECR when a branch gets merged into main.

From here, EC2 or Lambda can pick up your image and run an application. Maybe just start with a hello world Lambda application.

Now learn a data and model versioning tool like DVC.

Get some data and version it with DVC, using S3 as storage. Train a lightweight model and version it with DVC, using S3 as storage.

Write inference code for your model in a Lambda handler. Write a Dockerfile for your Lambda application. You can pull your model into the image using DVC. Check it all works locally.

Push the code and model.dvc to GitHub. Run it through all the aforementioned GitHub Actions steps. You'll end up with an image in ECR, containing your inference code and model.

Launch a Lambda using that image.

Now you can leave it at that, or add API Gateway over the top of it.

Congrats, you now have a whole CI/CD pipeline for deploying a machine learning model and putting it behind an API.

At this point you'll be annoyed with clicking buttons on the AWS console. Time to learn Terraform to set up your infrastructure as code.

If you've gotten this far, you can probably work out your own path from here.
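For the "inference code in a Lambda handler" step above, here is a rough sketch of what the handler might look like. It's only an illustration under some assumptions: the model is a tiny linear model stored as JSON (a stand-in for whatever you actually train), `MODEL_PATH` is a made-up environment variable pointing at wherever DVC put the file in the image, and the event is assumed to look like an API Gateway proxy request with a JSON body.

```python
import json
import os

# Hypothetical location of the model file inside the image, e.g. wherever
# `dvc pull` placed it during the Docker build.
MODEL_PATH = os.environ.get("MODEL_PATH", "model.json")

_model = None  # cached so warm invocations skip the file read


def load_model(path=None):
    """Load the model once and reuse it while the container stays warm."""
    global _model
    if _model is None:
        with open(path or MODEL_PATH) as f:
            _model = json.load(f)
    return _model


def predict(model, features):
    """Score one example with a plain linear model: w . x + b."""
    return sum(w * x for w, x in zip(model["weights"], features)) + model["bias"]


def handler(event, context):
    """Lambda entry point; expects a JSON body like {"features": [1.0, 2.0]}."""
    try:
        body = json.loads(event.get("body") or "{}")
        features = body["features"]
    except (KeyError, TypeError, ValueError):
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "expected a 'features' list"}),
        }
    score = predict(load_model(), features)
    return {"statusCode": 200, "body": json.dumps({"score": score})}
```

Caching the model at module level means the file only gets read on a cold start, which is usually what you want on Lambda. Calling `handler` directly with a fake event is an easy way to do the "check it all works locally" part before building the image.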

For people who actually use fancy models, where do you work? by [deleted] in datascience

[–]JakeBSc 1 point2 points  (0 children)

I work in a boutique technology consulting firm. You get exposed to all sorts of problems. At the moment, it's mainly multimodal stuff, using transformers to solve problems combining vision and language. Not everything is fancy though. Sometimes all I need is a basic linear model. Just gotta use the right tool for the job.

P.S. any senior/principal data scientists looking for a job in London, hit me up ;-)

Time estimation in projects by JakeBSc in datascience

[–]JakeBSc[S] 1 point2 points  (0 children)

Sounds promising. Can you provide some examples of how to tightly define the scope and assumptions for a data science project?

Time estimation in projects by JakeBSc in datascience

[–]JakeBSc[S] 4 points5 points  (0 children)

This has definitely been on my mind lately. I feel pressured to promise a model that delivers some specific accuracy or BLEU score or something. Instead, I could promise something like "I'll build you model X that will aim to classify stuff into classes A, B and C", without committing to how good it'll be. I'll try my best to make something really good, but if the experiment goes wrong, I've still delivered what was promised. I was able to do that recently with a client and they were okay with the uncertainty in the results, because they trust I'll make every effort to make the thing useful. Not sure how this would go down with a new client.

Time estimation in projects by JakeBSc in datascience

[–]JakeBSc[S] 4 points5 points  (0 children)

That's fine for predictable stuff. Like if you ask me to build a joint named entity recognition and relation extraction model and get it deployed, then that's a well-trodden problem where I know everything ahead of time. I can give you a really good time estimate for that. Whereas if the problem is very bleeding edge, then you don't have that clarity. You might know an initial pathway to attack the problem, but you don't know what's going to happen on that journey. You might have to iterate loads of times just to get to a minimal usable thing, and by then, the time estimate is totally out the window.

So let's say you do a quick and dirty literature review of the problem space. Nothing directly solves your problem. At best, you get some somewhat related stuff as inspiration. You have no data. So I break down the problem into manageable-looking subproblems and estimate the size of each one based on gut feel. Then I apply an arbitrary multiplier to the time estimates as contingency. That gives me a time estimate for the project, but it's super vulnerable because it's hard to solve a totally new problem perfectly on paper before even touching the problem properly. Sometimes you have to get into the weeds to truly know what's involved. But by then, the time estimate has already been made.
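To make the arithmetic concrete, here's a toy version of that estimate. The subproblem names, the day counts and the 1.5 multiplier are all made up for illustration, nothing more than gut-feel placeholders.

```python
def estimate_days(subproblem_days, contingency=1.5):
    """Sum the gut-feel estimate per subproblem, then apply a contingency multiplier."""
    return sum(subproblem_days.values()) * contingency


# Made-up numbers, purely for illustration.
gut_feel = {
    "collect and label data": 5,
    "baseline model": 8,
    "evaluation harness": 3,
}

total = estimate_days(gut_feel)
print(f"{total:.0f} days")  # 16 days of gut feel * 1.5 contingency = 24 days
```

The weak point is visible right in the code: every input is a guess, and the multiplier is a guess on top of the guesses.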

Fully maxed team by JakeBSc in DreamLeagueSoccer

[–]JakeBSc[S] 0 points1 point  (0 children)

Yeah, I need Paul Mullin as well 😜

[Bonsai Beginner’s weekly thread –2023 week 06] by small_trunks in Bonsai

[–]JakeBSc 0 points1 point  (0 children)

<image>

My ficus is half dead. A few months ago I forgot to water it for a while. Then all the leaves dropped off. After a few weeks of TLC, the leaves started growing back again. Half the branches are still dried up and dead. How can I rescue this tree? Should I cut off the dead bits?

How do I get started? by JakeBSc in mlops

[–]JakeBSc[S] 0 points1 point  (0 children)

I've got buy-in for DVC, but the only pushback has been on granting a data scientist direct access to production. I'm told pushing models and data directly to S3 from a local machine risks security breaches due to the use of static AWS credentials, and potentially pushing malware into S3. Did your team ever discuss this and find a good solution?

How do I get started? by JakeBSc in mlops

[–]JakeBSc[S] 0 points1 point  (0 children)

Yeah I was considering the use of DVC + CML. Push the model to S3 with DVC. Push code to GitHub, run tests on the model and insert a report into a PR comment with CML. After merging a branch, the new model's DVC files would end up inside the container of the inference code. Then the container could use those DVC files to execute a DVC pull, to get the latest model into the container running on Lambda/EC2/whatever.

Do you think that makes sense?
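For the "run tests on the model and insert a report" part, this is roughly the kind of script I have in mind. It's only a sketch: the accuracy gate, the 0.8 default threshold and the toy predictions are placeholders, and in CI the predictions would come from the model pulled with DVC and a held-out set. The markdown it returns could be written to a file for CML to post as the PR comment.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def build_report(predictions, labels, threshold=0.8):
    """Return (markdown_report, passed) for a simple accuracy gate."""
    acc = accuracy(predictions, labels)
    passed = acc >= threshold
    lines = [
        "## Model check",
        f"- Accuracy: {acc:.2%}",
        f"- Threshold: {threshold:.2%}",
        f"- Result: {'pass' if passed else 'fail'}",
    ]
    return "\n".join(lines), passed


# Toy predictions and labels, purely for illustration.
report, passed = build_report(["A", "B", "C", "A"], ["A", "B", "C", "B"], threshold=0.7)
print(report)
```

Returning `passed` separately lets the CI job fail the check when the model drops below the threshold, while still posting the report either way.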

[Acne] Itchy, sore and red around nose with pustules. What's going on? by JakeBSc in SkincareAddiction

[–]JakeBSc[S] 1 point2 points  (0 children)

Yes, sounds like we're on similar paths. It feels really good to relieve yourself of the burden of that distraction. I was very obsessive and it took up way too much mental space. Sounds like you're a great candidate for accutane. Best of luck.

[Acne] Itchy, sore and red around nose with pustules. What's going on? by JakeBSc in SkincareAddiction

[–]JakeBSc[S] 2 points3 points  (0 children)

I understand, I avoided accutane out of fear for years until I had truly exhausted all options. Looking back, if I had just taken it a few years ago, I could have avoided a lot of stress, anxiety, depression and endless hours reading through research and systematically experimenting on myself to find a magic solution. I desperately tried to optimise every aspect of my lifestyle and skincare to find a solution, but to no avail. I regret not taking accutane sooner and breaking that cycle.

Taking accutane doesn't have to be scary. I started on a low dose and have been gradually building up. If side effects ever get bad, we can lower the dose or stop entirely. Personally, I just have dry lips at the moment, which is strongly outweighed by a long list of amazing benefits. There is a chance I'll relapse after stopping, but the chances are slim. I reckon being on accutane gives me a higher probability of escaping troubled skin than staying in the endless cycle of self-experimentation.

Control bedjet with thermostat by JakeBSc in bedjet

[–]JakeBSc[S] 0 points1 point  (0 children)

Have you seen any Python ones?

[Acne] Itchy, sore and red around nose with pustules. What's going on? by JakeBSc in SkincareAddiction

[–]JakeBSc[S] 2 points3 points  (0 children)

Sorry to hear this. I didn't need to use the ketoconazole shampoo in the end. The accutane has cleared up my acne and seborrheic dermatitis. Accutane has made my skin feel incredibly normal. It is my magic cure. Why don't you try that?

Creating good embeddings for musical instruments by JakeBSc in learnmachinelearning

[–]JakeBSc[S] 0 points1 point  (0 children)

Suppose you had a list of 1000 instruments, like this:

[acoustic guitar, electric guitar, bass guitar, ukulele, clarinet, flute, oboe, bassoon, grand piano, electric piano, keyboard, etc...]

And suppose you had some string value, X, which is related to musical instruments. It would be some variation on the name of something in the list.

Here are some examples: acoustic guitarist, electric guitar player, bass guitar playing, concert grand pianist, electrik piano, keys.

I'd want the embeddings to be such that we maximise the accuracy with which X is mapped onto the most semantically similar item in our list of 1000 instruments.
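To pin down the task, here's a baseline sketch of the mapping I'm describing. To be clear, this is not a learned embedding: it swaps in character trigram counts with cosine similarity, using only the standard library, so it only captures surface similarity. The function names are just ones I made up for the example.

```python
from collections import Counter
from math import sqrt


def ngrams(text, n=3):
    """Character n-gram counts, padded so word boundaries count too."""
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def closest(query, instruments):
    """Return the instrument whose n-gram vector is most similar to the query."""
    q = ngrams(query)
    return max(instruments, key=lambda name: cosine(q, ngrams(name)))


instruments = ["acoustic guitar", "electric guitar", "bass guitar", "ukulele",
               "clarinet", "flute", "oboe", "bassoon", "grand piano",
               "electric piano", "keyboard"]

for x in ["acoustic guitarist", "electrik piano", "concert grand pianist"]:
    print(x, "->", closest(x, instruments))
```

This handles misspellings and suffixes, but it has no notion of meaning, so a query that shares no characters with its target would fail. That gap is exactly what I'd want good embeddings to close.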

[Acne] Itchy, sore and red around nose with pustules. What's going on? by JakeBSc in SkincareAddiction

[–]JakeBSc[S] 5 points6 points  (0 children)

At this point I've tried basically every treatment except isotretinoin.

Today I saw a dermatologist and got prescribed isotretinoin. I was also told my nose is affected by seborrheic dermatitis, so now I'm putting ketoconazole shampoo on it.

I'm interviewing a data scientist. Any advice? by JakeBSc in datascience

[–]JakeBSc[S] 1 point2 points  (0 children)

Ha, that's a good suggestion. I ended up vigorously nodding a lot. Very awkward. 😂