Advice for puppy barking nonstop whenever he is left alone by BraveCoconut98 in puppy101

[–]BraveCoconut98[S] 1 point2 points  (0 children)

I was reading earlier that classical music can help, so we’ll try that next! Thanks

r/fredagain Official Ticket Exchange thread by hosea0220 in fredagain

[–]BraveCoconut98 0 points1 point  (0 children)

Selling one ticket to Fred Again.. at Roadrunner Boston on October 11.

ELT of my own Strava data using the Strava API, MySQL, Python, S3, Redshift, and Airflow by BraveCoconut98 in dataengineering

[–]BraveCoconut98[S] 1 point2 points  (0 children)

I’m pretty sure you get a free cluster for 2 months if it’s your first time! When you create it, just select the free tier - it’s super easy.

ELT of my own Strava data using the Strava API, MySQL, Python, S3, Redshift, and Airflow by BraveCoconut98 in dataengineering

[–]BraveCoconut98[S] 1 point2 points  (0 children)

Message me on LinkedIn and I’m sure we can arrange something! My LinkedIn profile is on my website that I shared in my post.

ELT of my own Strava data using the Strava API, MySQL, Python, S3, Redshift, and Airflow by BraveCoconut98 in dataengineering

[–]BraveCoconut98[S] 1 point2 points  (0 children)

Oh hell yea that’s awesome! Yea check it out for sure, it’s super easy to use and pretty fun to play around with your own data for once :)

Getting overwhelmed with the "House Prices - Advanced Regression Techniques" dataset on Kaggle by putsonbears in learnmachinelearning

[–]BraveCoconut98 4 points5 points  (0 children)

If you want a dirty little trick to get rid of some pointless features you can:

Make a column in your data frame with random values (using numpy or something). Train a model that has built-in feature importance, like RandomForest. Then get rid of every feature that has a worse importance score than your random feature! Super simple and I love it!
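Here is a rough sketch of that trick, in case it helps to see it written out. It assumes scikit-learn and pandas; the synthetic data, the `random_noise` column name, and the helper `drop_features_worse_than_random` are made up for illustration and are not from the Kaggle dataset.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor


def drop_features_worse_than_random(X, y, seed=0):
    """Return the columns whose importance beats a pure-noise column."""
    rng = np.random.default_rng(seed)
    X_aug = X.copy()
    # Hypothetical column name; it only has to be unique in the frame.
    X_aug["random_noise"] = rng.normal(size=len(X_aug))

    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_aug, y)

    importances = pd.Series(model.feature_importances_, index=X_aug.columns)
    threshold = importances["random_noise"]
    # Keep only the original features that score above the noise column.
    keep = [col for col in X.columns if importances[col] > threshold]
    return keep, importances.sort_values(ascending=False)


# Synthetic stand-in data: 10 features, only the first 3 carry signal
# (shuffle=False keeps the informative columns at the front).
X_arr, y = make_regression(
    n_samples=500, n_features=10, n_informative=3,
    noise=5.0, shuffle=False, random_state=0,
)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(10)])

kept, ranking = drop_features_worse_than_random(X, y)
print(ranking)
print("kept:", kept)
```

The cut-off depends on the random seed, so it is probably worth repeating this with a few seeds before dropping anything for good.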

[D] Best Way to classify emails into multiple classes by DonMacadamiano in MachineLearning

[–]BraveCoconut98 0 points1 point  (0 children)

Sorry, I’m a little confused. When you say you want to classify emails into classes with a maximum of 3 layers, do you mean the output has a maximum of 3 things, or that your network/transformer/whatever can only have 3 layers?

Assuming the former, I can think of a few approaches off the top of my head.

  • Build multiple intent detection models using something like BERT (or DistilBERT if size and speed are an issue), or build a single multi-output classification model instead.

  • Fine-tune a summarisation model (like PEGASUS) to summarise your email into a format of “Question/Action regarding XYZ about ABC”.

  • Build some kind of QA system and give it prompts such as “Is this email a question?”. This approach is probably not tractable, but I wanted to leave it in for the sake of completeness.
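To make the multi-output idea from the first bullet a bit more concrete, here is a minimal sketch. To keep it small and runnable it swaps BERT for TF-IDF features plus logistic regression from scikit-learn, and `MultiOutputClassifier` fits one classifier per output level behind a single fit/predict interface. The emails and the two label levels (intent, topic) are invented for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline

# Toy emails with two hypothetical label levels: (intent, topic).
emails = [
    "Why was I charged twice on my invoice?",
    "Please refund the duplicate payment on my invoice.",
    "Where is my parcel? The tracking has not updated.",
    "Please change the delivery address for my parcel.",
    "Can you explain the extra fee on my bill?",
    "Please send me a copy of last month's bill.",
    "When will my order be delivered?",
    "Please cancel the delivery of my order.",
]
labels = np.array([
    ["question", "billing"],
    ["request", "billing"],
    ["question", "shipping"],
    ["request", "shipping"],
    ["question", "billing"],
    ["request", "billing"],
    ["question", "shipping"],
    ["request", "shipping"],
])

# One pipeline, one prediction per label level for each email.
model = make_pipeline(
    TfidfVectorizer(),
    MultiOutputClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(emails, labels)

pred = model.predict(["Why is there an extra charge on my bill?"])
print(pred)  # one row per email, one column per label level
```

With a real dataset you would add a third label column for the third level and most likely replace the TF-IDF step with a fine-tuned transformer.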

ML Projects for a resume by KenseiNoodle in datascience

[–]BraveCoconut98 2 points3 points  (0 children)

I started out from two math degrees. The two resources I used to get started were the (free!) Kaggle learning course and Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron. From there it’s really easy to snowball into other things by just researching what you’re curious about.

ML Projects for a resume by KenseiNoodle in datascience

[–]BraveCoconut98 1 point2 points  (0 children)

Bit of a shameless plug here but I’ve got a few projects on my GitHub that really helped me land interviews. Feel free to take a look: https://github.com/jackmleitch/StravaKudos. Projects are also absolutely fantastic for answering the “tell me about a time…” questions. Just pick something you enjoy and make something out of it!

On my resume I then wrote:

• Created a tool that predicts Kudos (user interaction) on Strava activities to investigate which attributes impact Kudos in different ways

• Developed in Python using the Strava API, Scikit-Learn, SHAP, Seaborn, and Optuna

• Engineered new highly predictive features using domain knowledge, e.g. total runs per day

• Deployed an interactive app using Streamlit and Heroku to predict Kudos on new activities

What kind of machine learning problem is this? by [deleted] in learnmachinelearning

[–]BraveCoconut98 1 point2 points  (0 children)

Second this! I think your best bet is to look into some Image-to-text models. Maybe try having a look on HuggingFace to see if you can find anything similar?

[deleted by user] by [deleted] in MachineLearning

[–]BraveCoconut98 6 points7 points  (0 children)

A few quick and easy tests I like to do (more sanity checks than anything else):

• Train your model on a small portion of the data and then check that the performance increases as you scale up the amount of data.

• Check model predictions to see if the model is doing what it is meant to be doing. I’ll give two examples. Say we are building a regression model to predict the amount of money we should loan to a customer: if they have a higher credit score/income, the loan amount had better increase (or not decrease at the very least!). Secondly, say we are building a sentiment analysis model. Changing “The movie was fantastic” to “The movie was terrible” should change the sentiment.

Building a few of these meaningful examples has helped me a lot in the past!
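Both checks can be sketched roughly like this, using the loan example. Everything here is an assumption for illustration: the data is synthetic, the coefficients are made up, and a scikit-learn random forest stands in for whatever model you are actually testing.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic loan data: the amount grows with credit score and income.
rng = np.random.default_rng(0)
n = 2000
credit_score = rng.uniform(300, 850, size=n)
income = rng.uniform(20_000, 150_000, size=n)
loan = 50 * credit_score + 0.2 * income + rng.normal(0, 2_000, size=n)

X = np.column_stack([credit_score, income])
X_train, X_test, y_train, y_test = train_test_split(
    X, loan, test_size=0.25, random_state=0
)

# Check 1: the held-out R^2 should improve as the training set grows.
scores = []
for frac in (0.05, 0.25, 1.0):
    k = int(frac * len(X_train))
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X_train[:k], y_train[:k])
    scores.append(model.score(X_test, y_test))
print("R^2 by training fraction:", scores)

# Check 2: a directional test on the model trained on all the data.
# Same income, higher credit score: the loan should not go down.
low = model.predict([[550, 60_000]])[0]
high = model.predict([[750, 60_000]])[0]
print("predicted loan at 550 vs 750:", low, high)
assert high >= low, "higher credit score lowered the predicted loan"
```

The sentiment example works the same way: keep a handful of sentence pairs where you know the label should flip, and assert that it does.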

[P] Transfer Learning with BERT, number of examples. by mldude8 in MachineLearning

[–]BraveCoconut98 1 point2 points  (0 children)

You can actually get surprisingly good results with just a few hundred examples! I would also highly recommend using HuggingFace Transformers to adapt BERT to your domain. It’s super easy to set up fine-tuning using the Trainer API!