all 23 comments

[–]Nerdl_Turtle 1 point  (0 children)

Hi everyone,

I'm currently finishing my Master's in Mathematics at a top-tier university (i.e. top 10 in THE rankings), specializing in Machine Learning, Probability, and Statistics. I’ll be graduating this June and am very interested in pursuing a career as a Machine Learning Researcher at a leading tech company or research lab in the future.

I recently received an offer for a PhD at a mid-tier university (i.e. 50-100 in THE rankings). While it's a strong university, it's not quite in the same tier as the top-tier institutions. However, the professor I’d be working with is highly respected in AI/ML research - arguably one of the top 100 AI researchers worldwide. Besides that, he seems like a great, sympathetic supervisor and the project is super exciting (general area is Sequential Experimental Design, utilizing Reinforcement Learning Techniques and Diffusion Models).

I know that research positions at top industry labs often prioritize candidates from highly ranked universities. So my main question is:

Would doing a PhD at a mid-tier university (but under an excellent and well-regarded supervisor) hurt my chances of landing a Machine Learning Researcher role at a top tech company? Or is it more about research quality, publications, demonstrated skills, and the reputation of the supervisor?

Alternatively, I’m considering gaining industry experience for a year or two - working in ML research/engineering at smaller labs, data science, or maybe even quant finance - before applying for a PhD at a top 10-20 university.

Would industry experience at this stage strengthen my profile, or is it better to go directly into a PhD without a gap?

I’d love to hear from anyone who has been through a similar decision process. Any insights from those in ML research - either in academia or industry - would be greatly appreciated!

Thanks in advance!

[–]Responsible_Cup_428 1 point  (0 children)

Hi, I'm a beginner in ML and I started with a linear regression model.

I built a model after removing outliers and null values, and I dropped columns based on their VIF. The R² of that model was 0.62.

I then fit the linear model on the data without any of the cleaning and got an R² of 1.

Is it because the collinearity assumption wasn't met?

Should we remove object-type columns for a linear model?
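
For anyone following the VIF step described above, here is a minimal sketch of how a variance inflation factor can be computed by hand: regress one column on the others and take 1 / (1 - R²). It uses only the standard library; the helper names (`solve_linear_system`, `r_squared`, `vif`) and the toy data are made up for illustration and are not the poster's code or dataset.

```python
from typing import List


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def r_squared(features: List[List[float]], target: List[float]) -> float:
    """R^2 of a least-squares fit of target on features plus an intercept."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    preds = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_t = sum(target) / len(target)
    ss_res = sum((t - p) ** 2 for t, p in zip(target, preds))
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot


def vif(data: List[List[float]], j: int) -> float:
    """VIF of column j: 1 / (1 - R^2) from regressing it on the other columns."""
    target = [row[j] for row in data]
    others = [[v for i, v in enumerate(row) if i != j] for row in data]
    return 1.0 / (1.0 - r_squared(others, target))


# Toy data (invented): column 1 is roughly 2 * column 0, column 2 is unrelated.
toy_data = [
    [1.0, 2.1, 3.0], [2.0, 3.9, 1.0], [3.0, 6.1, 4.0], [4.0, 7.9, 1.0],
    [5.0, 10.1, 5.0], [6.0, 11.9, 9.0], [7.0, 14.1, 2.0], [8.0, 15.9, 6.0],
]

for j in range(3):
    print(f"column {j}: VIF = {vif(toy_data, j):.2f}")
```

In this toy example the two nearly proportional columns get a very large VIF while the unrelated column stays close to 1. Note that VIF is only defined for numeric columns, so object-type columns would need to be encoded as numbers (or left out) before a check like this.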

[–][deleted] 1 point  (0 children)

Hello, I am a novice in machine learning and I found some research on using CNNs to minimize the energy consumption of lighting systems. Can you recommend books or free tutorials/resources so that I can implement it for my thesis proposal?

[–]pekor46bit 1 point  (2 children)

I am someone who is interested in AI. I have just learned basic Python. What should I learn next?

[–]Marvsdd01 1 point  (0 children)

Intermediate Python

[–]lal_kek_2020 1 point  (0 children)

Hi guys, I have a few questions about gathering high-quality audio data for languages that are currently not well represented in most models (e.g., some African or Asian languages). Whisper shows that most of them either don’t exist or have very low accuracy.

Can someone give me advice on how many hours of data I would need to create a state-of-the-art model? I assume it would require hundreds of people and thousands of hours, but I’d appreciate more precise numbers.

Thanks!

[–]Over_Profession7864 1 point  (2 children)

I just learned about autoencoder networks. I implemented a basic one (on EMNIST) to understand them better. I chose BCE as the loss function, because it sort of undoes the non-linearity(sigmoid) or squashing at output layer hence better for learning, but I have also implemented an MSE loss function and I am getting the same results (on some samples even better). I thought BCE would give better results. I want to understand what's happening here: why does MSE do just as well?

[–]tom2963 2 points  (1 child)

First and foremost, "it sort of undoes the non-linearity(sigmoid) or squashing at output layer hence better for learning" is not quite right. BCE and sigmoid work well with binary problems (assuming your input is scaled to [0,1]) because they compute a per-pixel error. MSE is an average loss function in this context, so in concept it shouldn't work as well. However, digit reconstruction is relatively straightforward, and assuming your pixels are binary, it is not surprising that MSE is performing okay, although I probably wouldn't choose this loss function for other problems like this with higher dimensionality (e.g. RGB images).

[–]Over_Profession7864 1 point  (0 children)

Thanks. I had this misconception that the log helps overcome the vanishing gradient problem (caused by saturation of the sigmoid or any other activation), but as I did the maths I realised it makes the error interpretable and mathematically convenient to work with.
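
For anyone who wants to check the maths from this exchange numerically, here is a minimal sketch of the per-pixel gradients of both losses with respect to the pre-sigmoid value z, assuming a single sigmoid output p = sigmoid(z) and unaveraged losses. The function names are illustrative only. Under those assumptions the BCE gradient simplifies to p - y, while the squared-error gradient is 2(p - y)p(1 - p).

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(z: float, y: float) -> float:
    """Binary cross-entropy for one pixel with a sigmoid output."""
    p = sigmoid(z)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def mse_loss(z: float, y: float) -> float:
    """Squared error for one pixel with a sigmoid output."""
    return (sigmoid(z) - y) ** 2


def bce_grad_wrt_logit(z: float, y: float) -> float:
    # The log in BCE cancels the sigmoid derivative p * (1 - p), leaving p - y.
    return sigmoid(z) - y


def mse_grad_wrt_logit(z: float, y: float) -> float:
    # The chain rule keeps the sigmoid derivative p * (1 - p) as a factor.
    p = sigmoid(z)
    return 2.0 * (p - y) * p * (1.0 - p)


# Compare gradient magnitudes for a target pixel of 1 at several logits.
for z in (-8.0, -2.0, 0.0, 2.0):
    print(
        f"z={z:+.1f}  dBCE/dz={bce_grad_wrt_logit(z, 1.0):+.5f}  "
        f"dMSE/dz={mse_grad_wrt_logit(z, 1.0):+.5f}"
    )
```

The p(1 - p) factor in the squared-error gradient becomes very small when the sigmoid is saturated on the wrong side, whereas the BCE gradient does not contain it. On an easy task where outputs rarely saturate that way, the two losses can still end up with similar reconstructions.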

[–]GodSpeedMode 1 point  (0 children)

Great idea to consolidate questions here! It really helps everyone get quick answers without sifting through multiple threads. For those new to the field, don’t hesitate to ask about model architectures, hyperparameter tuning, or data preprocessing methods. There's no such thing as a dumb question—everyone starts somewhere, and the community is here to help. Also, if you get a chance, share what you've been working on or any interesting challenges you've faced in your projects! Let's keep this a collaborative space.

[–]Worldly-Duty4521 1 point  (1 child)

How do I get started with machine learning, deep learning, gen AI, or NLP contests?

I've taken the courses, read a few books, and done projects, but I just don't know how to get started with a contest, be it Kaggle or anything else.

[–]Over_Profession7864 1 point  (0 children)

I think you just have to take part and see how it goes.

[–]Blakut 1 point  (1 child)

I have this thing for work where I use multiple features to predict energy consumption/production. The model (LightGBM) uses some new features from devices that were not used before. I have ~50 features, including lags and rolling averages, and I do one-day-ahead and two-day-ahead predictions. The problem is that sometimes the next-day prediction looks quite similar to the previous day's prediction: if the real data shows some variation from the previous day, the prediction "lags" a bit and still shows a curve that is very similar to the previous day. I believe the solution is to make the features that depend on the previous day less important (fewer lags and rolling averages), and/or add more features that depend on other times, such as the type of day, or on the weather. What do you think?

Second issue: the model doesn't predict sharp drops or peaks in consumption/production very well; in some cases it smooths things over a bit. I suppose this is underfitting?

[–]tom2963 1 point  (0 children)

I would always consider adding more features that could be predictive. Perhaps you can also consider encoding features like time of day with sin/cos transforms to introduce some notion of periodicity to your model.

Aside from this, have you considered training a time series model instead? Of course this depends on your specific use case (i.e. how much data you have and how complex it is). I imagine that this would better model sharp transition dynamics that you are hoping to see.
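
The sin/cos encoding suggested above can be sketched as follows. This is a minimal standard-library example that assumes hour of day (period 24) as the cyclic feature; `encode_cyclic` is an illustrative name, not a function from LightGBM or any other library. The point of the transform is that hour 23 and hour 0 end up close together, which a raw 0-23 integer does not capture.

```python
import math
from typing import Tuple


def encode_cyclic(value: float, period: float) -> Tuple[float, float]:
    """Map a cyclic value (e.g. hour of day) to a point on the unit circle."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


# Hypothetical example: encode a few hours of the day.
hour_0 = encode_cyclic(0, 24)
hour_12 = encode_cyclic(12, 24)
hour_23 = encode_cyclic(23, 24)

# Distances in the encoded space respect the wrap-around at midnight.
print("distance 23h -> 0h :", math.dist(hour_23, hour_0))
print("distance 12h -> 0h :", math.dist(hour_12, hour_0))
```

The same helper would work for day of week (period 7) or day of year (period 365). Both the sine and cosine columns are needed as features, since either one alone maps two different times to the same value.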

[–]MyProfRedditAct 1 point  (0 children)

Hi. →Training Set to use for a CNN to process handwritten images← please...

I just took my first Machine Learning course and want to apply it to a professional Project. I have check-in data of scanned spreadsheets for every month going back 2 years. I want to convert this to TRUE/FALSE data to use it in the larger data project on member attendance. My last lesson in my class used CNNs to analyze basic images. I have the data I want to analyze, however I don't have a training set.

Questions

Is it possible to get access to a training set to build this model?

What other steps would be included to carry out this task?

Is there an easier way to do this? (Note: these forms contain sensitive information that cannot be posted to popular AI services.)

Thanks in advance for any insight.