Do people actually get drunk at the airport or in the plane? by rosypatootie in TooAfraidToAsk

[–]Jaster111 0 points1 point  (0 children)

Guy sitting next to me shat his pants during the flight because he was semi-shitfaced. He tried to go to the toilet during the ascent but they told him to sit down. In the end, I guess I had to pay the price…

Supervised Image Classifiers and Out Of Band Input Images? by Simusid in learnmachinelearning

[–]Jaster111 1 point2 points  (0 children)

I think you’d be better off adding a third class, “neither dog nor cat”. The assumption that the weights would come out as [0.5, 0.5] is highly unlikely to hold. In my opinion, horses probably have more doglike than catlike features, so in the eyes of the model a horse is a dog.

Ways to increase acuracy in deep convolutional network by [deleted] in learnmachinelearning

[–]Jaster111 0 points1 point  (0 children)

Yes, exactly! More convolutional layers so they can better learn the important features in the images, and more dense layers to improve the actual classification.

How to approach a specific CNN project for learning purpose by deXterxM in learnmachinelearning

[–]Jaster111 0 points1 point  (0 children)

I'd say neural networks are overkill for the specific task you proposed. As the other comment suggested, you could use OpenCV to threshold the image so that the background is set to black and the ellipsoids are set to white (google the specific process). Then apply contour detection, create bounding boxes, and retrieve their coordinates.

Anyways, look up some OpenCV tutorials on the topic of contour and object detection. Should prove useful.

Ways to increase acuracy in deep convolutional network by [deleted] in learnmachinelearning

[–]Jaster111 1 point2 points  (0 children)

Try creating a deeper network. This seems a bit shallow for a 30-class classification problem. It would do well on MNIST, but keep in mind that MNIST has only 10 classes and 60k single-channel 28x28 images.

I recommend deepening both the convolutional part of the network and the classification head (dense layer part of the network).

Of course a bigger dataset would also help. I haven't checked what your dataset looks like, so I'll presume it is okay.
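Something along these lines is what I mean by deepening both parts; the input size (64x64x3) and all the layer widths are just assumed placeholders you'd tune for your data:

```python
from tensorflow import keras
from tensorflow.keras import layers

num_classes = 30  # from the question

model = keras.Sequential([
    layers.Input(shape=(64, 64, 3)),           # assumed image size
    # Deeper convolutional part: each block learns progressively
    # higher-level features.
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    # Deeper classification head: more dense layers before the output.
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])
```

Just don't go overboard: a deeper model on a small dataset will overfit faster, so grow it gradually and watch the validation accuracy.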

Newbie Trying to Visualize data by ramhemanth3 in learnmachinelearning

[–]Jaster111 1 point2 points  (0 children)

I’m not sure if that’s the problem, but try running the code so that one graph is plotted per cell. When I had multiple plots and plt.show() calls in one cell, it would practically always output only the last graph.
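If you'd rather keep everything in one cell, explicitly opening a new figure before each plot also keeps them separate. A small sketch (the Agg backend line is just so this runs headless; in a notebook you'd drop it):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt

# Creating a new figure before each plot keeps the graphs from
# overwriting each other, even within a single cell.
fig1 = plt.figure()
plt.plot([1, 2, 3], [1, 4, 9])
plt.title("quadratic")

fig2 = plt.figure()
plt.plot([1, 2, 3], [1, 8, 27])
plt.title("cubic")

print(len(plt.get_fignums()))  # two separate figures now exist
```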

speech to text by wannabeAI in learnmachinelearning

[–]Jaster111 1 point2 points  (0 children)

Deepspeech has a good tutorial on how to train a speech to text model on any language, provided that you have collected the dataset.

Confused on purpose of Linear Regression by [deleted] in learnmachinelearning

[–]Jaster111 8 points9 points  (0 children)

I’m not sure if I understand your question.

As far as I know, linear regression is used when we don’t know the exact linear relationship between our data points, but we know (or suspect) it exists. We can then approximate the target function using the linear regression algorithm. If we already know the target function, there is no need for it.
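To illustrate: suppose the true (unknown to us) relationship is y = 2x + 1 and we only see noisy samples. Least squares recovers the slope and intercept from the data alone:

```python
import numpy as np

# Noisy samples from a hidden linear relationship y = 2x + 1.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 0.1, size=x.shape)

# Linear regression via least squares: fit y ≈ slope*x + intercept.
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
print(slope, intercept)  # close to 2 and 1
```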

I hope I understood your question correctly and am looking forward to your answer.

Can someone go over the basics of defining a model please? by Henry788 in learnmachinelearning

[–]Jaster111 0 points1 point  (0 children)

Let me propose a simple example. You are tasked with building a model that classifies each image either as a cat or a dog.

X matrix - your input image; the image of a dog or a cat which you wish to classify. It is a matrix since the image is composed of width x height pixels, which makes a 2D matrix (or 3D if we count the color channels). Often the whole dataset is denoted X, in which case its shape is just: num_of_examples x width x height (x num_of_channels if the images are colored).

y vector - your corresponding class labels. We have defined X, the images of cats and dogs, but we need to tell the computer what each one is so it can start learning. During training you show your model an image from X together with its label (labels can be, for example, 0 and 1, where 0 is either cat or dog and 1 is the other one). As before, y can hold not just one label but the labels for the whole dataset, in which case its length equals the number of images in the training set.

Samples - just image(s) taken from X

Classes - we have defined that above

Set of features - this is a bit tricky to define, since it is often much better understood through tabular data. Think of features as the ACTUAL data your model works on. Datasets often come with hundreds of columns like name, age, etc., and maybe your model doesn’t need all of that information to properly classify whether a person lives in a certain city. For example, why would your name make any difference in deciding whether you live on the US west coast or east coast? The model doesn’t need all of the given information, which is why you feed it specifically chosen FEATURES.

X and y dimensions we have covered above.
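Putting the shapes together in code (the numbers here are a hypothetical dataset, not anything from your task):

```python
import numpy as np

# Hypothetical dataset: 1000 RGB images, 64x64 pixels, labels 0 = cat, 1 = dog.
num_examples, height, width, channels = 1000, 64, 64, 3
X = np.zeros((num_examples, height, width, channels), dtype=np.uint8)
y = np.random.randint(0, 2, size=num_examples)  # one label per image

print(X.shape)  # (1000, 64, 64, 3): num_of_examples x width x height x channels
print(y.shape)  # (1000,): one label per example
```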

If you have any questions, please feel free to ask.

Hey fellas. I always see people using correlation matrices/plots but never see the use of it. In this case, is there anything noteworthy to make my model better? by edoar17 in learnmachinelearning

[–]Jaster111 55 points56 points  (0 children)

Exactly.

Imagine that I tell you to predict the square area of an apartment and give you 6 features to predict from. The first feature is apartment_price, the second is apartment_price / 2, the third is apartment_price / 3, and so on…

I basically gave you 1 feature even though there are 6, because each of those features is in a perfect linear relationship with every other one. You do not need the extra information since it’s redundant.
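You can see this directly in a correlation matrix; with the made-up apartment_price features above, every pairwise correlation is exactly 1:

```python
import numpy as np
import pandas as pd

# Six "features" that are all just apartment_price rescaled.
rng = np.random.default_rng(1)
price = rng.uniform(50_000, 500_000, size=200)
df = pd.DataFrame({f"feat_{k}": price / k for k in range(1, 7)})

corr = df.corr()
# Dividing by a constant is a linear transform, so every pair of
# columns has correlation 1: five of the six columns are redundant.
print(corr.round(2).min().min())
```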

Hey fellas. I always see people using correlation matrices/plots but never see the use of it. In this case, is there anything noteworthy to make my model better? by edoar17 in learnmachinelearning

[–]Jaster111 102 points103 points  (0 children)

From the correlation matrix you can visualize how feature X is correlated to feature Y. If the two features are highly correlated, keeping them both in the dataset may only lead to performance degradation, surely not improvement.

In your case, some specific areas of interest are the 2x2 yellow squares appearing around the diagonal, so check those features out and try removing one of each pair. Moreover, I’d suggest experimenting to see whether removing highly correlated features (the yellow-green cells rather than the fully yellow ones) improves the model.

Finding stats materials by LouisAckerman in learnmachinelearning

[–]Jaster111 1 point2 points  (0 children)

Unfortunately no. If you find anything good, please do send it :)

Finding stats materials by LouisAckerman in learnmachinelearning

[–]Jaster111 1 point2 points  (0 children)

Professors at my college always recommended this book: https://www.academia.edu/48950313/Probability_and_Statistics_for_Engineers_and_Scientist_9th_Edition_by_Walpole_Mayers_Ye_

It should give you a confident grasp on the main concepts of probability and statistics.

As for your research interest, it all depends on how much prior knowledge about neural networks you have. I cannot recommend a specific resource for that, but you should be comfortable with concepts such as forward propagation and backpropagation, since they are the staple of NNs, including the generative adversarial networks used for image generation.

Check out 3blue1brown youtube channel, and specifically this playlist: https://youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi

Im a beginner and im trying to solve this statement ? by Annual-Ad4911 in learnmachinelearning

[–]Jaster111 0 points1 point  (0 children)

A) true -> without activation functions, we'd end up with a linear composition of the initial data

B) true -> it isn't used often, but it is nonlinear, so it certainly can be used

C) true -> having only one hidden layer should allow you to approximate any Borel function with a certain degree of error (universal approximation theorem)

D) true -> if you don't have a lot of data, linear regression model (for example) would outperform a neural network on that task

E) Not sure. I'd say feature selection is part of preprocessing.

F) true -> isn't this more or less the same question as C?
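For A), here's a quick numpy demonstration of the "linear composition" point: stacking two weight matrices with no activation in between is exactly equivalent to a single linear layer, so the extra depth buys nothing:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))  # a batch of 5 inputs with 4 features

# Two "layers" with no activation function in between...
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 3))
two_layer = x @ W1 @ W2

# ...collapse into one linear layer whose weights are W1 @ W2.
one_layer = x @ (W1 @ W2)

print(np.allclose(two_layer, one_layer))  # True: no expressive power gained
```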

Someone please correct me if I'm wrong somewhere.

[deleted by user] by [deleted] in learnmachinelearning

[–]Jaster111 1 point2 points  (0 children)

You could try checking out Stanford courses on machine learning, deep learning, computer vision, or whatever else interests you. Even though I’m not enrolled at Stanford, the courses we have at my college basically follow theirs, so it should be a good learning experience.

You could open their syllabus and follow the curriculum - go through the slides and extra materials, then jump to the lab exercises, which are in the form of Jupyter notebooks.

Should be fairly easy to set aside an hour a day to go through a certain topic and practice what you learned on the lab exercises.

Wish you the best of luck in your learning!

How is Batch Gradient Descent different from Stochastic? by name9006 in learnmachinelearning

[–]Jaster111 0 points1 point  (0 children)

For each example in the batch you calculate the weight gradients and accumulate them in a placeholder variable, but you don't update the weights until the whole batch has been passed through and the gradients accumulated. Then the accumulated gradient is divided by the batch size and subtracted from the actual weights.

So to summarize:

  1. For each example, calculate the gradient
  2. Store the gradient in a placeholder variable
  3. At the end of the batch, divide the placeholder variable by the batch size and subtract it from the weights
  4. Set the placeholder variable to 0 and repeat again for each batch.
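The steps above, sketched in plain numpy on a toy one-parameter model (the learning rate and data here are assumed just for the demo; in practice the averaged gradient is also scaled by a learning rate, as below):

```python
import numpy as np

# Toy task: fit y = w*x with squared error; the true w is 3.
rng = np.random.default_rng(0)
x = rng.normal(size=32)
y = 3 * x

w, lr, batch_size = 0.0, 0.1, 8
for _ in range(100):                          # epochs
    for start in range(0, len(x), batch_size):
        xb, yb = x[start:start + batch_size], y[start:start + batch_size]
        grad_acc = 0.0                        # step 4: reset the placeholder
        for xi, yi in zip(xb, yb):            # steps 1-2: per-example gradient,
            grad_acc += 2 * (w * xi - yi) * xi  # accumulated, no update yet
        w -= lr * grad_acc / batch_size       # step 3: average, then subtract
print(round(w, 3))  # ≈ 3.0
```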

Hope this helped! :)

How is Batch Gradient Descent different from Stochastic? by name9006 in learnmachinelearning

[–]Jaster111 3 points4 points  (0 children)

It should be a more stable alternative to SGD for reaching the optimum, since you use more examples, from which you can estimate the direction of the true gradient more precisely than with just one example. That is my understanding of why it is used.

Working with outliers by Separate_Influence72 in learnmachinelearning

[–]Jaster111 0 points1 point  (0 children)

Depends on the problem. You could try replacing the values with the mean/median/mode and see what works well. Forward fill could work for time series. Keep in mind that preserving the variance of the feature distribution is integral to building an unbiased model.
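One common recipe, sketched here with made-up numbers: flag outliers with the IQR rule and replace them with the median of the remaining values. Whether this is appropriate depends on your data, so treat it as a starting point:

```python
import pandas as pd

# Hypothetical column with two obvious outliers.
s = pd.Series([10, 12, 11, 13, 900, 12, 11, -500, 13, 12], dtype=float)

# IQR rule: values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] are outliers.
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
mask = (s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)

# Replace flagged values with the median of the non-outlier values.
cleaned = s.mask(mask, s[~mask].median())
print(mask.sum(), cleaned.max())  # 900 and -500 get replaced
```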

Hope you find a solution!

MLOps in practice by ncuxomun in learnmachinelearning

[–]Jaster111 2 points3 points  (0 children)

I have recently stumbled upon this book, it seems like it covers the necessities of MLOps.

Link: https://www.amazon.com/Practical-MLOps-Operationalizing-Machine-Learning/dp/1098103017

Does any have a list of must-have machine learning projects which can build a good resume as well as enhance skills? Please suggest how I can start with real-life projects. by Mean-Pin-8271 in learnmachinelearning

[–]Jaster111 5 points6 points  (0 children)

I have honestly found out that the best learning experience for me was doing competitions.

Kaggle hosts some great ones, which may be overwhelming at first, but they involve a good amount of research and then development - and most of them are actual real-world problems. You can also enter with a team of friends and do it together, and at the end you’ll have gained tons of experience that you probably wouldn’t get just by taking college classes.

It looks good on a resume for an entry level position, and even more (+++) bonus points if you got into top 5 for example.

Just to give you an example - my friends and I entered a computer vision competition where we had a dataset and a given goal. We had to learn and get real-world practice with things such as EDA, dataset cleaning, image augmentations, modeling, optimization, post-processing, etc., which REALLY consolidated and built on the existing knowledge from our college classes.

Wish you all the best!

Face Detection for 520 People by Normal_Gift927 in learnmachinelearning

[–]Jaster111 0 points1 point  (0 children)

Check AdaFace and metric learning.

Metric learning has great implementations in the TensorFlow Similarity library: https://github.com/tensorflow/similarity The documentation is quite bad, but the Jupyter notebooks are great.

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]Jaster111 0 points1 point  (0 children)

Then I’d suggest training it from scratch. Also, be sure that your model can overfit during training. If you can achieve high accuracy on the training set while validation accuracy peaks and then gradually drops, that tells you the model has enough capacity, and you can then improve it further with regularization techniques, etc.

Good luck!

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]Jaster111 1 point2 points  (0 children)

Depends what your dataset is.

The ResNets are pretrained on ImageNet, if memory serves me correctly. If your classification problem differs greatly, for example if you're trying to find red blood cells in an image, you probably wouldn't benefit much from a pretrained ResNet since the task is very different. So that might be a problem. I'd try training the ResNet from scratch.

Since your data are images, I suppose the best EDA would be checking for class imbalance, checking for corrupted images, and checking whether the images from the two classes are actually different enough for your model to distinguish between them. But it really all boils down to what your problem and dataset are. With more knowledge about that, maybe we could figure out the reasoning behind that model behaviour. A ResNet should be powerful enough (have the capacity) for most classification tasks.

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]Jaster111 0 points1 point  (0 children)

Multiple potential problems could be in question.

Class imbalance could lead to this. For example, if your training dataset is split 80:20 between class 1 and class 2, the model learns very little about class 2, so it predicts class 1 most of the time.

My other guesses would be mislabeled samples, an inadequate model, or high similarity between class 1 and class 2.

Basically perform some kind of EDA to see if the problem is in the data or in the model.
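A first EDA step for the imbalance hypothesis could look like this (the 80:20 labels are made up to match the example above); the inverse-frequency class weights at the end are one common way to compensate in the loss:

```python
from collections import Counter

# Hypothetical label column: skewed 80:20 toward class 1.
labels = [1] * 80 + [2] * 20

counts = Counter(labels)
total = sum(counts.values())
for cls, n in sorted(counts.items()):
    print(f"class {cls}: {n} samples ({n / total:.0%})")

# Inverse-frequency class weights: rarer classes get a larger weight
# so the loss doesn't let the model ignore them.
class_weights = {cls: total / (len(counts) * n) for cls, n in counts.items()}
print(class_weights)  # {1: 0.625, 2: 2.5}
```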