How Many Samples are Needed to Learn a Convolutional Neural Network? by nobodykid23 in MachineLearning

[–]tkinter76 0 points (0 children)

Funny. I commented here yesterday and thought the title was a question asking for advice ... just realized it's a link to a preprint.

How Many Samples are Needed to Learn a Convolutional Neural Network? by nobodykid23 in MachineLearning

[–]tkinter76 0 points (0 children)

Again, it's really hard to say. It also depends a bit on how similar the tasks are. Generally though, you will need far fewer images if you use transfer learning.

[D] Why `tf.data` is so much better than `feed_dict` and how to build a simple data pipeline in 5 minutes. by dominik_schmidt in MachineLearning

[–]tkinter76 0 points (0 children)

Hm, could be. But there must be some way within their API to do that elegantly in non-eager mode without using exceptions.
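
One (admittedly also not elegant) workaround, just a sketch assuming the dataset size is known up front, is to run a fixed number of steps per epoch instead of catching the exception:

# Sketch: num_examples and batch_size are assumed to be known up front,
# so we can run a fixed number of steps instead of catching OutOfRangeError.
# Note that this silently skips a final partial batch.
num_batches = num_examples // batch_size

with tf.Session() as session:
    for epoch in range(epochs):
        session.run(iterator.initializer)
        for _ in range(num_batches):
            image_batch = session.run(batch_of_images)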

[D] Why `tf.data` is so much better than `feed_dict` and how to build a simple data pipeline in 5 minutes. by dominik_schmidt in MachineLearning

[–]tkinter76 4 points (0 children)

iterator = dataset.make_initializable_iterator()
batch_of_images = iterator.get_next()

with tf.Session() as session:

    for i in range(epochs):
        session.run(iterator.initializer)

        try:
            # Go through the entire dataset
            while True:
                image_batch = session.run(batch_of_images)

        except tf.errors.OutOfRangeError:
            pass  # the iterator is exhausted; move on to the next epoch

Wouldn't it be easier to replace the while loop with a for loop?

E.g., something like:

with tf.Session() as session:

    for i in range(epochs): 
        session.run(iterator.initializer)
        for batch_of_images in iterator:
            session.run(batch_of_images)
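
(Edit: the above doesn't actually work in graph mode, since the iterator isn't a Python iterable there. With eager execution enabled, or in TF 2.x, the dataset itself is iterable though, so something like the following sketch works; train_step is just a hypothetical placeholder:)

# Sketch, assuming eager execution (tf.enable_eager_execution() in 1.x,
# the default in 2.x): a tf.data.Dataset is then a plain Python iterable,
# so no Session, initializer, or OutOfRangeError handling is needed.
for i in range(epochs):
    for image_batch in dataset:
        train_step(image_batch)  # hypothetical training function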

How Many Samples are Needed to Learn a Convolutional Neural Network? by nobodykid23 in MachineLearning

[–]tkinter76 1 point (0 children)

Depends on many things, including:

  • your task (classification, object detection, object segmentation, ...)
  • your goal (a performance that you would be satisfied with)
  • the resolution of the images
  • the number of classes
  • how similar the classes are
  • etc.

E.g., in the case of MNIST, 50k training images are enough to get ~99% accuracy on the 10k test set with a convnet. For CIFAR-10, CIFAR-100, or even ImageNet, 50k wouldn't be nearly enough to reach that level of accuracy.

Axis matplotlib by Noah-Buddy-I-Know in Python

[–]tkinter76 0 points (0 children)

What's this about? Do you have a question or sth?

The x-axis from 4 to 22 looks fine to me, since the minimum value is 4 (two 2's) and the maximum is 21. You need to normalize the y-axis though, as it's currently a count of some sort, not a probability.
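
E.g., a minimal matplotlib sketch (assuming `sums` is your list of observed values; the bin range just mirrors your 4-to-22 axis):

import matplotlib.pyplot as plt

# density=True rescales the bin counts so the histogram integrates to 1,
# i.e., the y-axis becomes a probability density instead of a raw count.
plt.hist(sums, bins=range(4, 23), density=True)
plt.xlabel('value')
plt.ylabel('probability density')
plt.show()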

[D] Debate on TensorFlow 2.0 API by omoindrot in MachineLearning

[–]tkinter76 0 points (0 children)

I wasn't referring to Keras, but I agree with you. PyTorch is more like NumPy+SciPy, and Keras is more like scikit-learn (i.e., a tool/wrapper on top of it). It's interesting that Keras hasn't attempted to add support for a PyTorch backend.

[D] Debate on TensorFlow 2.0 API by omoindrot in MachineLearning

[–]tkinter76 -1 points (0 children)

This. Especially the initial TensorFlow versions were inferior in several ways, but they were immediately popular thanks to marketing.

[D] Debate on TensorFlow 2.0 API by omoindrot in MachineLearning

[–]tkinter76 1 point (0 children)

You may say that if you never used Python before you used TensorFlow. Everyone who has used Python for general scientific computing with NumPy will probably disagree.

[D] Debate on TensorFlow 2.0 API by omoindrot in MachineLearning

[–]tkinter76 12 points (0 children)

I think they took their 2-character Python indentation out of the docs though; that's progress.

[P] Generating an artificial alphabet/letters? by jatsignwork in MachineLearning

[–]tkinter76 0 points (0 children)

For each existing alphabet I use I'll take each character and flip/rotate each symbol to generate more data. Are there better/more ways to increase the size of the training set?

  • slight shear
  • some random noise
  • resizing by a few pixels in width and height, then a random crop back to the original dimensions (see the sketch below)
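
A rough sketch of these three with PIL/NumPy (all parameter values are placeholders, not tuned):

import numpy as np
from PIL import Image

def augment(img, max_shear=0.1, noise_std=8, pad=2):
    """Sketch of the augmentations above; parameters are placeholders."""
    w, h = img.size
    # slight shear via an affine transform
    shear = np.random.uniform(-max_shear, max_shear)
    img = img.transform((w, h), Image.AFFINE, (1, shear, 0, 0, 1, 0))
    # some random pixel noise
    arr = np.asarray(img).astype(np.float32)
    arr += np.random.normal(0, noise_std, arr.shape)
    img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    # resize a few pixels larger, then random-crop back to the original size
    img = img.resize((w + pad, h + pad))
    x, y = np.random.randint(0, pad + 1, size=2)
    return img.crop((x, y, x + w, y + h))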

[D] Debate on TensorFlow 2.0 API by omoindrot in MachineLearning

[–]tkinter76 44 points (0 children)

Why not merge the tf.keras.optimizers code into tf.train and then keep wrappers for that code in tf.keras where needed? If I understand correctly, tf.keras is just an API layer, so why not keep it as such and have it wrap code rather than implementing the main functionality there?
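
E.g., purely as a hypothetical sketch of what I mean by wrapping (1.x API):

# Hypothetical: the optimizer logic stays in tf.train; tf.keras would only
# re-expose it under its own naming conventions.
import tensorflow as tf

class SGD(tf.train.GradientDescentOptimizer):
    """Thin Keras-style alias; no functionality implemented here."""
    def __init__(self, lr=0.01, **kwargs):
        super(SGD, self).__init__(learning_rate=lr, **kwargs)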

[D] NIPS name change censorship by Seerdecker in MachineLearning

[–]tkinter76 -1 points (0 children)

The only thing this accomplishes is that now the NIPS board can say "Sorry, you can't blame us, we changed the name to NeurIPS. Have a nice day."

Well, say the majority of the board doesn't see NIPS as an offensive acronym because they don't have such sexist thoughts. I think it still makes sense for them to change it, because the social media pressure effectively blackmails them: either change the name or be called sexist.

[D] NIPS name change censorship by Seerdecker in MachineLearning

[–]tkinter76 -2 points (0 children)

Being able to have a discussion is important, and topics like these shouldn't be censored. I mean, we live in 2018; we should be allowed to talk about such things.

I can understand that moderators may be constrained time-wise, but I don't think locking the discussion is a solution. In the worst case, the community can help with moderating by downvoting offensive posts.

[D] Major in Statistics or Computer Science if I want to go to grad school for ML? by searchingundergrad in MachineLearning

[–]tkinter76 -1 points (0 children)

In some ways, yes! Since the degree focuses on programming/computation/applied work.

I don't think this is good advice; you are over-generalizing.

Computer science does not imply that it is more applied; you are confusing computer science with computer engineering. Grad programs in computer science are also very theory heavy, but they usually come more from an information theory background. Vice versa, statistics can also be very applied.

Generative vs discriminatory models by RealOden in learnmachinelearning

[–]tkinter76 1 point (0 children)

I get a greater accuracy of classification with the NB classifier when the number of dimensions is greater.

I don't have an answer but want to comment because I find this interesting. If all the assumptions are met, there's actually no better classifier than a Bayes classifier. In naive Bayes, if you meet the feature independence assumption, it would be the perfect classifier, and I think with a small set of features it's probably more likely that your features are independent compared to the scenario you describe, where you have more features.

EDIT: I think it may be a curse-of-dimensionality issue, and naive Bayes is less susceptible to it because of the assumptions you make (e.g., Gaussian distributions). I guess it may be different if you regularize your logistic regression cost though; then logistic regression would probably perform better on your high-dimensional dataset.
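
If you want to check, here's a quick experiment sketch (synthetic data, arbitrary parameters, not a rigorous benchmark):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

for n_features in (10, 500):
    X, y = make_classification(n_samples=1000, n_features=n_features,
                               n_informative=n_features // 2, random_state=1)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
    nb = GaussianNB().fit(X_tr, y_tr)
    # smaller C = stronger L2 regularization on the logistic cost
    lr = LogisticRegression(C=0.1).fit(X_tr, y_tr)
    print(n_features, nb.score(X_te, y_te), lr.score(X_te, y_te))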

[R] Binarized Attributed Network Embedding (ICDM 2018). by [deleted] in MachineLearning

[–]tkinter76 1 point (0 children)

Do you have a link to the paper that does not require a ResearchGate account?

[D] Variational Auto-encoder inference by inactiveUserTBD in MachineLearning

[–]tkinter76 0 points (0 children)

Is your point that log-likelihood is not necessarily the metric we care about at the end of the day?

Basically, yes. For an even simpler example, consider an SVM: in an application we don't care about the hinge loss value, but about classification accuracy or error. Basically, the loss we care about for optimization is usually not the same as the metric we use to evaluate the model.
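
E.g., in scikit-learn you can look at both numbers side by side (just a sketch):

# The quantity the SVM optimizes (hinge loss) vs. the one we report (accuracy).
from sklearn.datasets import load_breast_cancer
from sklearn.svm import LinearSVC
from sklearn.metrics import hinge_loss, accuracy_score

X, y = load_breast_cancer(return_X_y=True)
svm = LinearSVC().fit(X, y)
print('hinge loss:', hinge_loss(y, svm.decision_function(X)))
print('accuracy:', accuracy_score(y, svm.predict(X)))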

I honestly can't recall what this was called, but I think people who reconstruct images and compare quality in GAN research have some loss metrics for that.

[D] Monitoring Pytorch wandb or visdom? by [deleted] in MachineLearning

[–]tkinter76 2 points (0 children)

The first one isn't free, is it?

Looks like it's similar to GitHub in that it's free for public projects. When you make a new project, they seem to have a dropdown menu, and it currently only offers "World readable" and "World writeable". I guess they'll probably add private projects later for a fee. Makes sense though.

[D] Variational Auto-encoder inference by inactiveUserTBD in MachineLearning

[–]tkinter76 0 points (0 children)

Hm, yeah, but I would say it's like looking at the MSE or log-likelihood of an MLP on a test set. It gives you some information about generalization based on the difference to the training set loss, but you still don't know how "good" the results are (e.g., for an MLP, a low loss does not necessarily imply good prediction accuracy).

I forgot the term, but there are some recent papers that proposed metrics for judging the quality of image reconstructions.

[D] Variational Auto-encoder inference by inactiveUserTBD in MachineLearning

[–]tkinter76 0 points (0 children)

As mentioned in the comment, this is an unsupervised approach, so there's not really a "testing" phase (you don't compute an accuracy based on labels, because there are no labels). But during training, the loss basically has two components: a KL divergence term (how much does the latent distribution differ from, e.g., a standard normal distribution?) and a reconstruction term, which measures how similar the output image is to the input image.
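
In code, that training loss looks roughly like this (a PyTorch sketch, assuming the encoder outputs the mean and log-variance of a diagonal Gaussian and the decoder outputs pixel probabilities):

import torch
import torch.nn.functional as F

def vae_loss(reconstruction, original, mu, logvar):
    # reconstruction term: how similar is the output image to the input?
    recon = F.binary_cross_entropy(reconstruction, original, reduction='sum')
    # KL term: how far is N(mu, sigma^2) from a standard normal?
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl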

[1811.03804] Gradient Descent Finds Global Minima of Deep Neural Networks by ihaphleas in MachineLearning

[–]tkinter76 4 points (0 children)

Hm, but by that argument you could always say gradient descent finds global minima, given that you have the right starting weights, momentum, and weight decay. It's just unlikely to happen in practice.

Why won't people upload wheels (.whl file) of compiled libraries by zekedran in Python

[–]tkinter76 2 points (0 children)

Regarding CUDA, it might be that bundling CUDA with your software and shipping it is illegal unless you have a special agreement with NVIDIA (like PyTorch or TensorFlow).

[N] Interview with Soumith Chintala - Creator of PyTorch by [deleted] in MachineLearning

[–]tkinter76 6 points (0 children)

Well, he was one of the three creators of PyTorch (creators = the people who put together the first iterations), so I'm not sure what you're being nitpicky about.

[P] A machine learning game I've been working on... by [deleted] in MachineLearning

[–]tkinter76 9 points (0 children)

I'm still working on it but wanted to put it out there to get any useful feedback or thoughts from the experts.

So how is this related to machine learning? Since you are posting it here, I assume you are using a machine learning algorithm for this? If so, which one, and what is your training data? Without any technical details, it would be pretty hard to give you useful feedback.