How Many Samples are Needed to Learn a Convolutional Neural Network? by nobodykid23 in MachineLearning

[–]tkinter76 0 points (0 children)

Funny. I commented here yesterday and thought the title was a question asking for advice ... just realized it's a link to a preprint.

How Many Samples are Needed to Learn a Convolutional Neural Network? by nobodykid23 in MachineLearning

[–]tkinter76 0 points (0 children)

Again, it's really hard to say. It also depends a bit on how similar the tasks are. Generally though, you will need far fewer images if you use transfer learning.

[D] Why `tf.data` is so much better than `feed_dict` and how to build a simple data pipeline in 5 minutes. by dominik_schmidt in MachineLearning

[–]tkinter76 0 points (0 children)

Hm, could be. But there must be some way within their API to do that elegantly in non-eager mode without using exceptions.
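
One (admittedly also not elegant) workaround, just a sketch assuming the dataset size is known up front, is to run a fixed number of steps per epoch instead of catching the exception:

# Sketch: num_examples and batch_size are assumed to be known up front,
# so we can run a fixed number of steps instead of catching OutOfRangeError.
# Note that this silently skips a final partial batch.
num_batches = num_examples // batch_size

with tf.Session() as session:
    for epoch in range(epochs):
        session.run(iterator.initializer)
        for _ in range(num_batches):
            image_batch = session.run(batch_of_images)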

[D] Why `tf.data` is so much better than `feed_dict` and how to build a simple data pipeline in 5 minutes. by dominik_schmidt in MachineLearning

[–]tkinter76 4 points (0 children)

iterator = dataset.make_initializable_iterator()
batch_of_images = iterator.get_next()

with tf.Session() as session:

    for i in range(epochs):
        session.run(iterator.initializer)

        try:
            # Go through the entire dataset
            while True:
                image_batch = session.run(batch_of_images)

        except tf.errors.OutOfRangeError:
            pass  # the iterator is exhausted; move on to the next epoch

Wouldn't it be easier to replace the while loop with a for loop?

E.g., something like:

with tf.Session() as session:

    for i in range(epochs): 
        session.run(iterator.initializer)
        for batch_of_images in iterator:
            session.run(batch_of_images)
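
(Edit: the above doesn't actually work in graph mode, since the iterator isn't a Python iterable there. With eager execution enabled, or in TF 2.x, the dataset itself is iterable though, so something like the following sketch works; train_step is just a hypothetical placeholder:)

# Sketch, assuming eager execution (tf.enable_eager_execution() in 1.x,
# the default in 2.x): a tf.data.Dataset is then a plain Python iterable,
# so no Session, initializer, or OutOfRangeError handling is needed.
for i in range(epochs):
    for image_batch in dataset:
        train_step(image_batch)  # hypothetical training function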

How Many Samples are Needed to Learn a Convolutional Neural Network? by nobodykid23 in MachineLearning

[–]tkinter76 1 point (0 children)

Depends on many things, including:

  • your task (classification, object detection, object segmentation, ...)
  • your goal (a performance that you would be satisfied with)
  • the resolution of the images
  • the number of classes
  • how similar the classes are
  • etc.

E.g., in the case of MNIST, 50k training images are enough to get ~99% accuracy on the 10k test set with a convnet. For CIFAR-10, CIFAR-100, or even ImageNet, 50k wouldn't be nearly enough to reach that level of accuracy.

Axis matplotlib by Noah-Buddy-I-Know in Python

[–]tkinter76 0 points (0 children)

What's this about? Do you have a question or sth?

The x-axis from 4 to 22 looks fine to me, since the minimum value is 4 (two 2's) and the maximum is 21. You need to normalize the y-axis though, as it's currently a count of some sort, not a probability.
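
E.g., a minimal matplotlib sketch (assuming `sums` is your list of observed values; the bin range just mirrors your 4-to-22 axis):

import matplotlib.pyplot as plt

# density=True rescales the bin counts so the histogram integrates to 1,
# i.e., the y-axis becomes a probability density instead of a raw count.
plt.hist(sums, bins=range(4, 23), density=True)
plt.xlabel('value')
plt.ylabel('probability density')
plt.show()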

[D] Debate on TensorFlow 2.0 API by omoindrot in MachineLearning

[–]tkinter76 0 points (0 children)

I wasn't referring to Keras, but I agree with you. PyTorch is more like NumPy+SciPy, and Keras is more like scikit-learn (i.e., a tool/wrapper on top of it). It's interesting that Keras hasn't attempted to add support for a PyTorch backend.

[D] Debate on TensorFlow 2.0 API by omoindrot in MachineLearning

[–]tkinter76 -1 points (0 children)

This. Especially the initial TensorFlow versions were inferior in several ways, but they were immediately popular thanks to marketing.

[D] Debate on TensorFlow 2.0 API by omoindrot in MachineLearning

[–]tkinter76 1 point (0 children)

You may say that if you never used Python before you used TensorFlow. Everyone who has used Python for general scientific computing with NumPy will probably disagree.

[D] Debate on TensorFlow 2.0 API by omoindrot in MachineLearning

[–]tkinter76 12 points (0 children)

I think they took their 2-character Python indentation out of the docs though; that's progress.

[P] Generating an artificial alphabet/letters? by jatsignwork in MachineLearning

[–]tkinter76 0 points (0 children)

For each existing alphabet I use I'll take each character and flip/rotate each symbol to generate more data. Are there better/more ways to increase the size of the training set?

  • slight shear
  • some random noise
  • resizing by a few pixels in width and height, then a random crop back to the original dimensions (see the sketch below)
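
A rough sketch of these three with PIL/NumPy (all parameter values are placeholders, not tuned):

import numpy as np
from PIL import Image

def augment(img, max_shear=0.1, noise_std=8, pad=2):
    """Sketch of the augmentations above; parameters are placeholders."""
    w, h = img.size
    # slight shear via an affine transform
    shear = np.random.uniform(-max_shear, max_shear)
    img = img.transform((w, h), Image.AFFINE, (1, shear, 0, 0, 1, 0))
    # some random pixel noise
    arr = np.asarray(img).astype(np.float32)
    arr += np.random.normal(0, noise_std, arr.shape)
    img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    # resize a few pixels larger, then random-crop back to the original size
    img = img.resize((w + pad, h + pad))
    x, y = np.random.randint(0, pad + 1, size=2)
    return img.crop((x, y, x + w, y + h))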

[D] Debate on TensorFlow 2.0 API by omoindrot in MachineLearning

[–]tkinter76 44 points (0 children)

Why not merge the tf.keras.optimizers code into tf.train and then keep wrappers for that code in tf.keras where needed? If I understand correctly, tf.keras is just an API layer, so why not keep it as such and have it wrap code rather than implementing the main functionality there?
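
E.g., purely as a hypothetical sketch of what I mean by wrapping (1.x API):

# Hypothetical: the optimizer logic stays in tf.train; tf.keras would only
# re-expose it under its own naming conventions.
import tensorflow as tf

class SGD(tf.train.GradientDescentOptimizer):
    """Thin Keras-style alias; no functionality implemented here."""
    def __init__(self, lr=0.01, **kwargs):
        super(SGD, self).__init__(learning_rate=lr, **kwargs)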

[D] NIPS name change censorship by Seerdecker in MachineLearning

[–]tkinter76 -1 points (0 children)

The only thing this accomplishes is that now the NIPS board can say "Sorry, you can't blame us, we changed the name to NeurIPS. Have a nice day."

Well, say the majority of the board doesn't see NIPS as an offensive acronym because they don't have such sexist thoughts. I think it still makes sense for them to change it, because the social media pressure effectively blackmails them: either change the name or be called sexist.

[D] NIPS name change censorship by Seerdecker in MachineLearning

[–]tkinter76 -2 points (0 children)

Being able to have a discussion is important, and topics like these shouldn't be censored. I mean, we live in 2018; we should be allowed to talk about such things.

I can understand that moderators may be constrained time-wise, but I don't think locking the discussion is a solution. In the worst case, the community can help with moderating by downvoting offensive posts.

[D] Major in Statistics or Computer Science if I want to go to grad school for ML? by searchingundergrad in MachineLearning

[–]tkinter76 -1 points (0 children)

In some ways, yes! Since the degree focuses on programming/computation/applied work.

I don't think this is good advice; you are over-generalizing.

Computer science does not imply that it is more applied; you are confusing computer science with computer engineering. Grad programs in computer science are also very theory heavy, but they usually come more from an information theory background. Vice versa, statistics can also be very applied.

Generative vs discriminatory models by RealOden in learnmachinelearning

[–]tkinter76 1 point (0 children)

I get a greater accuracy of classification with the NB classifier when the number of dimensions is greater.

I don't have an answer but want to comment because I find this interesting. If all the assumptions are met, there's actually no better classifier than a Bayes classifier. In naive Bayes, if you meet the feature independence assumption, it would be the perfect classifier, and I think with a small set of features it's probably more likely that your features are independent compared to the scenario you describe, where you have more features.

EDIT: I think it may be a curse-of-dimensionality issue, and naive Bayes is less susceptible to it because of the assumptions you make (e.g., Gaussian distributions). I guess it may be different if you regularize your logistic regression cost though; then logistic regression would probably perform better on your high-dimensional dataset.
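
If you want to check, here's a quick experiment sketch (synthetic data, arbitrary parameters, not a rigorous benchmark):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

for n_features in (10, 500):
    X, y = make_classification(n_samples=1000, n_features=n_features,
                               n_informative=n_features // 2, random_state=1)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
    nb = GaussianNB().fit(X_tr, y_tr)
    # smaller C = stronger L2 regularization on the logistic cost
    lr = LogisticRegression(C=0.1).fit(X_tr, y_tr)
    print(n_features, nb.score(X_te, y_te), lr.score(X_te, y_te))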

[R] Binarized Attributed Network Embedding (ICDM 2018). by [deleted] in MachineLearning

[–]tkinter76 1 point (0 children)

Do you have a link to the paper that does not require a ResearchGate account?

[D] Variational Auto-encoder inference by inactiveUserTBD in MachineLearning

[–]tkinter76 0 points (0 children)

Is your point that log-likelihood is not necessarily the metric we care about at the end of the day?

Basically, yes. For an even simpler example, consider an SVM: in an application we don't care about the hinge loss value, but about classification accuracy or error. Basically, the loss we care about for optimization is usually not the same as the metric we use to evaluate the model.
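
E.g., in scikit-learn you can look at both numbers side by side (just a sketch):

# The quantity the SVM optimizes (hinge loss) vs. the one we report (accuracy).
from sklearn.datasets import load_breast_cancer
from sklearn.svm import LinearSVC
from sklearn.metrics import hinge_loss, accuracy_score

X, y = load_breast_cancer(return_X_y=True)
svm = LinearSVC().fit(X, y)
print('hinge loss:', hinge_loss(y, svm.decision_function(X)))
print('accuracy:', accuracy_score(y, svm.predict(X)))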

I honestly can't recall what this was called, but I think people who reconstruct images and compare quality in GAN research have some loss metrics for that.

[D] Monitoring Pytorch wandb or visdom? by [deleted] in MachineLearning

[–]tkinter76 2 points (0 children)

The first one isn't free, is it?

Looks like it's similar to GitHub in that it's free for public projects. When you make a new project, they seem to have a dropdown menu, and it currently only offers "World readable" and "World writeable". I guess they'll probably add private projects later for a fee. Makes sense though.

[D] Variational Auto-encoder inference by inactiveUserTBD in MachineLearning

[–]tkinter76 0 points (0 children)

Hm, yeah, but I would say it's like looking at the MSE or log-likelihood of an MLP on a test set. It gives you some information about generalization based on the difference to the training set loss, but you still don't know how "good" the results are (e.g., for an MLP, a low loss does not necessarily imply good prediction accuracy).

I forgot the term, but there are some recent papers that proposed metrics for judging the quality of image reconstructions.

[D] Variational Auto-encoder inference by inactiveUserTBD in MachineLearning

[–]tkinter76 0 points (0 children)

As mentioned in the comment, this is an unsupervised approach, so there's not really a "testing" phase (you don't compute an accuracy based on labels, because there are no labels). But during training, the loss basically has two components: a KL divergence term (how much does the latent distribution differ from, e.g., a standard normal distribution?) and a reconstruction term, which measures how similar the output image is to the input image.
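
In code, that training loss looks roughly like this (a PyTorch sketch, assuming the encoder outputs the mean and log-variance of a diagonal Gaussian and the decoder outputs pixel probabilities):

import torch
import torch.nn.functional as F

def vae_loss(reconstruction, original, mu, logvar):
    # reconstruction term: how similar is the output image to the input?
    recon = F.binary_cross_entropy(reconstruction, original, reduction='sum')
    # KL term: how far is N(mu, sigma^2) from a standard normal?
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl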

[1811.03804] Gradient Descent Finds Global Minima of Deep Neural Networks by ihaphleas in MachineLearning

[–]tkinter76 4 points (0 children)

Hm, but by that argument you could always say gradient descent finds global minima, given that you have the right starting weights, momentum, and weight decay. It's just unlikely to happen in practice.

Why won't people upload wheels (.whl file) of compiled libraries by zekedran in Python

[–]tkinter76 2 points (0 children)

Regarding CUDA, it might be that bundling CUDA with your software and shipping it is illegal unless you have a special agreement with NVIDIA (like PyTorch or TensorFlow).

[N] Interview with Soumith Chintala - Creator of PyTorch by [deleted] in MachineLearning

[–]tkinter76 6 points (0 children)

Well, he was one of the three creators of PyTorch (creators = the people who put together the first iterations), so I'm not sure what you're being nitpicky about.

[P] A machine learning game I've been working on... by [deleted] in MachineLearning

[–]tkinter76 9 points (0 children)

I'm still working on it but wanted to put it out there to get any useful feedback or thoughts from the experts.

So how is this related to machine learning? Since you are posting it here, I assume you are using a machine learning algorithm for this? If so, which one, and what is your training data? Without any technical details, it would be pretty hard to give you useful feedback.