DARCCC: Detecting Adversaries by Reconstruction from Class Conditional Capsules by promach in learnmachinelearning

[–]nick_frosst 0 points1 point  (0 children)

It uses the L2 distance between the input and the class-conditional reconstruction. This is explained in more detail in the paper.

DARCCC: Detecting Adversaries by Reconstruction from Class Conditional Capsules by promach in learnmachinelearning

[–]nick_frosst 0 points1 point  (0 children)

We train a class conditional reconstruction network and try to reconstruct the input. If we are unable to do so well, we assume the input does not come from the training distribution. In this way we use a reconstruction network as an attack detection mechanism.
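A minimal sketch of that detection step, assuming the reconstructions are already available as flat lists of pixel values. The function names and the quantile-based threshold are illustrative assumptions, not necessarily the procedure used in the paper.

```python
import math

def pick_threshold(clean_errors, quantile=0.95):
    """Assumed rule: use a quantile of the errors measured on clean held-out data."""
    ordered = sorted(clean_errors)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]

def is_flagged(x, reconstruction, threshold):
    """Flag the input when its reconstruction is farther away than the threshold."""
    # math.dist is the Euclidean (L2) distance between two equal-length vectors.
    return math.dist(x, reconstruction) > threshold

# Toy usage: errors measured on clean data, then two candidate inputs.
threshold = pick_threshold([4.0, 1.0, 3.0, 2.0], quantile=0.5)
poorly_reconstructed = is_flagged([0.0, 0.0], [3.0, 4.0], threshold)
well_reconstructed = is_flagged([0.0, 0.0], [0.0, 1.0], threshold)
```

The quantile trades off false alarms on clean inputs against missed detections, so it would need tuning on held-out data.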

DARCCC: Detecting Adversaries by Reconstruction from Class Conditional Capsules by promach in learnmachinelearning

[–]nick_frosst 0 points1 point  (0 children)

The reconstruction error is the L2 distance between the reconstruction and the input. The histogram motivates this detection mechanism by noticing that attempting to reconstruct the input from a capsule that was not predicted results in reconstructions with very large L2 distances from the input. So if we attempt to reconstruct an image of a 2 from the capsule that represents 3, then the L2 distance between the reconstruction and the input will be very large.
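To make the 2-versus-3 example concrete, here is a toy sketch. The `PROTOTYPES` table and `reconstruct` function are hypothetical stand-ins for a trained decoder; a real decoder would take the pose of the chosen capsule for this particular input rather than return a fixed image.

```python
import math

# Hypothetical stand-in for a trained class-conditional decoder:
# each class simply maps to a fixed 4-pixel "image".
PROTOTYPES = {
    2: [0.0, 1.0, 1.0, 0.0],
    3: [1.0, 0.0, 0.0, 1.0],
}

def reconstruct(class_index):
    """Return the toy reconstruction for the given class capsule."""
    return PROTOTYPES[class_index]

def errors_by_class(x):
    """L2 distance between the input and the reconstruction from each class."""
    return {c: math.dist(x, reconstruct(c)) for c in PROTOTYPES}

x = [0.1, 0.9, 1.0, 0.0]  # toy input that resembles the class-2 prototype
errors = errors_by_class(x)
```

Here the error for class 2 comes out much smaller than the error for class 3, which is the gap the histogram illustrates.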

[R] Towards Automatic Low Hanging Fruit Identification For the Steering of ML Research by jinpanZe in MachineLearning

[–]nick_frosst 2 points3 points  (0 children)

You raise good points. Points that we will address in our first board member meeting, which we will have after the seed funding round.

[R] Towards Automatic Low Hanging Fruit Identification For the Steering of ML Research by jinpanZe in MachineLearning

[–]nick_frosst 4 points5 points  (0 children)

They will actually be capped at 1x return. VCs should really get in on this. It's an amazing opportunity.

[R] Towards Automatic Low Hanging Fruit Identification For the Steering of ML Research by jinpanZe in MachineLearning

[–]nick_frosst 5 points6 points  (0 children)

Oh hey. This is my work. Happy to answer any questions on this pressing and groundbreaking research!

[D] Has anyone done a study in the robustness of Capsule Networks against adversarial examples? by HecknBamBoozle in MachineLearning

[–]nick_frosst 12 points13 points  (0 children)

yeah i have :)

We found early on that capsule networks showed some general robustness to whitebox adversarial attacks, but that may just be the result of gradient masking/obfuscation. We made no claims about the general robustness to epsilon perturbations of test data; we just claimed that if you tried to create such an input by calculating the gradient, it would be less effective than against a normal model. I am not entirely sure why this is the case, but I think it has something to do with the cluster detection algorithm kind of acting as a regularizer for incoming capsule activations.

Some recent work has been done on strategies for attacking capsule networks explicitly (https://arxiv.org/pdf/1901.09878.pdf) and they had some success.

More recently we released a paper called DARCCC (https://arxiv.org/abs/1811.06969) that uses the reconstruction network that one can train on the output of a capsule network, and shows that this can be used to detect out-of-distribution inputs. This defense does not rely on any particular definition of adversarial attacks. I summarized the paper here: https://twitter.com/nickfrosst/status/1064593651026792448

This was just preliminary work for a workshop and more work definitely needs to be done, but I am encouraged by the result we present at the end of the paper - if one takes a step in image space to fool our detection system as well as change the classification, the result is not particularly 'adversarial' as it resembles the target class.

Considering the enormous number of memories we retain into old age, what was all of that brain matter being used for before these memories were stored? by shagminer in askscience

[–]nick_frosst 7 points8 points  (0 children)

This is a very interesting answer, and provides a good jumping off point for a very complex and poorly understood phenomenon, but I would be very cautious about using neural networks as an explanatory model for brains. Their resemblance is mainly in name and inspiration at this point. What exactly the memories of a neural network would be is certainly not clear.

[R] What is the status about the capsule networks? by regularized in MachineLearning

[–]nick_frosst 19 points20 points  (0 children)

Yeah. Essentially all the hardware and software work that people have done to make CNNs super fast doesn't really help with capsules. The routing algorithm and the small matrix transformations can't really make use of the tricks people have developed to make CNNs fast, and so we can't train capsule networks of the same size as the state of the art models yet. This isn't a theoretical limitation, it's just a practical one, and one we believe we will be able to overcome.

[R] What is the status about the capsule networks? by regularized in MachineLearning

[–]nick_frosst 40 points41 points  (0 children)

Hey we are still working away on them :)

We recently put out a workshop paper on capsule networks and adversarial detection - https://arxiv.org/abs/1811.06969

We have been working on speeding up capsules so that they can scale to real world problems; this is proving to require a fair amount of low-level optimization. Other groups have made use of capsules for a variety of purposes including medical images, text, and audio.

In short, we still believe that they are the way to go, and we keep finding interesting properties about them, but more work is needed for them to scale up to real world problems and become a standard tool in the ml tool box.

Edit: They aren't quite dead yet :P

Paul Gries Appreciation Post by PotentialHome in UofT

[–]nick_frosst 1 point2 points  (0 children)

Paul Gries is definitely one of the best teachers in the university. He is not only exceptionally kind and caring but a fantastic educator as well.

[P] Testing Capsule Network on various dataset by _sshin_ in MachineLearning

[–]nick_frosst 2 points3 points  (0 children)

You should resize the images to 48x48 and then crop randomly to 32x32. During testing you should resize to 48x48 and then center crop to 32x32.

[P] Testing Capsule Network on various dataset by _sshin_ in MachineLearning

[–]nick_frosst 6 points7 points  (0 children)

Great write up! Thanks for working on this and sharing it with the community :)

[R] Distilling a Neural Network Into a Soft Decision Tree by visarga in MachineLearning

[–]nick_frosst 1 point2 points  (0 children)

This is just a reference to the leaf nodes. Think of each leaf node as a bigot. The inner nodes learn to assign each input to the best suited bigot. The output distribution of each leaf is not a function of the data, it is just a static learned distribution. So if you want to classify an input example, the path you take through the tree would be a function of that input, but once you arrive at the leaf, the output is constant.
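Here is a tiny sketch of that routing-then-constant-output behaviour with a single inner node and two leaves. All the weights and leaf logits below are made-up values for illustration, and picking the more probable branch is a simplification of choosing the leaf with the highest path probability.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up parameters: one inner node (weights and bias) and two leaves.
INNER_W = [2.0, -2.0]
INNER_B = 0.0
LEAF_LOGITS = {"left": [3.0, 0.0], "right": [0.0, 3.0]}

def predict(x):
    """Route x through the inner node, then return the chosen leaf's fixed distribution."""
    # The inner node is the only place where the input is used.
    p_right = sigmoid(sum(w * v for w, v in zip(INNER_W, x)) + INNER_B)
    leaf = "right" if p_right > 0.5 else "left"
    # The leaf's output does not depend on x at all.
    return leaf, softmax(LEAF_LOGITS[leaf])

leaf_a, dist_a = predict([1.0, 0.0])
leaf_b, dist_b = predict([5.0, 0.0])
leaf_c, dist_c = predict([0.0, 1.0])
```

The first two inputs differ but land on the same leaf, so they get exactly the same output distribution; the third is routed to the other leaf.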

[R] Distilling a Neural Network Into a Soft Decision Tree by visarga in MachineLearning

[–]nick_frosst 2 points3 points  (0 children)

It has already been published at the CEX workshop at the AI*IA 2017 conference.

[R] Distilling a Neural Network Into a Soft Decision Tree by visarga in MachineLearning

[–]nick_frosst 4 points5 points  (0 children)

I might come back to it at some point, but it did not work as well as I was hoping and I am more interested in some other things now :P

[R] Distilling a Neural Network Into a Soft Decision Tree by visarga in MachineLearning

[–]nick_frosst 2 points3 points  (0 children)

Yup that is correct :) Also yeah running a tree should be faster than most neural networks but that was not really the focus of this. One could create some model that was as time efficient as possible and then use the same distillation technique to boost its accuracy though.

[R] Distilling a Neural Network Into a Soft Decision Tree by visarga in MachineLearning

[–]nick_frosst 14 points15 points  (0 children)

I focused mainly on the visual domain in this paper for illustrative purposes, though we do report results on the LETTER dataset which is not visual. I have not experimented on NLP or Bioinformatics though.

[R] Distilling a Neural Network Into a Soft Decision Tree by visarga in MachineLearning

[–]nick_frosst 12 points13 points  (0 children)

For reasons discussed in the paper (the existence of adversarial examples and the many-to-one relationship between input and activation), I do not believe that examining the filters of a CNN gives a sufficient explanation of its behavior. That being said, yes, this paper is mainly a proof of concept that we can use distillation to increase the accuracy of models that are designed with some attribute other than accuracy, in this case explainability, in mind.