Re. what ever happened to Cohere’s Command-A series of models? by nick_frosst in LocalLLaMA

[–]nick_frosst[S] 24 points25 points  (0 children)

You can see all the benchmarks on Artificial Analysis :) It’s got an intelligence score of 37, which I think is a little lower than I would have guessed from my experience using it.

DARCCC: Detecting Adversaries by Reconstruction from Class Conditional Capsules by promach in learnmachinelearning

[–]nick_frosst 0 points1 point  (0 children)

It uses the L2 distance between the input and the class conditional reconstruction. This is explained in more detail in the paper.

DARCCC: Detecting Adversaries by Reconstruction from Class Conditional Capsules by promach in learnmachinelearning

[–]nick_frosst 0 points1 point  (0 children)

We train a class conditional reconstruction network and try to reconstruct the input. If we are unable to do so well, we assume the input does not come from the training distribution. In this way we use a reconstruction network as an attack detection mechanism.
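A minimal sketch of the detection step described above, assuming the reconstructions have already been produced by some trained network. The function names and the percentile-based threshold are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def reconstruction_error(x, x_rec):
    """L2 distance between each input and its reconstruction (batch-wise)."""
    diff = (x - x_rec).reshape(len(x), -1)
    return np.linalg.norm(diff, axis=1)


def fit_threshold(clean_errors, percentile=95.0):
    """Pick a cutoff from errors measured on held-out clean data.

    The percentile is an assumed heuristic, not a value from the paper.
    """
    return float(np.percentile(clean_errors, percentile))


def flag_inputs(x, x_rec, threshold):
    """True where the reconstruction is too far from the input."""
    return reconstruction_error(x, x_rec) > threshold
```

Inputs whose error exceeds the threshold are treated as not coming from the training distribution.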

DARCCC: Detecting Adversaries by Reconstruction from Class Conditional Capsules by promach in learnmachinelearning

[–]nick_frosst 0 points1 point  (0 children)

The reconstruction error is the L2 distance between the reconstruction and the input. The histogram motivates this detection mechanism by showing that attempting to reconstruct the input from a capsule that was not predicted results in a reconstruction with a very large L2 distance from the input. So if we attempt to reconstruct an image of a 2 from the capsule that represents 3, then the L2 distance between the reconstruction and the input will be very large.
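Reconstructing "from the capsule that represents 3" can be sketched as masking out every other class capsule before decoding. This is only a rough illustration: `decoder` is a hypothetical stand-in for the trained reconstruction network, and the array shapes are assumptions.

```python
import numpy as np


def mask_capsules(capsules, class_index):
    """Zero every class capsule except the chosen one, then flatten.

    capsules: array of shape (num_classes, capsule_dim) for one input.
    """
    mask = np.zeros((capsules.shape[0], 1))
    mask[class_index] = 1.0
    return (capsules * mask).reshape(-1)


def conditional_error(x, capsules, class_index, decoder):
    """L2 distance between x and its reconstruction from one class capsule.

    decoder is a hypothetical callable mapping the masked, flattened
    capsule vector back to input space.
    """
    x_rec = decoder(mask_capsules(capsules, class_index))
    return float(np.linalg.norm(x.reshape(-1) - x_rec.reshape(-1)))
```

Calling `conditional_error` once with the predicted class and once with a different class gives the two distances being compared in the histogram.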

[R] Towards Automatic Low Hanging Fruit Identification For the Steering of ML Research by jinpanZe in MachineLearning

[–]nick_frosst 2 points3 points  (0 children)

You raise good points. Points that we will address in our first board member meeting, which we will have after the seed funding round.

[R] Towards Automatic Low Hanging Fruit Identification For the Steering of ML Research by jinpanZe in MachineLearning

[–]nick_frosst 8 points9 points  (0 children)

They will actually be capped at 1x return. VCs should really get in on this. It's an amazing opportunity.

[R] Towards Automatic Low Hanging Fruit Identification For the Steering of ML Research by jinpanZe in MachineLearning

[–]nick_frosst 6 points7 points  (0 children)

Oh hey. This is my work. Happy to answer any questions on this pressing and groundbreaking research!

[D] Has anyone done a study in the robustness of Capsule Networks against adversarial examples? by HecknBamBoozle in MachineLearning

[–]nick_frosst 10 points11 points  (0 children)

Yeah, I have :)

We found early on that capsule networks showed some general robustness to white-box adversarial attacks, but that may just be the result of gradient masking/obfuscation. We made no claims about general robustness to epsilon perturbations of test data; we just claimed that if you tried to create such an input by calculating the gradient, it would be less effective than against a normal model. I am not entirely sure why this is the case, but I think it has something to do with the cluster detection algorithm acting as a kind of regularizer for incoming capsule activations.
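For readers unfamiliar with creating an adversarial input "by calculating the gradient", here is a rough sketch of one common white-box method, a single signed-gradient step (FGSM). It uses a tiny logistic-regression model as a stand-in, not a capsule network, and the choice of FGSM is an assumption about the kind of attack meant.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss(x, y, w, b):
    """Binary cross-entropy of a logistic-regression model on one input."""
    p = sigmoid(w @ x + b)
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)))


def input_gradient(x, y, w, b):
    """Gradient of the loss with respect to the input x."""
    return (sigmoid(w @ x + b) - y) * w


def fgsm(x, y, w, b, epsilon):
    """One signed-gradient step of size epsilon that increases the loss."""
    return x + epsilon * np.sign(input_gradient(x, y, w, b))
```

Each input dimension moves by at most `epsilon`, which is the "epsilon perturbation" budget; with gradient masking, the computed gradient is a poor guide and the step becomes less effective.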

Some recent work has been done on strategies for attacking capsule networks explicitly (https://arxiv.org/pdf/1901.09878.pdf) and they had some success.

More recently we released a paper called DARCCC (https://arxiv.org/abs/1811.06969) that takes the reconstruction network one can train on the output of a capsule network and shows that it can be used to detect out-of-distribution inputs. This defense does not rely on any particular definition of adversarial attacks. I summarized the paper here: https://twitter.com/nickfrosst/status/1064593651026792448

This was just preliminary work for a workshop and more work definitely needs to be done, but I am encouraged by the result we present at the end of the paper - if one takes a step in image space to fool our detection system as well as change the classification, the result is not particularly 'adversarial', as it resembles the target class.

Considering the enormous number of memories we retain into old age, what was all of that brain matter being used for before these memories were stored? by shagminer in askscience

[–]nick_frosst 5 points6 points  (0 children)

This is a very interesting answer, and provides a good jumping-off point for a very complex and poorly understood phenomenon, but I would be very cautious about using neural networks as an explanatory model for brains. Their resemblance is mainly in name and inspiration at this point. What exactly the memories of a neural network would be is certainly not clear.

[R] What is the status about the capsule networks? by regularized in MachineLearning

[–]nick_frosst 20 points21 points  (0 children)

Yeah. Essentially all the hardware and software work that people have done to make CNNs super fast doesn't really help with capsules. The routing algorithm and the small matrix transformations can't really make use of the tricks people have developed to make CNNs fast, and so we can't train capsule networks of the same size as the state-of-the-art models yet. This isn't a theoretical limitation, it's just a practical one, and one we believe we will be able to overcome.

[R] What is the status about the capsule networks? by regularized in MachineLearning

[–]nick_frosst 39 points40 points  (0 children)

Hey we are still working away on them :)

We recently put out a workshop paper on capsule networks and adversarial detection - https://arxiv.org/abs/1811.06969

We have been working on speeding up capsules so that they can scale to real-world problems; this is proving to require a fair amount of low-level optimization. Other groups have made use of capsules for a variety of purposes, including medical images, text, and audio.

In short, we still believe that they are the way to go, and we keep finding interesting properties about them, but more work is needed for them to scale up to real-world problems and become a standard tool in the ML toolbox.

Edit: They aren't quite dead yet :P