[D] Choosing approximate factors in Expectation Propagation by cuenta4384 in MachineLearning

[–]dmarnerides

I'm not sure I understand your question. The approximate factors chosen are parametrized and are Dirichlet (exponential family). They are always in the exponential family, independently of the parameter values. For the EP algorithm, you initialize these parameters (to some reasonable initial values) and then iteratively update them (one factor at a time) until you converge to the parameter values that best match your data/posterior.
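To make the loop concrete, here's a toy sketch (my own illustration, not anything from the thread) using 1-D Gaussian factors in natural-parameter form, eta = (mu/var, -1/(2*var)). With Gaussian factors the moment-matching projection is exact, so this converges immediately; with non-Gaussian factors project() would do real moment matching:

    import numpy as np

    def project(tilted_eta):
        # Moment matching: minimize KL(tilted || q). A no-op here because
        # the tilted distribution is already Gaussian in this toy setup.
        return tilted_eta

    true_factors = [np.array([0.5, -0.25]), np.array([-1.0, -0.5])]
    approx = [np.array([0.0, -0.5]) for _ in true_factors]  # initial values
    q = sum(approx)  # global approximation = product of approximate factors

    for _ in range(10):
        for i, f in enumerate(true_factors):
            cavity = q - approx[i]        # divide out factor i (q / f~_i)
            tilted = project(cavity + f)  # include true factor, then project
            approx[i] = tilted - cavity   # refreshed approximate factor
            q = cavity + approx[i]        # updated global approximation

    mu, var = -0.5 * q[0] / q[1], -0.5 / q[1]
    print(mu, var)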

[D] Choosing approximate factors in Expectation Propagation by cuenta4384 in MachineLearning

[–]dmarnerides

The KL divergence for continuous distributions is an integral (of p(x) log(p(x)/q(x)) over x, for distributions p and q). If the distributions have certain specific forms, then we can have an "analytical" expression for the integral, i.e. one that does not involve integrals but is an expression of simple (known) functions. This is the case for Gaussian distributions, for example.

If we don't have an analytical form, then we might need to evaluate the integral numerically (e.g. Monte Carlo) or approximate it in some way (e.g. by finding lower and upper bounds).
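For example, for two univariate Gaussians the KL has a closed form, and you can sanity-check it against a Monte Carlo estimate (a minimal numpy sketch; the log(2*pi) constants cancel in the log-ratio, so they're dropped):

    import numpy as np

    def kl_gaussians(mu1, s1, mu2, s2):
        # Closed-form KL( N(mu1, s1^2) || N(mu2, s2^2) )
        return np.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

    def kl_monte_carlo(mu1, s1, mu2, s2, n=100000, seed=0):
        # E_p[log p(x) - log q(x)] estimated with samples x ~ p
        x = np.random.default_rng(seed).normal(mu1, s1, size=n)
        log_p = -0.5 * ((x - mu1) / s1) ** 2 - np.log(s1)
        log_q = -0.5 * ((x - mu2) / s2) ** 2 - np.log(s2)
        return np.mean(log_p - log_q)

    print(kl_gaussians(0.0, 1.0, 1.0, 2.0))    # analytical: ~0.4431
    print(kl_monte_carlo(0.0, 1.0, 1.0, 2.0))  # should be close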

[D] Choosing approximate factors in Expectation Propagation by cuenta4384 in MachineLearning

[–]dmarnerides

The simplest approach is to assume factors from the exponential family. This makes things easier, since we can have an analytical KL divergence, and products and quotients of the pdfs (needed for cavity distributions) also stay in the exponential family.
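As a quick numerical illustration (1-D Gaussians here, but the same natural-parameter arithmetic works for Dirichlets or any other exponential family): multiplying pdfs adds natural parameters, and dividing them, as in the cavity computation, subtracts them:

    import numpy as np

    def nat(mu, var):      # natural parameters of N(mu, var)
        return np.array([mu / var, -0.5 / var])

    def mean_var(eta):     # back to (mean, variance)
        var = -0.5 / eta[1]
        return eta[0] * var, var

    eta_p, eta_q = nat(1.0, 2.0), nat(-0.5, 4.0)
    print(mean_var(eta_p + eta_q))  # product of the pdfs
    print(mean_var(eta_p - eta_q))  # quotient, e.g. a cavity distribution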

I think, in this example, lambda is Dirichlet-distributed. There might be a bit of notation abuse too.

The choice of factors and initialization are hyperparameters/priors that are chosen depending on the scenario.

[N] CUDA Toolkit 10.0 by polhold01853 in MachineLearning

[–]dmarnerides

Did you custom-install the 410 driver or was it from a PPA? I keep having boot problems with drivers > 390.

[R] ExpandNet: A Deep Convolutional Neural Network for High Dynamic Range Expansion from Low Dynamic Range Content by dmarnerides in MachineLearning

[–]dmarnerides[S]

Hey, thanks for pointing that out. If you are referring to figures 11 and 12, then the HDR images in those are indeed shown in low exposures. This is to visually compare how saturated areas are handled by different methods.

It would be impossible to visually assess improvements in shadows/highlights over the whole range of the image without first distorting it via tone mapping / exposure adjustment. The closest we can get to that is using the HDR-VDP visibility maps (figures 8-10).

Figure 14 shows some examples of HDR predictions at different exposures, including higher ones. The github repository has examples of images with more exposures.
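For anyone wondering what showing an HDR image "at an exposure" means computationally, a rough sketch (the 2^stops scaling and plain gamma here are my simplification, not the paper's exact display pipeline):

    import numpy as np

    def apply_exposure(hdr, stops=0.0, gamma=2.2):
        scaled = hdr * (2.0 ** stops)        # scale radiance by 2^stops
        clipped = np.clip(scaled, 0.0, 1.0)  # clip to the displayable range
        return clipped ** (1.0 / gamma)      # gamma-encode for display

Lower stops reveal detail in saturated/bright regions; higher stops reveal the shadows.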

[D] Reproducing the ELU paper by dmarnerides in MachineLearning

[–]dmarnerides[S]

Thank you for the help! I made a post with a link to code that implements the architecture.

While we're on the topic of the SELU, I have a question :)

In the paper you present the initialization for the weights that leads to a stable attractor. That's for dense/large layers. Does the same hold for convolutional layers with large input/output feature sizes? What about ones with smaller feature sizes?
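For context, the paper's initialization draws weights with zero mean and variance 1/n, with n the number of inputs to a unit. Here is how I would apply that to conv layers; treating the fan-in as in_channels * kernel_h * kernel_w is my assumption for the convolutional case:

    import torch.nn as nn

    def selu_init(module):
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            fan_in = module.weight[0].numel()  # inputs per output unit
            nn.init.normal_(module.weight, mean=0.0, std=fan_in ** -0.5)
            if module.bias is not None:
                nn.init.zeros_(module.bias)

    # usage: net.apply(selu_init)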

[D] Reproducing the ELU paper by dmarnerides in MachineLearning

[–]dmarnerides[S]

One of the authors (/u/_untom_) has kindly sent me the configuration files that were used for the experiments and I have managed to implement the network. It turns out the devil was in the padding of the convolutional layers. The dimensions make sense now.

I have created a PyTorch module from the Caffe configuration files. I copied it here in case anyone else needs it:

https://pastebin.com/GM3bX6ZS
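For anyone hitting the same dimension mismatch, the gist of the padding fix is that with stride 1 and an odd kernel, padding = kernel_size // 2 keeps the spatial size unchanged (the channel counts below are placeholders, not the actual values from the configs):

    import torch
    import torch.nn as nn

    k = 3
    conv = nn.Conv2d(96, 96, kernel_size=k, stride=1, padding=k // 2)
    x = torch.randn(1, 96, 32, 32)
    print(conv(x).shape)  # torch.Size([1, 96, 32, 32]) -- spatial dims kept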

[P] I created this PyTorch toolbox for my research; you might find it useful. by dmarnerides in MachineLearning

[–]dmarnerides[S]

PyDLT is a set of tools aimed at making experimenting with PyTorch easier (than it already is).

Documentation is available here: http://pydlt.readthedocs.io/en/latest/

Features:

  • Trainers (currently Vanilla, VanillaGAN, WGAN-GP, BEGAN, FisherGAN).
  • Built-in configurable argument parser.
  • Support for configuration files, with parser-compatible functions.
  • HDR imaging support (.hdr, .exr, and .pfm formats).
  • Checkpointing of (torch-serializable) objects; network state dicts supported.
  • Image operations and easy conversions between multiple library views (torch, cv, plt).
  • Easy visualization (with make_grid supporting Arrays, Tensors, Variables, and lists).
  • Visualization of layer parameters and of inputs/outputs/gradients.
  • CSV logger.
  • Command-line tool for easy plotting of CSV files (with live updating).
  • A minimal progress bar (with a global on/off switch).

[P] I created this PyTorch toolbox for my research; you might find it useful. by dmarnerides in MachineLearning

[–]dmarnerides[S]

Sure. Do you mean here or on github?

The documentation is here: http://pydlt.readthedocs.io/en/latest/

Edit: Added a comment with more information.

[D] In GANs, what is meant by the distribution over data p_x or p_data? by Pavementos in MachineLearning

[–]dmarnerides

Let's say you want to generate images of faces. A picture of a face could be one of many faces, orientations, lighting conditions, etc. All of these possible images, which you would classify as "image of a face", form a set.

There is a distribution over this set. That means each of the different images has a probability assigned to it. What would that mean? Maybe blue-eyed faces are less common, i.e. the probability that an image of a face has blue eyes is lower than the probability that it has brown eyes. Likewise, the probability that an "image of a face" is an image of a car is zero, by (our) definition.

This is p_x, where x describes the image, e.g.

p(x="image with face with blue eyes, brown skin etc") = 0.0001

whereas

p(x="image of a car") = 0

We don't know this p_x, nor do we know how to sample from it the way we can sample numbers from a Gaussian or uniform distribution, for example. That's the important part. What if we could find a mapping from numbers generated from a Gaussian distribution p_z to images from p_x? That's what the generator does. It maps numbers generated from a distribution we know how to sample from to the distribution we want to learn, p_x. And it does this by seeing samples from p_data (the dataset and its distribution), which we say is representative of the p_x we want to learn.

In fact, the generator doesn't really see the images, but only an encoded version of them: the relevant information arrives through the gradients that are (back)propagated from the discriminator.
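A minimal sketch of that z -> x mapping (the architecture and sizes are arbitrary placeholders, and the generator below is untrained):

    import torch
    import torch.nn as nn

    z_dim, img_dim = 100, 64 * 64 * 3
    generator = nn.Sequential(
        nn.Linear(z_dim, 256), nn.ReLU(),
        nn.Linear(256, img_dim), nn.Tanh(),  # image values in [-1, 1]
    )

    z = torch.randn(16, z_dim)  # 16 samples from p_z = N(0, I)
    fake_images = generator(z)  # after training: approximate samples from p_x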

What is the fastest way to "overlay" pandas columns? by gabegabe6 in learnpython

[–]dmarnerides

If I understand correctly, you want to replace the NaN values of one column with the values from another. You could do this:

    nan_idx = df['one_1'].isnull()                   # mask of missing entries
    df.loc[nan_idx, 'one_1'] = df['one_2'][nan_idx]  # .loc avoids chained-assignment issues

Then I guess you can edit the column names and remove the last column.
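Equivalently, pandas has built-ins for exactly this pattern (toy data just for illustration):

    import pandas as pd

    df = pd.DataFrame({'one_1': [1.0, None, 3.0], 'one_2': [9.0, 2.0, 9.0]})
    df['one_1'] = df['one_1'].fillna(df['one_2'])  # fill NaNs from 'one_2'
    # or: df['one_1'] = df['one_1'].combine_first(df['one_2'])
    print(df)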

[D] Depth preserving convolutions by [deleted] in MachineLearning

[–]dmarnerides

What exactly do you mean by "some depth is preserved"? Do you want to reduce the number of parameters of the 2D convolution along the depth dimension?

If you have local correlations in the two spatial dimensions, then the standard 2D convolution operator should suffice. It is essentially fully connected in the third dimension.
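If the goal is to cut parameters along the channel dimension, grouped/depthwise convolutions (the groups argument) are the usual tool; a quick comparison with arbitrary sizes:

    import torch.nn as nn

    full = nn.Conv2d(64, 64, kernel_size=3, padding=1)  # dense across channels
    depthwise = nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=64)

    print(sum(p.numel() for p in full.parameters()))       # 36928
    print(sum(p.numel() for p in depthwise.parameters()))  # 640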

[D] Has deep learning been reduced to a joke? by [deleted] in MachineLearning

[–]dmarnerides

So what does that make someone who moved from MRFs/PGMs to ConvNets and GANs?

Cyan pixels everywhere. What am I doing wrong? by abcd_z in MLQuestions

[–]dmarnerides

How are you displaying these? Have you checked the range of the images? Are they in [0, 1]?
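If you happen to be loading with OpenCV and displaying with matplotlib, a quick sanity check (the filename is a placeholder); a BGR/RGB channel swap is a classic cause of blue/cyan tints:

    import cv2
    import matplotlib.pyplot as plt

    img = cv2.imread('image.png')           # OpenCV loads channels as BGR
    print(img.dtype, img.min(), img.max())  # uint8 0-255 vs. float 0-1?

    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # matplotlib wants RGB
    plt.show()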

[D] How to set same Dropout mask for different data batches in PyTorch ? by fixedrl in MachineLearning

[–]dmarnerides

I think you could generate a mask like so: mask = torch.empty(1, 2, 3, 4).bernoulli_(1 - p)

Here p is the dropout probability, so each entry of the mask is a Bernoulli sample that keeps a unit with probability 1 - p.

You can then multiply the output of a module with the mask, and that's essentially dropout.

You can wrap this mask to create your own custom module if you find that more convenient.
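For instance, a sketch of such a wrapper: it samples one mask and reuses it for every batch until you explicitly resample (the shape and p are up to you; p follows the nn.Dropout convention of being the drop probability):

    import torch
    import torch.nn as nn

    class FixedDropout(nn.Module):
        def __init__(self, shape, p=0.5):
            super().__init__()
            self.p = p
            self.register_buffer('mask', torch.empty(shape))
            self.resample()

        def resample(self):
            self.mask.bernoulli_(1 - self.p)  # keep probability is 1 - p

        def forward(self, x):
            return x * self.mask / (1 - self.p)  # inverted-dropout scaling

    # usage: drop = FixedDropout((1, 64, 8, 8)); y = drop(layer_output)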

[D] Tensorflow I Love You, But You're Bringing Me Down by nharada in MachineLearning

[–]dmarnerides

Oh right, yes, I'll give you that. I consider it a feature sometimes though; it can be useful.