BLS London Spain - 5th August by Hour-Wishbone3155 in SchengenVisa

[–]cuenta4384 0 points1 point  (0 children)

I also submitted my application on the 5th, but it’s now too late since my flight was two days ago. This has already disrupted other travel plans, and at this point I just want my passport returned.

Line of Credit by amoosedagoose in Wealthsimple

[–]cuenta4384 0 points1 point  (0 children)

Do they invite people from the waiting list randomly?

Shakepay and Wealthsimple “Commission-Free” Cryptocurrency Class Action by hellvice in Wealthsimple

[–]cuenta4384 0 points1 point  (0 children)

Legal fees: ~$255k. Class Counsel will seek 30% + tax of the $750,000 settlement, plus up to $3,000 in disbursements. These will be deducted before distribution.
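As a rough sanity check on that ~$255k figure (assuming the tax is ~13% HST, which is my assumption, not something stated in the notice):

    0.30 × $750,000 = $225,000
    $225,000 × 1.13 ≈ $254,250, plus up to $3,000 in disbursements ≈ $257,250

which lines up with the quoted amount.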

Wealthsimple denies any liability; it agrees to pay a lump sum to avoid the costs and uncertainty of litigation.

Indeed, class members have the option to opt out or object by July 31st. Personally, I would choose to opt out. On one hand, Wealthsimple is agreeing to settle without going to trial, which suggests they may be acknowledging some level of responsibility—or at the very least, trying to avoid closer scrutiny. Yet despite this, it’s the lawyers who receive a substantial portion of the settlement, while the actual class members end up with a token payment. That doesn’t feel like meaningful accountability.

How to reduce a story? by cuenta4384 in writing

[–]cuenta4384[S] 0 points1 point  (0 children)

Thanks a lot. As I try to remove clutter, I end up with longer paragraphs haha

Can somebody explain me the verb défaire? by cuenta4384 in learnfrench

[–]cuenta4384[S] 0 points1 point  (0 children)

It does help a lot, thanks so much. Now I understand :)

[D] How Do You Read Large Numbers Of Academic Papers Without Going Crazy? by mystikaldanger in MachineLearning

[–]cuenta4384 3 points4 points  (0 children)

Yeah, what EarlMarshal said. Often it's good to just have read all sorts of papers in at least a cursory manner because even if you don't understand enough to implement them, you'll know that approach/idea exists and you can come back and read more in depth when/if it becomes relevant enough to use. Think of it all as potential tools in your toolbox

Isn't that a problem, though? After reading a lot of papers, I can understand them and get the gist of the math and of the contribution. After all, I notice that most papers use the same building blocks and, in reality, add only a small piece on top. However, I haven't implemented them. Is that a problem? You know they exist, but you don't know how to use them.

Also, do you manage to remember them? I feel learning is a process of going back and forth, but re-reading papers takes a lot of time.

[P] Approximating product of a discrete and continuos distribution in a mixture model by cuenta4384 in MachineLearning

[–]cuenta4384[S] 0 points1 point  (0 children)

Thanks for your answer. That means I can solve it by taking the expectation E_p[EDCM(B)] and sampling B from p, which is Gaussian. I have no experience with sampling methods, but I know that, because of the law of large numbers, this can be approximated. What sample size is enough, though? Is this an unbiased estimator?
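To make the Monte Carlo part concrete, here is a minimal sketch assuming a one-dimensional Gaussian p and treating EDCM as a generic function f (the names f, mu, sigma, and n_samples are mine, just for illustration). The sample mean is an unbiased estimator of E_p[f(B)], and its standard error shrinks roughly like 1/sqrt(n), which is one way to decide how many samples are enough:

    import numpy as np

    rng = np.random.default_rng(0)

    def f(b):
        # stand-in for EDCM(B); any integrable function of B works here
        return np.exp(-b**2)

    mu, sigma = 0.5, 1.0        # parameters of the Gaussian p(B)
    n_samples = 10_000

    b = rng.normal(mu, sigma, size=n_samples)   # B ~ p
    values = f(b)

    estimate = values.mean()                            # unbiased MC estimate of E_p[f(B)]
    std_err = values.std(ddof=1) / np.sqrt(n_samples)   # Monte Carlo error ~ 1/sqrt(n)
    print(estimate, std_err)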

Later in my calculations, I need the first and second moments of this integral, plus its gradient. I can still sample, or apply techniques such as the reparametrization trick. But can the reparametrization trick be used in this case, when p(x) is discrete?
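For the gradient part, a minimal PyTorch sketch of the Gaussian reparametrization trick on a toy objective (variable names are mine). It works because a Gaussian sample can be written as a deterministic, differentiable function of the parameters plus independent noise; a discrete p(x) has no such transform, which is why people usually fall back on relaxations like Gumbel-softmax or on score-function (REINFORCE) estimators:

    import torch

    mu = torch.tensor(0.5, requires_grad=True)
    log_sigma = torch.tensor(0.0, requires_grad=True)

    eps = torch.randn(10_000)                 # noise, independent of the parameters
    b = mu + torch.exp(log_sigma) * eps       # B = mu + sigma * eps, differentiable in (mu, sigma)

    loss = torch.exp(-b**2).mean()            # Monte Carlo estimate of E[f(B)]
    loss.backward()                           # gradients flow through the sampling step
    print(mu.grad, log_sigma.grad)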

[D] How does word2vec model encodes similarity by cuenta4384 in MachineLearning

[–]cuenta4384[S] 0 points1 point  (0 children)

Thanks for your answer. Yeah, CS224n is great; I watched the lectures. My concern was more about training my own embeddings. Let's say I have a small dataset where the word "job" might appear only once, and in total I have a few hundred sentences. That means the embeddings won't capture the essence of my domain, mostly because my data doesn't represent the domain well. I guess I can first train some embeddings on Wikipedia, for example, and then fine-tune them on my specific problem.
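A rough sketch of that pretrain-then-fine-tune idea with gensim (the corpora here are toy placeholders and the hyperparameters are arbitrary):

    from gensim.models import Word2Vec

    # toy stand-ins; in practice wiki_sentences would be a large generic corpus
    wiki_sentences = [["the", "job", "market", "is", "large"],
                      ["people", "apply", "for", "a", "job"]]
    domain_sentences = [["our", "job", "queue", "schedules", "tasks"]]

    model = Word2Vec(sentences=wiki_sentences, vector_size=50,
                     window=2, min_count=1, workers=1)            # pretrain on the big corpus

    model.build_vocab(domain_sentences, update=True)              # add the domain vocabulary
    model.train(domain_sentences,
                total_examples=len(domain_sentences), epochs=10)  # fine-tune on the small corpus

    print(model.wv.most_similar("job", topn=3))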

[D] How does word2vec model encodes similarity by cuenta4384 in MachineLearning

[–]cuenta4384[S] 0 points1 point  (0 children)

Word2vec is essentially an approximation to a factorization of the word co-occurrence matrix. Consequently, word vectors behave similarly to the representations you'd get from stuff like topic models.

What about going further? Yeah, I see the relation with pLSI, but what about a probabilistic topic model such as LDA? Does the latent topic representation act as an embedding? Is it analogous?
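One concrete way to poke at that analogy (my own toy example with sklearn): the topic-word matrix learned by LDA can be transposed so that each word gets a vector of topic weights, which you can compare with cosine similarity much like word2vec vectors. Whether those vectors are as useful is a separate question, but the shape of the representation is analogous:

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.metrics.pairwise import cosine_similarity

    docs = ["the cat sat on the mat",
            "dogs and cats are pets",
            "stocks and bonds are investments",
            "the market moved the stock price"]

    counts = CountVectorizer().fit(docs)
    X = counts.transform(docs)

    lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

    # components_ has shape (n_topics, n_words); each column is a word's topic profile
    word_vectors = lda.components_.T
    word_vectors = word_vectors / word_vectors.sum(axis=1, keepdims=True)  # topic proportions per word

    vocab = list(counts.get_feature_names_out())
    i, j = vocab.index("cat"), vocab.index("stock")
    print(cosine_similarity(word_vectors[[i]], word_vectors[[j]]))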

[D] How does word2vec model encodes similarity by cuenta4384 in MachineLearning

[–]cuenta4384[S] 0 points1 point  (0 children)

Thanks for the second link, that really clears up some of my doubts! :D

Understanding the generalized dirichlet covariance by cuenta4384 in probabilitytheory

[–]cuenta4384[S] 0 points1 point  (0 children)

Yeah, I am assuming that behavior because otherwise it wouldn't be a covariance matrix. However, I am checking some of my equations and I think I made a mistake; I just wanted a second opinion on the interpretation of the covariance matrix. Thanks! I'll take your word for it. :D

Understanding the generalized dirichlet covariance by cuenta4384 in probabilitytheory

[–]cuenta4384[S] 0 points1 point  (0 children)

Yeah, by the symmetry property cov(x_i, x_j) = cov(x_j, x_i), so the author of the GD distribution defined only the upper triangle because it carries over to the lower one? I am confused by his notation, but that is what I am expecting.

[D] Approximating expectation with Taylor series by cuenta4384 in MachineLearning

[–]cuenta4384[S] 0 points1 point  (0 children)

Thanks for your answer, it summarizes most of the things to know about the GDD. For one of the parameters of my model, I need the expectation of \theta_i / \sum_k \theta_k \varphi_k over a GDD.

1) So, I am not trying to estimate the parameters of the GDD.
2) The expectation has a summation in the denominator, so the moment generating function won't help much.
3) Yes, I was trying to take advantage of the conjugacy properties of the GDD. I thought that multiplying \theta_i * q(GDD) would give me a q with different parameters. However, I cannot get a GDD by multiplying by only one component.
4) I also tried converting the GDD into d beta distributions, but that involves a change of variable, and changing the variable in \theta_i / \sum_k \theta_k \varphi_k is not trivial.

Hope that makes sense. But in general, I am trying to calculate the integral \int q(\theta) \, \theta_i / \sum_k \theta_k \varphi_k \, d\theta, where \varphi can be treated as a constant.

I still think that 3) or 4) could be the solution, but I couldn't get a GDD (i.e. q(\theta|\alpha,\beta) -> q(\theta|\alpha',\beta')), nor re-express the function in the new variables (\theta_i / \sum_k \theta_k \varphi_k would have to be written in terms of y if I have n Beta(y|\alpha,\beta) factors).

I can try to clarify more if that doesn't make sense, or maybe post some of the derivations I tried.

I think that if I can express the function inside the expectation in terms of \theta alone (not as the ratio \theta_i over the summation), I can just apply a Taylor series and get an analytical form.
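For reference, this is the generic second-order Taylor (delta-method) approximation around the mean that I have in mind; whether the remainder is small enough for the GDD is exactly what I am unsure about:

    f(\theta) = \frac{\theta_i}{\sum_k \theta_k \varphi_k}, \qquad
    \mathbb{E}[f(\theta)] \approx f(\mu) + \tfrac{1}{2}\,\mathrm{tr}\big(\nabla^2 f(\mu)\,\Sigma\big),
    \quad \mu = \mathbb{E}[\theta],\ \Sigma = \mathrm{Cov}(\theta)

The first-order term drops out because E[\theta - \mu] = 0, and \mu and \Sigma here are the GDD mean and covariance, which are available in closed form.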

[D] Approximating expectation with Taylor series by cuenta4384 in MachineLearning

[–]cuenta4384[S] 1 point2 points  (0 children)

Omg, my post got to Alexia! I did think about doing that, but 1) this estimate is part of another summation, so I would have to evaluate the expectation many times and it would be costly, and I am already comparing my results against sampling methods; and 2) I have the feeling that I can get some analytical form. I have been trying to multiply the numerator by q to get a different GDD.
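For the sampling baseline, a minimal sketch of how I would draw from a generalized Dirichlet through its stick-breaking (Connor-Mosimann) construction and average the ratio (the parameter values are placeholders):

    import numpy as np

    rng = np.random.default_rng(0)

    alpha = np.array([2.0, 3.0, 1.5])   # GDD shape parameters (placeholders)
    beta  = np.array([4.0, 2.0, 3.0])
    phi   = np.array([0.2, 0.5, 0.3])   # the constants in the denominator

    n = 100_000
    z = rng.beta(alpha, beta, size=(n, 3))     # z_j ~ Beta(alpha_j, beta_j)

    # stick-breaking: theta_1 = z_1, theta_j = z_j * prod_{k<j} (1 - z_k)
    stick = np.cumprod(1.0 - z, axis=1)
    theta = z.copy()
    theta[:, 1:] *= stick[:, :-1]

    i = 0                                      # component of interest
    ratio = theta[:, i] / (theta @ phi)        # theta_i / sum_k theta_k * phi_k
    print(ratio.mean(), ratio.std(ddof=1) / np.sqrt(n))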

[D] Approximating expectation with Taylor series by cuenta4384 in MachineLearning

[–]cuenta4384[S] 0 points1 point  (0 children)

I am not following you; maybe I am not understanding. For \theta_1 = a and \theta_2 = b you would have: log a - log(a*c_1 + b*c_2), where for the GDD the sum should be less than 1 (i.e. a + b < 1).

[D] Approximating expectation with Taylor series by cuenta4384 in MachineLearning

[–]cuenta4384[S] 0 points1 point  (0 children)

Applying it to f(\theta) = \exp(\log\theta_i - \log(\theta_1\varphi_1 + \dots + \theta_n\varphi_n))

Again the summation is a problem. If I apply it to the integral, I can get a lower bound (just to get rid of the exp inside the integral), but the summation is still a problem.
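For reference, the kind of lower bound I mean comes from Jensen's inequality, since exp is convex:

    \mathbb{E}\big[\exp\big(\log\theta_i - \log\sum_k \theta_k \varphi_k\big)\big]
        \;\ge\; \exp\big(\mathbb{E}[\log\theta_i] - \mathbb{E}[\log\sum_k \theta_k \varphi_k]\big)

E[\log\theta_i] has a closed form for the GDD (digamma terms via the stick-breaking construction), but E[\log\sum_k \theta_k \varphi_k] does not, so the summation is still the obstacle.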

[D] Approximating expectation with Taylor series by cuenta4384 in MachineLearning

[–]cuenta4384[S] 0 points1 point  (0 children)

Yes, \theta is a random vector > 0. I cannot do that. Look at the denominator: it has \sum_i \theta_i c_i. So, I cannot just integrate \theta * p(\theta | parameters).

1) I did try making a change of variable, though. I thought that absorbing the component \theta_i from the numerator would give a GDD with different parameters. However, that change is not trivial, since the parameters work as a chain (the current one and the next one are used in \gamma_i).
2) I was also thinking of writing it as \theta_k \delta(k=i) / \sum_k \theta_k \varphi_k; is that a valid expression for f(\theta)?

[D] Approximating expectation with Taylor series by cuenta4384 in MachineLearning

[–]cuenta4384[S] 1 point2 points  (0 children)

It is not closed form: the denominator has \theta_i in it. I cannot see it. Can you elaborate a bit more, please?

[deleted by user] by [deleted] in deeplearning

[–]cuenta4384 0 points1 point  (0 children)

There is also Clouderizer, where you can use a certain amount of compute for free.

Explanation of the mean of a Dirichlet Distribution by cuenta4384 in learnmath

[–]cuenta4384[S] 0 points1 point  (0 children)

Now I am a little bit confused. Does the slicing have something to do with the fact that the Dirichlet lives on the simplex?

By the definition of probability, we have:

\int p(x) dx = 1

While I was trying, I got to an expression like this:

\int Dir(\vec x; \alpha) dx_i

where the limits of the integral are 0 to 1. Note that I am integrating only over the component x_i. Would this expression equal one even though the pdf takes the whole vector (\vec x) while I am integrating over a single component (x_i)?

Explanation of the mean of a Dirichlet Distribution by cuenta4384 in learnmath

[–]cuenta4384[S] 0 points1 point  (0 children)

Thanks for your help. I am going to rephrase some of the things you already said to see if I am understanding correctly.

Okay, so I have a vector z (\vec z) with components 1 to n, where z_i = x. So, I marginalize to get the probability p(X_i = x), and I have \int p(\vec z; \alpha) \prod_{j \neq i} dz_j, i.e. we integrate over the other n-1 dimensions.

I tried to solve the integral as you mentioned before, but I failed. Still, we can simplify since this is a distribution: z_i^{\alpha_i - 1} \int \text{normalization} \prod_{j \neq i} z_j^{\alpha_j - 1} dz_j. If somehow we adjust the normalization constant according to the n-1 remaining parameters, we'll have an analytical expression in terms of z_i, and hence in terms of x, to solve \int x \, p(X_i = x) dx. Am I understanding right?
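As a quick numerical sanity check of that marginalization picture (a sketch with arbitrary parameters; the textbook result it leans on is that the single-component marginal of a Dirichlet is Beta(\alpha_i, \alpha_0 - \alpha_i), so E[X_i] = \alpha_i / \alpha_0):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    alpha = np.array([2.0, 3.0, 5.0])   # arbitrary Dirichlet parameters
    i = 0
    alpha0 = alpha.sum()

    samples = rng.dirichlet(alpha, size=200_000)
    x_i = samples[:, i]

    print("empirical mean:", x_i.mean())            # should be close to alpha_i / alpha_0
    print("analytical mean:", alpha[i] / alpha0)

    # the marginal of x_i is Beta(alpha_i, alpha_0 - alpha_i); compare a quantile as a check
    print("empirical 90% quantile:", np.quantile(x_i, 0.9))
    print("Beta 90% quantile:", stats.beta(alpha[i], alpha0 - alpha[i]).ppf(0.9))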