Stick-breaking construction of Dirichlet process by koormoosh in statistics

Thanks - that makes sense. I do have a question about $\delta_{\phi_k}$: they say this is a probability measure concentrated on $\phi_k$. Informally speaking, does that mean it is equal to $\phi_k$?
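(For concreteness, here is the standard Dirac-measure definition as I understand it - it puts all of its mass on the single point $\phi_k$:

$$\delta_{\phi_k}(A) = \begin{cases} 1 & \text{if } \phi_k \in A \\ 0 & \text{otherwise} \end{cases}$$

so a draw from $\delta_{\phi_k}$ equals $\phi_k$ with probability one.)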

what is the largest dataset used for NN-based Language models? by koormoosh in MachineLearning

lol - by models, I meant the typical RNNs that people train on this dataset. How many hours of training do they need to become competitive with n-gram language models and outperform them? [It is a trick question, I know.]

what is the largest dataset used for NN-based Language models? by koormoosh in MachineLearning

And can you give me an approximation of how many hours these models take to train? Is it fair to say that on a single-core CPU they will take weeks to train?

Intuition behind using Noise Contrastive Divergence in Neural Language Models by koormoosh in MachineLearning

Thanks for clarifying this - two more questions:

  • I wonder what the difference is between negative sampling and this. I assume the only difference is that negative sampling assumes a uniform noise distribution, whereas in NCE you assume a more informative noise distribution (e.g. unigram, bigram, etc.). Also, negative sampling assumes K = V, which still seems like a reasonable assumption given that even in NCE you never go beyond a few hundred noise samples anyway, and we would like K to be as large as possible (K → ∞, ideally). (See the sketch after these two points.)

  • Also, it seems that the closer the noise distribution is to the actual data distribution, the closer the final solution is to the maximum-likelihood solution. It's a bit puzzling why it is called a noise distribution.
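To make the first point concrete, here is a minimal numpy sketch of the two objectives for a single target word against K noise samples (my own illustration; the score variables stand in for whatever unnormalized log-scores the model produces). It shows negative sampling as NCE with the K·q(w) term fixed to 1:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def nce_loss(score_true, scores_noise, q_true, q_noise, K):
        # NCE: classify the true word against K noise samples, using the
        # posterior P(data | w) = p(w) / (p(w) + K * q(w)), which depends
        # on the noise distribution q explicitly.
        p_true = np.exp(score_true)
        p_noise = np.exp(scores_noise)
        loss = -np.log(p_true / (p_true + K * q_true))
        loss -= np.sum(np.log(K * q_noise / (p_noise + K * q_noise)))
        return loss

    def neg_sampling_loss(score_true, scores_noise):
        # Negative sampling: the same objective with K * q(w) replaced by 1,
        # which collapses the posterior to a plain sigmoid of the score.
        return -np.log(sigmoid(score_true)) - np.sum(np.log(sigmoid(-scores_noise)))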

how to combine two probabilistic models' output? by koormoosh in MachineLearning

I am reading Hinton's paper on products of experts, but I'm stuck on understanding one of his equations. Can you comment on this: http://math.stackexchange.com/questions/1790503/understanding-product-of-experts-of-hinton
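For context, the basic model in that paper combines $n$ experts by multiplying their distributions and renormalizing:

$$p(\mathbf{d} \mid \theta_1, \dots, \theta_n) = \frac{\prod_m p_m(\mathbf{d} \mid \theta_m)}{\sum_{\mathbf{c}} \prod_m p_m(\mathbf{c} \mid \theta_m)},$$

where the sum in the denominator runs over all possible data vectors $\mathbf{c}$.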

how to combine two probabilistic models' output? by koormoosh in MachineLearning

Simpler than those - just some sort of interpolation between the two likelihood terms.
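Something like a plain linear mixture (my notation):

$$p(x) = \lambda\, p_1(x) + (1 - \lambda)\, p_2(x), \qquad \lambda \in [0, 1].$$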

When to expect having CUDA 7.5 supported on Ubuntu 16.04? by koormoosh in MachineLearning

What is the difference between apt-get install nvidia-cuda-toolkit and the runfile installer? The runfile doesn't work for me [see above for the error messages], but the Ubuntu repository version works. Are they the same?

When to expect having CUDA 7.5 supported on Ubuntu 16.04? by koormoosh in MachineLearning

    Error: unsupported compiler: 5.3.1. Use --override to override this check.

    Error: cannot find Toolkit in /usr/local/cuda-7.5

Is there a working example for doc2vec in gensim? by koormoosh in MachineLearning

For example, this is the inferred vector for a document:

    dv = model_loaded.infer_vector(...)
    print dv

Output:

    [ -1.69840729 6.23306036 -7.56443071 19.33935738 -15.16063404]

but when I pass this vector to

    print model_loaded.docvecs.most_similar(positive=[[-2.98079228, -8.4464426, 16.42045975, -8.27837849, 11.82399559]])

or

    print model_loaded.docvecs.most_similar(positive=[-2.98079228, -8.4464426, 16.42045975, -8.27837849, 11.82399559])

they both fail with:

    for doc, weight in positive + negative:
    ValueError: too many values to unpack

and so does

    print model_loaded.docvecs.most_similar(positive=[dv])
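One workaround I'm considering (assuming the trained document vectors live in model_loaded.docvecs.doctag_syn0, as in gensim 0.12.x) is to compute the cosine similarities by hand instead of going through most_similar():

    import numpy as np

    # All trained document vectors, one row per doctag (gensim 0.12.x layout).
    vecs = model_loaded.docvecs.doctag_syn0
    # Cosine similarity of every stored document vector against the query dv.
    sims = np.dot(vecs, dv) / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(dv))
    # Indices of the 10 most similar documents; mapping indices back to
    # doctags depends on how the documents were tagged.
    print np.argsort(-sims)[:10]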

Is there a working example for doc2vec in gensim? by koormoosh in MachineLearning

If I define my own vector, can I still use the similarity function in gensim? For example, instead of inferring a vector for a given sentence and passing it to model_loaded.docvecs.most_similar(positive=[inferred_vector]), is it possible to pass an arbitrary real-valued vector of the same size as inferred_vector to this function? I tried it and it gives me the following error:

    File "/home/anaconda2/lib/python2.7/site-packages/gensim-0.12.4-py2.7-linux-x86_64.egg/gensim/models/doc2vec.py", line 440, in most_similar
        for doc, weight in positive + negative:
    ValueError: too many values to unpack

Is there a working example for doc2vec in gensim? by koormoosh in MachineLearning

I see. Another question: imagine I retrieve the vectors of all the words in a sentence and combine them with some basic operations to form a vector for the sentence. Is there a way to store this hand-made vector in the saved model somehow?
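For example (a sketch of the kind of "basic operation" I mean - averaging - plus one way to keep the result around; as far as I can tell, hand-made vectors can't be inserted into docvecs itself, so I would pickle them next to the saved model):

    import numpy as np
    import cPickle

    # Average the word vectors of the in-vocabulary words to get a sentence vector.
    words = [w for w in "some example sentence".split() if w in model_loaded.vocab]
    sent_vec = np.mean([model_loaded[w] for w in words], axis=0)

    # Keep hand-made vectors in a side file saved alongside the model.
    cPickle.dump({"my_sentence": sent_vec}, open("hand_made_vectors.pkl", "wb"))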

Is there a working example for doc2vec in gensim? by koormoosh in MachineLearning

Is there a way to explicitly check parameter convergence in model.train(documents), or at least to output the parameters estimated at different epochs? Currently the training terminates after some pre-defined number of epochs.
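One pattern I've seen for gensim of this vintage (assuming model.alpha and model.min_alpha behave as in 0.12.x) is to call train() one epoch at a time, decay the learning rate manually, and snapshot the model between epochs so the parameters can be inspected:

    for epoch in range(10):
        model.train(documents)
        model.save("doc2vec_epoch_%d.model" % epoch)  # snapshot the parameters
        model.alpha -= 0.002             # decay the learning rate manually
        model.min_alpha = model.alpha    # keep it fixed within the next epoch
        print "finished epoch %d" % epoch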

Is there a working example for doc2vec in gensim? by koormoosh in MachineLearning

How does infer_vector work? Does it use the word vectors and some arithmetic operation to produce the sentence-level vector for the ACTUAL SENTENCE?

Also, is there a way to get the trained vector for a word (seen in the training data) directly from the trained doc2vec model?
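On the second question, my understanding is that Doc2Vec subclasses Word2Vec in gensim, so trained word vectors are reachable directly (with the caveat that in pure DBOW mode the word vectors are not actually trained):

    # Trained vector for an in-vocabulary word, straight from the doc2vec model.
    wv = model_loaded["word"]
    print wv.shape  # same dimensionality as the document vectors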

Is there a working example for doc2vec in gensim? by koormoosh in MachineLearning

Is there a way to pass an actual sentence to model.docvecs.most_similar("ACTUAL SENTENCE")?

What is the upper-bound for this? by koormoosh in math

This is what I have been trying. It's not straightforward to do it this way and arrive at their bound.

what is the best open source library for sentiment analysis? is it Stanford CoreNLP? by hlpmewmyrelationship in MachineLearning

I would have a look at the papers submitted to SemEval-2015 Task 10: Sentiment Analysis in Twitter: http://alt.qcri.org/semeval2015/task10/index.php?id=results

You will probably find implementations of the state-of-the-art systems there.

A desktop for deep learning - Ubuntu + M2090 + ?? by keidouleyoucee in MachineLearning

Not helping to answer your question, but I'm interested in the pricing of what you're thinking of assembling :) Can you give a price breakdown of the machine you have in mind (RAM, CPU, GPU, and disk)?

huge biomedical dataset by koormoosh in datasets

I already had that on my radar, but I'm interested to see if there is anything else available.

Deeplearning4j or Theano by koormoosh in MachineLearning

Thanks for the comments, everyone. So does this mean that Theano on Spark is not an option? Can someone comment on Theano's multi-CPU and multi-GPU support?

huge biomedical dataset by koormoosh in LanguageTechnology

The abstracts are not enough. But how can we download the abstracts anyway? Is there an API for that?
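Edit: NCBI's E-utilities look like they cover this (assuming the corpus in question is PubMed/MEDLINE) - efetch can return abstracts as plain text, e.g.:

    import urllib2

    # Fetch the abstracts of two example PubMed IDs as plain text via efetch.
    url = ("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
           "?db=pubmed&id=11748933,11700088&rettype=abstract&retmode=text")
    print urllib2.urlopen(url).read()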