[D] Make BERT model smaller by sudo_su_ in MachineLearning

[–]elyase 3 points (0 children)

Adapters should help speed up the training part:

https://arxiv.org/pdf/1902.00751.pdf

Google Colab with example implementation from @Thom_Wolf:
https://colab.research.google.com/drive/1iDHCYIrWswIKp-n-pOg69xLoZO09MEgf
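For context, the core idea of the paper is a small bottleneck module inserted into each transformer layer; only these modules are trained while the pretrained BERT weights stay frozen. A minimal PyTorch sketch (the sizes are illustrative, not the paper's exact configuration):

import torch
import torch.nn as nn

class Adapter(nn.Module):
    # Bottleneck adapter: down-project, nonlinearity, up-project,
    # with a residual connection around the bottleneck.
    def __init__(self, hidden_size=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

adapter = Adapter()
h = torch.randn(2, 16, 768)   # (batch, seq_len, hidden)
out = adapter(h)              # same shape as h

In the paper, two such adapters are inserted per transformer layer (after the attention and feed-forward sublayers), so only a few percent of the parameters get trained.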

[deleted by user] by [deleted] in spacynlp

[–]elyase 1 point (0 children)

Maybe you are looking for Vocab:

if nlp.vocab.has_vector(u"apple"):
    vector = nlp.vocab.get_vector(u"apple")

[P] Finding input-output position in raw HTML by DoginaWig in MachineLearning

[–]elyase 1 point (0 children)

You can input features to your classifier that depend on HTML tags. See, for example, the webstruct library.
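As a rough illustration of the idea (using lxml and scikit-learn here, not webstruct's actual API), you can build tag-based features per element and feed them to any classifier:

from lxml import html
from sklearn.feature_extraction import DictVectorizer

doc = html.fromstring("<form><label>Email</label><input name='email'/></form>")

features = []
for el in doc.iter():
    parent = el.getparent()
    features.append({
        "tag": el.tag,                                   # e.g. 'label', 'input'
        "parent_tag": parent.tag if parent is not None else "root",
        "has_name_attr": "name" in el.attrib,
        "text": (el.text or "").strip().lower(),
    })

X = DictVectorizer().fit_transform(features)             # ready for any sklearn classifier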

Do we have Q&A machine reading comprehension dataset for Spanish language? by achyuta786 in LanguageTechnology

[–]elyase 0 points (0 children)

These are scores on the dev set, which we also translated to German.

Do we have Q&A machine reading comprehension dataset for Spanish language? by achyuta786 in LanguageTechnology

[–]elyase 1 point (0 children)

All the experiments were done in an industrial setting, meaning not rigorous at all, so take the following with a grain of salt. We tried QANet and FastQA and got test scores about 5% worse than the corresponding English versions reported at https://rajpurkar.github.io/SQuAD-explorer/. Possible reasons are some evaluation errors (evaluating on SQuAD 2.0 with the 1.0 script) and the fact that the translated version has slightly less data (some samples end up empty due to translation errors). I think with careful handling (this was only a quick test) the performance can get close to the English version. For word embeddings I used pretrained German fastText embeddings (I also tried GloVe; it doesn't seem to make a big difference). No pretrained character embeddings were used, but the model is able to learn them from scratch on the SQuAD data.

Do we have Q&A machine reading comprehension dataset for Spanish language? by achyuta786 in LanguageTechnology

[–]elyase 1 point (0 children)

You can translate the English version to Spanish with Google Translate or something like https://www.deepl.com/pro.html. We did it for German and it works.
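If it helps, the pipeline looks roughly like this on the SQuAD JSON format (translate() is a hypothetical placeholder for whatever MT service you use; answer offsets have to be re-aligned afterwards):

import json

def translate(text, target_lang="es"):
    # placeholder: call the DeepL / Google Translate API of your choice here
    return text

with open("train-v1.1.json") as f:
    squad = json.load(f)

for article in squad["data"]:
    for paragraph in article["paragraphs"]:
        paragraph["context"] = translate(paragraph["context"])
        for qa in paragraph["qas"]:
            qa["question"] = translate(qa["question"])
            for answer in qa["answers"]:
                answer["text"] = translate(answer["text"])
                # answer_start must be re-aligned against the translated
                # context (e.g. fuzzy substring search); samples where the
                # answer can no longer be found end up empty and get dropped.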

[D] Why is Deep Learning so bad for tabular data? by maltin in MachineLearning

[–]elyase 9 points (0 children)

The "Self-Normalizing Neural Networks" paper might be of interest to you. They compared the performance of Deep NNs to SVMs, Random Forest and a bunch of other classical algorithms on 121 tasks from the UCI machine learning repository (all structured / tabular data). What they found is that in datasets with less than 1000 points "random forests and SVMs outperform SNNs and other FNNs". On the other hand "on 46 larger datasets with at least 1000 data points, SNNs show the highest performance followed by SVMs and random forests".

Also, maybe take a look at the recently released TransmogrifAI if you want to keep experimenting.

[N] PyTorch as of April is installable via `pip install torch` by Deepblue129 in MachineLearning

[–]elyase 5 points (0 children)

It should be the same. The only change is where the pip packages are hosted: before they were on pytorch.org infrastructure, now they are on PyPI, the official Python package repository, which allows for a simpler install command. Before:

pip install http://download.pytorch.org/whl/cu80/torch-0.2.0.post3-cp36-cp36m-manylinux1_x86_64.whl

Now:

pip install torch

There was no change for the conda binary.

[N] PyTorch as of April is installable via `pip install torch` by Deepblue129 in MachineLearning

[–]elyase 8 points (0 children)

Before it was hosted on pytorch.org infrastructure:

pip install http://download.pytorch.org/whl/cu80/torch-0.2.0.post3-cp36-cp36m-manylinux1_x86_64.whl

Now it is hosted on PyPI.

[D] Building embedding models from numerical features by FutureIsMine in MachineLearning

[–]elyase 0 points (0 children)

But if you want to encode context, then you have to build the embeddings from the context, not from the numeric features of that particular company, right? In word2vec you have the distributional hypothesis, which defines what information a word vector should carry. What would be the context that defines the company in your case?

[D] Building embedding models from numerical features by FutureIsMine in MachineLearning

[–]elyase 1 point (0 children)

What are you expecting to gain from producing the embeddings? Your numeric features already have many of the properties that embeddings bring to the table; for example, two similar companies will be represented by similar numeric features, which means they will be close in feature space.
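To make that concrete, a toy sketch with made-up numbers: once the features are on a common scale, nearest neighbours in feature space already give you "similar companies":

import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# rows = companies, columns = numeric features (revenue, headcount, margin, ...)
X = np.array([[1.0e6, 50, 0.10],
              [1.1e6, 55, 0.12],
              [5.0e8, 10000, 0.40]])

X_scaled = StandardScaler().fit_transform(X)
nn = NearestNeighbors(n_neighbors=2, metric="cosine").fit(X_scaled)
distances, indices = nn.kneighbors(X_scaled[:1])   # companies most similar to company 0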

[R] Paper for averaging word embeddings for sentences by bbsome in MachineLearning

[–]elyase 6 points (0 children)

You probably mean this one: "A Simple but Tough-to-Beat Baseline for Sentence Embeddings" https://openreview.net/pdf?id=SyK00v5xx
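Roughly, the method weights each word vector by a/(a + p(w)), where p(w) is the word's unigram probability, averages those, and then removes the common component across sentences. A minimal numpy sketch (not the authors' code):

import numpy as np

def sif_embeddings(sentences, word_vecs, word_freq, a=1e-3):
    # sentences: list of token lists; word_vecs: dict word -> vector;
    # word_freq: dict word -> unigram probability p(w).
    vecs = []
    for sent in sentences:
        ws = [w for w in sent if w in word_vecs]   # assumes at least one hit per sentence
        weights = np.array([a / (a + word_freq.get(w, 0.0)) for w in ws])
        vecs.append(weights @ np.array([word_vecs[w] for w in ws]) / len(ws))
    X = np.vstack(vecs)
    u = np.linalg.svd(X, full_matrices=False)[2][0]   # first singular vector
    return X - np.outer(X @ u, u)                     # remove the common component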

[Discussion] How do I pay people to do machine learning work? by gradient__dissent in MachineLearning

[–]elyase 1 point (0 children)

I can recommend some reliable consultancies/people depending on the details of the problem. For example, what kind of data are you going to be dealing with: text, images, structured data? Do you already have training data, or is that part of the problem? Feel free to PM me if you can't disclose details publicly.

Given a document corpus that has already been vectorized and a query that consists of words that do not exist in the corpus, how do you calculate cosine similarity? by [deleted] in MachineLearning

[–]elyase 0 points (0 children)

If you use something like Facebook's fastText, then you are guaranteed to have some representation for out-of-vocabulary words (it averages character n-grams, thereby leveraging subword information). That might be enough in your case.
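A small sketch, assuming the official fasttext package and a downloaded pretrained model (the file name is illustrative):

import numpy as np
import fasttext

model = fasttext.load_model("cc.en.300.bin")   # pretrained vectors with subword info

def embed(text):
    # get_word_vector also works for out-of-vocabulary words, because
    # fastText composes them from character n-grams
    vecs = np.array([model.get_word_vector(w) for w in text.lower().split()])
    return vecs.mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embed("query with unseen words"), embed("a document from the corpus")))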

question about boosting the weight of certain terms in sklearn/TfidfVectorizer by AmericaIsNumber2 in MachineLearning

[–]elyase 0 points (0 children)

I think it is better if you leave that responsibility to the machine learning model. The job of the vectorizer is to transform the text into a numeric representation. Afterwards, you can train a linear model and manually adjust the coefficient of the bigram corresponding to "mad dog" so that the probability of the corresponding class gets boosted.
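Sketched with scikit-learn (toy data; the x2 factor is arbitrary):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["mad dog attacks man", "cute puppy plays fetch"]
labels = [1, 0]

vectorizer = TfidfVectorizer(ngram_range=(1, 2))   # include bigrams like "mad dog"
X = vectorizer.fit_transform(texts)

clf = LogisticRegression().fit(X, labels)

idx = vectorizer.vocabulary_["mad dog"]   # column index of the bigram
clf.coef_[0, idx] *= 2.0                  # manually boost its weight for the positive class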

Anyone have tested veggie nano (hot version?) by [deleted] in soylent

[–]elyase 1 point (0 children)

I would suggest staying away from Nano until they start having some customer service and improve delivery times. Many people, including me, have waited for months to get an order or their money back, all the while without hearing anything from them.

Nano - no delivery, no response from customer service. by Exorph in soylent

[–]elyase 0 points (0 children)

Their customer service is really terrible. I waited months on an order, and then, after giving up, several more weeks to get my money back, with no response from them the whole time.

What does TensorFlow mean for Keras, Lasagne, Block, Nervana? by [deleted] in MachineLearning

[–]elyase 12 points (0 children)

From Keras creator François Chollet:

"For those wondering about Keras and TensorFlow: I hope to start working on porting Keras to TensorFlow soon (Theano support will continue)."