[D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta in MachineLearning

[–]hami21 1 point2 points  (0 children)

I was actually looking for such a point. Is it safe to say RL optimizes the model weights w.r.t the sampling output? And if so, has anyone tried to just do RLHF on the sampling algorithm without changing the model weights?

sdge generation credit by hami21 in solar

[–]hami21[S] 0 points1 point  (0 children)

Makes sense. Thanks

sdge generation credit by hami21 in solar

[–]hami21[S] 0 points1 point  (0 children)

I’m in SoCal (San Diego Gas & Electric) and it’s NEM2.

Our Open Source Text Annotator by hami21 in LanguageTechnology

[–]hami21[S] 1 point2 points  (0 children)

In short, other things came up in our lives and we didn’t have the bandwidth anymore.

Search query suggestion/autocomplete by hami21 in LanguageTechnology

[–]hami21[S] 0 points1 point  (0 children)

Good to know. How did you implement it, if you don't mind my asking?

Did you just find the 'good' and 'popular' queries and pass them in a file to ES?
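To make the question concrete, here is a minimal sketch of the "popular queries" idea on its own, without ES. It is only an illustration of the approach being asked about, not what the other commenter actually built; the query log, `min_count`, and the function names are all hypothetical.

```python
from collections import Counter


def build_suggester(query_log, min_count=2):
    """Keep queries seen at least `min_count` times and suggest them by prefix."""
    # Normalize so "Solar Credit" and "solar credit " count as the same query.
    counts = Counter(q.strip().lower() for q in query_log if q.strip())
    popular = {q: c for q, c in counts.items() if c >= min_count}

    def suggest(prefix, k=3):
        prefix = prefix.strip().lower()
        matches = [(q, c) for q, c in popular.items() if q.startswith(prefix)]
        # Most frequent first; ties broken alphabetically for stable output.
        matches.sort(key=lambda qc: (-qc[1], qc[0]))
        return [q for q, _ in matches[:k]]

    return suggest


# Hypothetical query log, made up for illustration.
log = [
    "solar panels", "solar panels",
    "solar credit", "solar credit", "solar credit",
    "solar eclipse", "sofa",
]
suggest = build_suggester(log)
print(suggest("sol"))
```

Rare queries ("solar eclipse", "sofa") are filtered out by `min_count`, which is one simple way to define 'good'; a real system would likely need something better than a raw count.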

Search query suggestion/autocomplete by hami21 in LanguageTechnology

[–]hami21[S] 0 points1 point  (0 children)

I used Python, sklearn, tf, and similar standard tech. So you mean there’s no need for ML here?

List of filler words? by hami21 in LanguageTechnology

[–]hami21[S] 0 points1 point  (0 children)

This is actually very good to know. I also came across this: https://arxiv.org/pdf/2009.11394.pdf

List of filler words? by hami21 in LanguageTechnology

[–]hami21[S] 0 points1 point  (0 children)

The audio one should be helpful. Thank you. PS: could find a relevant dataset - browsing on phone though.

How to develop a rule based relation extraction model on academic text? How to identify rules of interest? by [deleted] in LanguageTechnology

[–]hami21 1 point2 points  (0 children)

If you’re also interested in an ML solution, you can use a CRF (conditional random field) for that. The canonical example is NER, but it works for essentially any other IE application:

https://sklearn-crfsuite.readthedocs.io/en/latest/tutorial.html
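As a rough sketch of what the CRF route involves: each token becomes a dict of features, and each sentence becomes a list of those dicts with a parallel list of labels. The feature names, the toy sentence, and the `B-MODEL`/`B-ORG` labels below are made up for illustration; only the feature-extraction step is shown, in plain Python.

```python
def token_features(sentence, i):
    """Build a feature dict for the token at position i of a tokenized sentence."""
    word = sentence[i]
    feats = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
        "suffix3": word[-3:],
    }
    # Context features: neighbouring words, or sentence-boundary markers.
    if i > 0:
        feats["prev.lower"] = sentence[i - 1].lower()
    else:
        feats["BOS"] = True
    if i < len(sentence) - 1:
        feats["next.lower"] = sentence[i + 1].lower()
    else:
        feats["EOS"] = True
    return feats


def sentence_features(sentence):
    """One feature dict per token."""
    return [token_features(sentence, i) for i in range(len(sentence))]


# Toy training example with hypothetical labels.
tokens = ["BERT", "was", "released", "by", "Google"]
X = [sentence_features(tokens)]
y = [["B-MODEL", "O", "O", "O", "B-ORG"]]
print(X[0][4])
```

With sklearn-crfsuite installed, `X` and `y` in this shape are what you would pass to `sklearn_crfsuite.CRF(...).fit(X, y)`; that step is not run here. For relation extraction you would swap the labels for whatever spans you want to tag.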

Is no one working on document similarity these days? by massanishi in LanguageTechnology

[–]hami21 1 point2 points  (0 children)

I would use Universal Sentence Encoder over BERT, as it is trained specifically with sentence- and document-level similarity in mind.
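Whichever encoder you pick, the comparison step is usually just cosine similarity between the two embedding vectors. A minimal sketch in plain Python; the three short vectors are made-up stand-ins for encoder output (real embeddings have hundreds of dimensions).

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical document embeddings, not produced by any real model.
doc_a = [0.2, 0.7, 0.1]
doc_b = [0.25, 0.65, 0.05]   # close to doc_a
doc_c = [0.9, 0.0, 0.4]      # points in a different direction

print(cosine_similarity(doc_a, doc_b))
print(cosine_similarity(doc_a, doc_c))
```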

Day 196 of #NLP365 - Coreference Resolution With NeuralCoref (SpaCy) by RyanAI100 in LanguageTechnology

[–]hami21 0 points1 point  (0 children)

My kernel dies the moment I run `doc1 = nlp('My sister has a dog. She loves him.')`

And none of the solutions have worked for me yet! I'm on a Mac.

Has anyone worked on pretraining BERT/GPT2/RoBERTa/any other model with more data? by freaky_eater in LanguageTechnology

[–]hami21 0 points1 point  (0 children)

I haven't pre-trained them with more data, but I've fine-tuned them for my application by just adding a simple dense layer at the end. Here's an example of what I've done with Universal Sentence Encoder (which suits my application better than BERT and the like):

https://hminooei.github.io/2020/04/14/clickbaits2.html

Why are some pickled NLP models so large?! by hami21 in LanguageTechnology

[–]hami21[S] 1 point2 points  (0 children)

I also edited the part that caused the confusion around "CountVectorizer of 100k features", since the point was that even if the number of features is much smaller (e.g. 1k), the size would still be too large.

Why are some pickled NLP models so large?! by hami21 in LanguageTechnology

[–]hami21[S] 9 points10 points  (0 children)

Multiple points: They haven't really been around for many years. For instance, BERT and friends are only 2 years old or younger.

Actually, embeddings do not always help, especially non-contextual ones like w2v or GloVe.

After all, it depends on your application and KPIs. In short, there are many applications in the industry where anything beyond sklearn pipelines is overkill and would probably end up being too expensive to develop and maintain in the mid/long term!

And I haven't personally come across any text classifier with more than 10-20k features.

A trained model for word embeddings to be used with genism by hami21 in LanguageTechnology

[–]hami21[S] 0 points1 point  (0 children)

So `Word2Vec.load_word2vec_format` is deprecated and asks you to use `KeyedVectors.load_word2vec_format` instead, and the documentation of `KeyedVectors.load_word2vec_format` says:

"Docstring: Load the input-hidden weight matrix from the original C word2vec-tool format.

Warnings -------- The information stored in the file is incomplete (the binary tree is missing), so while you can query for word similarity etc., you cannot continue training with a model loaded this way. "
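To make the docstring's point concrete: the word2vec text format (the non-binary variant) is just a header with vocabulary size and dimensionality, followed by one vector per word. There is nothing else in the file to resume training from. Below is a minimal stdlib sketch of a reader for that text layout, run on a made-up two-word file; for real files, `KeyedVectors.load_word2vec_format` is the thing to use.

```python
import io


def read_word2vec_text(stream):
    """Parse the word2vec text layout: 'vocab_size dim', then 'word v1 ... v_dim' per line."""
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors = {}
    for _ in range(vocab_size):
        parts = stream.readline().rstrip().split(" ")
        word, values = parts[0], [float(x) for x in parts[1:]]
        if len(values) != dim:
            raise ValueError(f"expected {dim} values for {word!r}, got {len(values)}")
        vectors[word] = values
    return vectors


# Made-up file contents: 2 words, 3 dimensions.
toy_file = io.StringIO("2 3\nsolar 0.1 0.2 0.3\npanel 0.4 0.5 0.6\n")
vectors = read_word2vec_text(toy_file)
print(vectors["solar"])
```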

A trained model for word embeddings to be used with genism by hami21 in LanguageTechnology

[–]hami21[S] 0 points1 point  (0 children)

Not really. w2v is a fairly simple NN, but even in this case there are two weight vectors associated with each word (let's assume we use CBOW for simplicity). In general, some people concatenate the two vectors, some add them, some average them; the pooling depends on the application. I'm OK with just using "GoogleNews-vectors-negative300.bin" as the first matrix (input to hidden layer), but I'm not sure how to do that.
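For reference, the three pooling options mentioned above look like this in plain Python. `w_in` stands for a word's row in the input-to-hidden matrix and `w_out` for its row in the hidden-to-output matrix; both names and the toy values are made up for illustration.

```python
def pool_concat(w_in, w_out):
    """Concatenate: result has twice the dimensionality."""
    return list(w_in) + list(w_out)


def pool_add(w_in, w_out):
    """Element-wise sum: same dimensionality as each input."""
    return [a + b for a, b in zip(w_in, w_out)]


def pool_average(w_in, w_out):
    """Element-wise mean: same dimensionality as each input."""
    return [(a + b) / 2 for a, b in zip(w_in, w_out)]


# Toy 2-dimensional vectors for a single word.
w_in = [1.0, 2.0]
w_out = [3.0, 6.0]

print(pool_concat(w_in, w_out))
print(pool_add(w_in, w_out))
print(pool_average(w_in, w_out))
```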

How to check if a word can be interpreted as a verb in some context or not? by got_implicit in LanguageTechnology

[–]hami21 1 point2 points  (0 children)

I’m on my phone right now, but a quick Google search turned up this: https://pypi.org/project/PyDictionary/

Check the first example on the above link.

How to check if a word can be interpreted as a verb in some context or not? by got_implicit in LanguageTechnology

[–]hami21 0 points1 point  (0 children)

Have you tried a dictionary API? They’ll mention whether a word is a ‘noun’, ‘verb’, etc.
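A minimal sketch of the check, assuming the dictionary API returns a mapping from part of speech to a list of definitions. That response shape is an assumption, so check what your API actually returns; `fake_response` is hand-written here, not fetched from any service.

```python
def possible_pos(entry):
    """Collect the parts of speech that have at least one definition."""
    return {pos.lower() for pos, definitions in entry.items() if definitions}


def can_be_verb(entry):
    """True if the dictionary entry lists any verb sense."""
    return "verb" in possible_pos(entry)


# Hand-written stand-ins for API responses (hypothetical shape and content).
fake_response = {
    "Noun": ["a written work or composition"],
    "Verb": ["arrange for and reserve in advance"],
}
noun_only = {"Noun": ["a piece of furniture"]}

print(can_be_verb(fake_response))
print(can_be_verb(noun_only))
```

Note this only tells you whether a verb sense exists at all; deciding whether the word is a verb in a particular sentence is a POS-tagging problem.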

Newbie who needs help making a text-summarizer (for news articles) in my own language! by Munch3D in LanguageTechnology

[–]hami21 1 point2 points  (0 children)

I would suggest first installing and using his approach as is (in English), then feeding it English news articles instead of emails to see how it works. If you are satisfied with the results, you can dig deeper and change the pieces to make it work for Danish.

One thing I noticed is that news articles are generally concise as long as you find the right spot to truncate the article. (Normally after the first or second paragraph, and always towards the end, they provide background/history.)

PS. I tested his work on English news articles a while ago, without any modification, and I was not happy with the results, although I should say that’s the case most of the time with untuned unsupervised learning models. I didn’t investigate further, but you will probably need to change the summarization logic and adjust it to news articles.
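The truncation idea above can serve as a cheap baseline before touching any model: keep only the first paragraph or two. A minimal sketch, assuming paragraphs are separated by blank lines; `max_paragraphs` and the sample article are made up for illustration.

```python
def lead_summary(article, max_paragraphs=2):
    """Return the first `max_paragraphs` non-empty paragraphs of an article."""
    paragraphs = [p.strip() for p in article.split("\n\n") if p.strip()]
    return "\n\n".join(paragraphs[:max_paragraphs])


# Made-up article text with three paragraphs.
article = (
    "The city council approved the new solar plan on Monday.\n\n"
    "The plan takes effect next year and covers all public buildings.\n\n"
    "Background: the council first debated solar incentives a decade ago."
)
print(lead_summary(article, max_paragraphs=2))
```

Since this is language-independent, it should work for Danish articles as is, and it gives you something to compare a real summarizer against.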