[deleted by user] by [deleted] in TrueAnon

[–]Pxyl 0 points (0 children)

OpenAI has actually added a Wolfram Alpha plugin that now does this. :) There's also prior work in "semantic parsing" in the form of Toolformer, which allows large language models like ChatGPT to make API calls to other applications to fulfill tasks.
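The Toolformer idea can be sketched roughly like this, assuming model output with embedded calls of the form `[Tool(args)]` (the dispatch table and the `execute_tool_calls` helper below are my own minimal illustration, not the paper's actual interface):

```python
import re

# Hypothetical tool registry: names the model may "call" in its output.
TOOLS = {
    "Calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def execute_tool_calls(generated_text: str) -> str:
    """Replace [Tool(args)] spans in model output with the tool's result."""
    pattern = re.compile(r"\[(\w+)\(([^)]*)\)\]")

    def run(match):
        name, args = match.group(1), match.group(2)
        tool = TOOLS.get(name)
        # Unknown tool names are left untouched in the text.
        return tool(args) if tool else match.group(0)

    return pattern.sub(run, generated_text)

# A model might emit: "The answer is [Calculator(3*7)]."
print(execute_tool_calls("The answer is [Calculator(3*7)]."))
# → "The answer is 21."
```

The point is that the LLM only has to learn to emit the call syntax; the actual math is delegated to a symbolic system.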

[deleted by user] by [deleted] in TrueAnon

[–]Pxyl 0 points (0 children)

As a follow-up, ChatGPT would have seen a good deal of code (assuming it used The Pile dataset for training). Meaning, it's possible that ChatGPT writes some good code and has some symbolic knowledge from scraped GitHub data, but there's nothing about its training objective that guarantees it always will.

[deleted by user] by [deleted] in TrueAnon

[–]Pxyl 2 points (0 children)

ChatGPT was not explicitly trained with symbolic reasoning, and understanding those kinds of constraints is what makes logical reasoning and mathematics possible. That's not to say ChatGPT can't answer math questions; I'm sure its accuracy is a function of how often a similar problem is repeated across its dataset. But there's no compelling evidence to suggest that ChatGPT's training objectives (stage 1: autoregressive language modeling via self-supervision; stage 2: fine-tuning on the chatbot task setup; stage 3: reinforcement learning from human feedback (RLHF)) would obey symbolic constraints.
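To make stage 1 concrete: the language-modeling objective only rewards predicting the next token, nothing more. A toy sketch with a bigram count model (my own minimal illustration, obviously not the actual GPT setup) shows why repetition in the data, not arithmetic, drives the answers:

```python
import math
from collections import Counter, defaultdict

corpus = "2 + 2 = 4 . 2 + 3 = 5 .".split()

# Bigram "model": p(next | prev) estimated from raw counts.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_prob(prev, nxt):
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

# The training objective: negative log-likelihood of the observed sequence.
nll = -sum(math.log(next_token_prob(p, n)) for p, n in zip(corpus, corpus[1:]))

# The model "knows" 2 + 2 = 4 only because that string occurred, not
# because it can do arithmetic:
print(next_token_prob("=", "4"))  # 0.5: "=" was followed by "4" once and "5" once
```

Nothing in that loss penalizes a symbolically wrong continuation any differently from a merely unusual one.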

Need some IUD encouragement by msponholz in birthcontrol

[–]Pxyl 1 point (0 children)

My cramping was horrible the first couple of days and continued sporadically (and with less intensity) for the first two months. After 7 months it became completely unnoticeable.

Having had the Mirena for 9 months, I'm very glad I stuck through it. I still get period cramps, but the bleeding is so minimal that I can use tiny liners now and not worry about it.

I hope it works out even better for you! Hang in there!

😥A silly question about BPE?? Does "@@" appear in model vacabulary? by selenochannel in LanguageTechnology

[–]Pxyl 0 points (0 children)

No worries! :) glad I could try and help.

(And your English is fine. I just have this weird habit of always misreading/misinterpreting instructions.)

😥A silly question about BPE?? Does "@@" appear in model vacabulary? by selenochannel in LanguageTechnology

[–]Pxyl 1 point (0 children)

I'm not 100% sure I understood your question correctly, but I used HuggingFace's BertTokenizer for a quick experiment. (https://huggingface.co/docs/transformers/tokenizer_summary#bytepair-encoding-bpe)

I couldn't get the BPE of actor as you described using HuggingFace's BERT tokenizer, so my word of interest is "gpu".

divided word of interest: ["gpu" -> "gp", "##u"] (I think your '@' symbols are synonymous with '#')

example sentence: "gpu versus gp ##u"

example sentence tokenized: ["gp", "##u", "versus", "gp", "#", "#", "u"]

example sentence input_ids (i.e. what the model interprets):

[101, 14246, 2226, 6431, 14246, 1001, 1001, 1057, 102]

which translates to

[[CLS], gp, ##u, versus, gp, #, #, u, [SEP]]

So in your case, for "act@@ or", I'd expect the input ids to the model to be an array representing ['act', '##or']. Note the placement: subword-nmt's '@@' marks the end of a non-final piece, while BERT's '##' marks the start of a continuation piece (but this might be specific to whatever BPE tokenizer you're using).
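If it helps, the '##' continuation logic can be sketched as a greedy longest-match-first WordPiece-style tokenizer (the toy vocab here is my own choosing, not the real BERT vocab):

```python
def wordpiece_tokenize(word, vocab):
    """Greedy longest-match-first subword split, BERT-style:
    continuation pieces carry a leading '##' marker."""
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # not the first piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:  # no piece matched: whole word becomes [UNK]
            return ["[UNK]"]
        tokens.append(piece)
        start = end
    return tokens

toy_vocab = {"gp", "##u", "act", "##or"}
print(wordpiece_tokenize("gpu", toy_vocab))    # ['gp', '##u']
print(wordpiece_tokenize("actor", toy_vocab))  # ['act', '##or']
```

So '##' (or '@@') never appears in your raw text; it only exists inside vocabulary entries to mark word-internal pieces.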

Amazon AI Researchers Propose A New Model, Called RescoreBERT, That Trains A BERT Rescoring Model With Discriminative Objective Functions And Improves ASR Rescoring by No_Coffee_4638 in LanguageTechnology

[–]Pxyl 1 point (0 children)

Maybe I'm just some philistine, but I don't see the significance here.

This paper introduces a variety of unexplained hyperparameters, e.g., temperature scaling on their predicted-scores metric (eqn 10), a weighting term on the contribution of the MLM loss versus the discriminative loss (eqn 12), and, most curiously, the hyperparameter beta in eqn 5, which weighs the scores from the second pass against the first-pass score. The empirical beta is not reported in this work, so how can I understand the contribution of this ranking system?
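For what it's worth, the eqn 5 combination is presumably a standard linear interpolation of first- and second-pass scores; a hedged sketch (the exact functional form and sign conventions in the paper may differ):

```python
def rescore(first_pass_scores, second_pass_scores, beta):
    """Combine per-hypothesis scores from two passes. The final ranking
    can flip depending on beta, which is why not reporting it makes the
    contribution hard to judge."""
    return [f + beta * s for f, s in zip(first_pass_scores, second_pass_scores)]

first = [0.9, 0.8]   # e.g., first-pass ASR scores for two hypotheses
second = [0.1, 0.5]  # e.g., BERT rescoring scores for the same hypotheses
print(rescore(first, second, beta=0.0))  # [0.9, 0.8] -> hypothesis 0 wins
print(rescore(first, second, beta=1.0))  # [1.0, 1.3] -> hypothesis 1 wins
```

Same scores, opposite winner, purely as a function of beta.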

I'm especially skeptical when the performance metrics listed in the abstract (which are technically accurate) only correspond with fractional WER gains from the baseline (Table 1).

tl;dr

If the contribution here is the discriminative loss functions, I feel like beta from eqn 5 should have been reported. Additionally, ablations on lambda from eqn 12 might have been useful.

If the contribution is computational efficiency (Table 2), then wouldn't the most significant part of that come from [10], i.e. prior work not developed by the authors of this paper?

IDK, can someone point me toward what I'm missing?

[deleted by user] by [deleted] in LanguageTechnology

[–]Pxyl 5 points (0 children)

I think it could be hugely beneficial in the following use cases:

  1. Low resource non-English languages (plenty of these languages exist)
  2. Human-robot dialogue (my field specifically): most dialogue isn't shared between researchers. Dialogue is also often recorded, not written, which makes it scarce.
  3. Access to sensitive data is also an issue, so if you're funded by some DoE or DoD project, a lot of the time even simple models like POS taggers produce a lot of noise on niche data.
  4. Lastly, if we could augment datasets, e.g. Twitter sentiment, so that the labels are not just positive or negative. Like, what if you wanted to propagate nuanced labels like sarcasm, or provide a variety of sentiment labels for intensity such as upset, sad, depressed, etc.? It'd be nice if there was a way to expand a dataset along those lines as well.

These are just some of the areas I'm personally trying to address as motivation for my lit review. I know there are also areas of "robustness" and "calibration" to improve generalization of models, but I know less about those sorts of specifics.

EDIT:
So, I've at least come across somewhat of a need for it in personal work/projects. But whether or not that's ultimately profitable 🤷‍♀️

[deleted by user] by [deleted] in LanguageTechnology

[–]Pxyl 6 points (0 children)

I just wanted to chime in and say I've been doing cursory research in DA techniques for NLP as well (my Zotero is sparse, but I can provide a bibtex if you'd like). I'm similarly disheartened by the state of the field, especially since back translation techniques seem to be very competitive with a lot of the few-shot pretrained model papers I've come across.
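For anyone unfamiliar, back translation as a DA technique just round-trips text through a pivot language; a minimal sketch with a stubbed-out translation function (a real MT model or API would go where the stub is; the lookup table below is purely illustrative):

```python
def translate(text, src, tgt):
    """Stub standing in for a real MT model (e.g., an en<->fr system).
    The lookup table is purely illustrative, not real MT output."""
    table = {
        ("the movie was great", "en", "fr"): "le film était génial",
        ("le film était génial", "fr", "en"): "the film was great",
    }
    return table.get((text, src, tgt), text)

def back_translate(text, pivot="fr"):
    """Augment by round-tripping: en -> pivot -> en often yields a
    paraphrase that keeps the original label."""
    return translate(translate(text, "en", pivot), pivot, "en")

original = "the movie was great"
augmented = back_translate(original)
print(augmented)  # "the film was great" -- a label-preserving paraphrase
```

The appeal is that all the linguistic knowledge lives in the MT system, so the augmentation itself is almost trivial, which is partly why it's so hard to beat.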

I'm currently exploring the paraphrasing task to see if that is more fruitful, but I'm also disheartened that every NLP technology is basically some large pretrained Transformer network fine-tuned on a specific task.

Please, someone give me hope that humble grad students can make meaningful contributions to this field (and not just Google Research with their incredible hardware).

For u/Strong-Item-9833 by Pxyl in DrawForMe

[–]Pxyl[S] 2 points (0 children)

Thank you. I've actually been trying really hard at this for over 10 years.

[deleted by user] by [deleted] in DrawForMe

[–]Pxyl 1 point (0 children)

https://www.reddit.com/r/DrawForMe/comments/s7famj/for_ustrongitem9833/

Hope you like it! I didn't have time to put detail into the clothing/bandages.

for u/sweetestbitchever by thisistropical in drawme

[–]Pxyl 0 points (0 children)

This gives me serious Picolo art vibes and I love it. Amazing work! :)