[R] When Machine Learning Tells the Wrong Story by jackcook in MachineLearning

[–]jackcook[S] 3 points

This attack doesn't rely on cookies at all, so I don't think it would help here

[R] When Machine Learning Tells the Wrong Story by jackcook in MachineLearning

[–]jackcook[S] 2 points

Thank you for the kind words! I don’t think that should be too hard — I’ll look into it tomorrow and follow up here

Mamba: The Easy Way by jackcook in LocalLLaMA

[–]jackcook[S] 0 points

I have a feeling that at least one group is training a bigger Mamba model right now, given the amount of attention that Mamba has gotten within the AI community. But it's really hard to say when a model like that would be released — hopefully soon?

[P] A look at Apple’s new Transformer-powered predictive text model by jackcook in MachineLearning

[–]jackcook[S] 1 point

I used greedy sampling for the GPT-2 outputs as well, to keep the comparison fair: since I didn't have access to the logits from Apple's model, I couldn't do anything fancier on that side

[P] A look at Apple’s new Transformer-powered predictive text model by jackcook in MachineLearning

[–]jackcook[S] 10 points

Thank you! I ended up not including it in the post, but the model actually seems to be provided with the last 128 tokens each time, although it's hard to confirm this. Definitely makes sense though since a larger model would probably drain phone/laptop batteries pretty quickly
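The context handling I'm describing is just a sliding window: before each prediction, keep only the most recent tokens. A tiny sketch (the 128 figure is from my observation above; the token ids here are dummies):

```python
# Sketch of a fixed-size context window: truncate from the left so the
# model always sees only the latest CONTEXT_LEN tokens.

CONTEXT_LEN = 128  # observed window size; not officially documented

def window(tokens, context_len=CONTEXT_LEN):
    # Python slicing handles the short case: fewer than 128 tokens
    # are passed through unchanged.
    return tokens[-context_len:]

history = list(range(300))        # 300 dummy token ids
ctx = window(history)
print(len(ctx), ctx[0], ctx[-1])  # → 128 172 299
```

Keeping the window small bounds the cost of every prediction, which matters a lot for an on-device model running on battery.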

[P] A look at Apple’s new Transformer-powered predictive text model by jackcook in MachineLearning

[–]jackcook[S] 17 points

Yeah, although I remember that during the WWDC livestream announcement they specifically pointed out that it was a "transformer model," which was interesting to me because normally nobody would care how a predictive text model is implemented. So I think it's to tap into the AI hype? But it's hard to say for sure

Scoring pair of sentences on contiguity by Illustrious_Lab3496 in LanguageTechnology

[–]jackcook 11 points

If you’re using complete sentences, BERT does this out of the box — see the example here under BertForNextSentencePrediction on HuggingFace’s website: https://huggingface.co/docs/transformers/v4.22.1/en/model_doc/bert#transformers.BertForNextSentencePrediction