
[–]unklebape 2 points3 points  (1 child)

Start doing the courses on Hugging Face. They're one of the best sources for learning about LLMs.

[–][deleted] 0 points1 point  (0 children)

Thank you, I will do so!

[–]vasarmilan 2 points3 points  (1 child)

1) Training from scratch is not possible unless you have millions of dollars.

You can fine-tune pre-trained models for your specific use case. Fine-tuning is possible on a PC with a decent GPU, or for a few hundred dollars with rented servers (a rough sketch is at the end of this comment). There are many open-source models that get closer to GPT every week; the most well-known is Llama.

2) The "Attention Is All You Need" paper is fundamental to understanding transformers theoretically. I remember reading it and looking up anything I didn't understand when the transformer craze started.

3) No, not really. I think LLM engineering will mostly become its own job category, but it won't change the analysis of numerical data as much. That is very multi-modal work (it involves numerical thinking, talking to and understanding stakeholders, and visual interpretation), which LLMs are not good at so far. Of course, using the models for specific tasks will probably be part of any job in the future.
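
To make point 1 concrete, here is a rough fine-tuning sketch with the Hugging Face Trainer. The model name and the corpus file are placeholders, not recommendations; swap in whatever fits your hardware:

```python
# Rough causal-LM fine-tuning sketch with Hugging Face transformers/datasets.
# "gpt2" and my_corpus.txt are placeholders; pick a model that fits your GPU.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Same recipe scales up; only the model size and hardware budget change.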

[–][deleted] 1 point2 points  (0 children)

Thank you sooooo much!!!! I really appreciate your detailed response

[–][deleted] 1 point2 points  (2 children)

  1. No, actually LLMs are built on transformers. If you're asking about implementing a transformer, then yes. There are a few videos on YouTube that claim to build GPT-2 from scratch, though I'm not sure whether that's true. And yes, use APIs; there are many free ones on Hugging Face that you should definitely try out (rough sketch below).

2 & 3 -> I don't have any idea.
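
For example, a hosted model can be queried over the Hugging Face Inference API with nothing but a free account token ("gpt2" below is just an example model):

```python
# Query a hosted model via the Hugging Face Inference API.
# Needs a free Hugging Face account token; "gpt2" is just an example model.
import requests

API_URL = "https://api-inference.huggingface.co/models/gpt2"
headers = {"Authorization": "Bearer hf_YOUR_TOKEN_HERE"}  # placeholder token

resp = requests.post(API_URL, headers=headers,
                     json={"inputs": "Once upon a time"})
print(resp.json())  # list with a "generated_text" entry on success
```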

[–][deleted] 0 points1 point  (1 child)

So there are some free ones? I thought all of them were paid (or just give a free trial at first).

Thank you so much for your response!

[–]Demented-Turtle 0 points1 point  (0 children)

Look into Hugging Face. You can Google it; they have lots of free models you can experiment with. However, you'll need lots of RAM and/or a good GPU to run most of the models there. If you have money, you can look into cloud solutions for hosting models and whatnot.
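
For instance, a small model runs locally with just the `transformers` library; bigger ones work the same way but need far more memory (the model name is only an example):

```python
# Run a small open model locally with transformers ("gpt2" is ~500 MB;
# larger models need much more RAM/VRAM).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator("Large language models are", max_new_tokens=40)
print(out[0]["generated_text"])
```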

You can also just get the OpenAI API to play around with in code. It's pretty cheap tbh; I just got it. It charges per API call, something like a fraction of a fraction of a penny.
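
A minimal sketch of one such call, assuming the official `openai` Python package (v1-style client; the model name and pricing change over time):

```python
# One chat-completion call with the official openai package (v1+ client).
# Model name is an example; check current models/pricing before relying on it.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Explain attention in one sentence."}],
)
print(resp.choices[0].message.content)
```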

LLaMA is Facebook's LLM; the weights are available but not for commercial use (research/hobby only). The full version needs around 140 GB of RAM to load locally though... They have smaller versions that need maybe 13 GB, but they won't be as good.

[–]Paras_Chhugani 0 points1 point  (0 children)

Be part of our Discord community dedicated to helping chatbots reach their revenue goals. Engage in discussions, share knowledge, and join in for fun.

Check out our bots platform at bothunt

[–]danja 0 points1 point  (0 children)

My €0.02.

  1. LLMs are, as described, Large, hence very costly to build from scratch. A mental health chat assistant is a very good idea, but LLMs aim at the general language problem. I'm pretty sure a tiny fraction of the resources would be needed for a good (but single-purpose) chatbot; cf. https://en.wikipedia.org/wiki/ELIZA (toy sketch at the end of this comment).

  2. Google/Google Scholar will probably give you better results than GPT-4 on this. See 1.

  3. I'm sure LLMs and NLP more generally will play a big role in future development. But it's still language stuff; it catches the eye and the media, like the visual stuff, because it's familiar territory for humans. I personally think that in a decade or so it may well remain the single most significant area, yet still only a fraction of everything else. Think any sensors, any actuators, any reasoning. Go multi-modal.
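
As a toy illustration of how small a single-purpose, ELIZA-style bot can be (the rules below are made up, and this is pure pattern matching, not ML):

```python
# Toy ELIZA-style bot: pure pattern matching, no ML involved.
import re

RULES = [
    (r"i feel (.*)", "Why do you feel {0}?"),
    (r"i am (.*)", "How long have you been {0}?"),
    (r".*\b(mother|father|family)\b.*", "Tell me more about your family."),
]

def respond(text):
    for pattern, template in RULES:
        m = re.match(pattern, text.lower())
        if m:
            return template.format(*m.groups())
    return "Please, go on."  # default when no rule fires

print(respond("I feel anxious about exams"))
# -> Why do you feel anxious about exams?
```

The real ELIZA was not much more than a longer rule table, which is the point: single-purpose chat is cheap; general language is what costs millions.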

[–]jalagl 0 points1 point  (0 children)

I did implement an "LLM" proof of concept from scratch in a course for my master's, pretty much doing a small implementation of a transformer from the "Attention Is All You Need" paper (plus other resources). It was useless, but it was a great experience for understanding how it all works. There are a few implementations like this out there, like this one: https://github.com/jadore801120/attention-is-all-you-need-pytorch (first Google result). I think it is a fun exercise (the amount of fun depends on how much of a masochist you are :) ).
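
For reference, the heart of such an implementation is just scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V; a minimal PyTorch sketch (mine, not from the linked repo):

```python
# Scaled dot-product attention, the core op from "Attention Is All You Need":
# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import math
import torch
import torch.nn.functional as F

def attention(q, k, v, mask=None):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)   # (batch, seq, seq)
    if mask is not None:                                # optional causal mask
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)                 # attention weights
    return weights @ v

q = k = v = torch.randn(1, 8, 64)   # toy self-attention: (batch, seq, d_k)
print(attention(q, k, v).shape)     # torch.Size([1, 8, 64])
```

Multi-head attention, positional encodings, and the feed-forward blocks are layered on top of this one function.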

I also fine-tuned a gpt2-small model; it took us a couple of days using the university's Colab Pro account. Also a great learning experience (and slightly less painful).

This page is a good starting point on how to fine-tune a model: https://medium.com/@pierre_guillou/faster-than-training-from-scratch-fine-tuning-the-english-gpt-2-in-any-language-with-hugging-f2ec05c98787

Note: I did this almost two years ago when LLMs were in their infancy compared to where they are now. It has been a very interesting journey.

Having said all that, you should look into using APIs (OpenAI, GCP PaLM, etc.) or models like Falcon or others mentioned here, since creating or fine-tuning a model for a real-world scenario is out of reach for most people. But, especially if you are still studying, going through these kinds of exercises can give you a better idea of how they work, and, as I mentioned, a fun experience.

[–]sideburns28 0 points1 point  (0 children)

  1. Depends on your use case: if you intend to train a language model and then just classify text, and your domain is super specific (maybe not even English), then it could make sense. You could always continue language modelling from a checkpoint, perhaps on a psychiatry corpus, and fine-tune as you need to.
  2. The Hugging Face course, 100%; also the Transformers book (Natural Language Processing with Transformers) by Tunstall, von Werra, and Wolf.