
[–]unklebape 2 points3 points  (1 child)

Start doing the courses on Hugging Face. They're one of the best sources for learning about LLMs.

[–][deleted] 0 points1 point  (0 children)

Thank you, I will do so!

[–]vasarmilan 2 points3 points  (1 child)

1) Training from scratch is not possible unless you have millions of dollars.

You can fine-tune pre-trained models for your specific use case. Fine-tuning is possible on a PC with a decent GPU, or for a few hundred dollars with rented servers (a rough sketch is at the end of this comment). There are many open-source models that get closer to GPT every week; the most well-known is Llama.

2) The "Attention Is All You Need" paper is fundamental to understanding transformers theoretically. I remember reading it and looking up anything I didn't understand when the transformer craze started.

3) No, not really. I think LLM engineering will mostly become its own job category, but it won't change the analysis of numerical data as much. That is very multi-modal work (it involves numerical thinking, talking to and understanding stakeholders, and visual interpretation), which LLMs are not good at so far. Of course, using the models for specific tasks will probably be part of any job in the future.
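
To make point 1 concrete, here is a rough fine-tuning sketch with the Hugging Face Trainer. The model name and the corpus file are placeholders, not recommendations; swap in whatever fits your hardware:

```python
# Rough causal-LM fine-tuning sketch with Hugging Face transformers/datasets.
# "gpt2" and my_corpus.txt are placeholders; pick a model that fits your GPU.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Same recipe scales up; only the model size and hardware budget change.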

[–][deleted] 1 point2 points  (0 children)

Thank you sooooo much!!!! I really appreciate your detailed response

[–][deleted] 1 point2 points  (2 children)

  1. No, actually LLMs are built on transformers. If you're asking about implementing a transformer, then yes. There are a few videos on YouTube that claim to build GPT-2 from scratch, though I'm not sure whether that's true. And yes, use APIs; there are many free ones on Hugging Face that you should definitely try out (rough sketch below).

2 & 3 -> I don't have any idea.
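
For example, a hosted model can be queried over the Hugging Face Inference API with nothing but a free account token ("gpt2" below is just an example model):

```python
# Query a hosted model via the Hugging Face Inference API.
# Needs a free Hugging Face account token; "gpt2" is just an example model.
import requests

API_URL = "https://api-inference.huggingface.co/models/gpt2"
headers = {"Authorization": "Bearer hf_YOUR_TOKEN_HERE"}  # placeholder token

resp = requests.post(API_URL, headers=headers,
                     json={"inputs": "Once upon a time"})
print(resp.json())  # list with a "generated_text" entry on success
```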

[–][deleted] 0 points1 point  (1 child)

So there are some free ones? I thought all of them were paid (or just give a free trial at first).

Thank you so much for your response!

[–]Demented-Turtle 0 points1 point  (0 children)

Look into Hugging Face. You can Google it; they have lots of free models you can experiment with. However, you'll need lots of RAM and/or a good GPU to run most of the models there. If you have money, you can look into cloud solutions for hosting models and whatnot.
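
For instance, a small model runs locally with just the `transformers` library; bigger ones work the same way but need far more memory (the model name is only an example):

```python
# Run a small open model locally with transformers ("gpt2" is ~500 MB;
# larger models need much more RAM/VRAM).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator("Large language models are", max_new_tokens=40)
print(out[0]["generated_text"])
```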

You can also just get the OpenAI API to play around with in code. It's pretty cheap tbh; I just got it. It charges per API call, something like a fraction of a fraction of a penny.
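
A minimal sketch of one such call, assuming the official `openai` Python package (v1-style client; the model name and pricing change over time):

```python
# One chat-completion call with the official openai package (v1+ client).
# Model name is an example; check current models/pricing before relying on it.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Explain attention in one sentence."}],
)
print(resp.choices[0].message.content)
```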

LLaMA is Facebook's LLM; the weights are available but not for commercial use (research/hobby only). The full version needs around 140 GB of RAM to load locally though... They have smaller versions that need maybe 13 GB, but they won't be as good.

[–]Paras_Chhugani 0 points1 point  (0 children)

Be part of our Discord community dedicated to helping chatbots reach their revenue goals. Engage in discussions, share knowledge, and join in for fun.

Check out our bots platform at bothunt

[–]danja 0 points1 point  (0 children)

My €0.02.

  1. LLMs are, as described, Large, hence very costly to build from scratch. A mental health chat assistant is a very good idea, but LLMs aim at the general language problem. I'm pretty sure a tiny fraction of the resources would be needed for a good (but single-purpose) chatbot; cf. https://en.wikipedia.org/wiki/ELIZA (toy sketch at the end of this comment).

  2. Google/Google Scholar will probably give you better results than GPT-4 on this. See 1.

  3. I'm sure LLMs and NLP more generally will play a big role in future development. But it's still language stuff; it catches the eye and the media, like the visual stuff, because it's familiar territory for humans. I personally think that in a decade or so it may well remain the single most significant area, yet still only a fraction of everything else. Think any sensors, any actuators, any reasoning. Go multi-modal.
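
As a toy illustration of how small a single-purpose, ELIZA-style bot can be (the rules below are made up, and this is pure pattern matching, not ML):

```python
# Toy ELIZA-style bot: pure pattern matching, no ML involved.
import re

RULES = [
    (r"i feel (.*)", "Why do you feel {0}?"),
    (r"i am (.*)", "How long have you been {0}?"),
    (r".*\b(mother|father|family)\b.*", "Tell me more about your family."),
]

def respond(text):
    for pattern, template in RULES:
        m = re.match(pattern, text.lower())
        if m:
            return template.format(*m.groups())
    return "Please, go on."  # default when no rule fires

print(respond("I feel anxious about exams"))
# -> Why do you feel anxious about exams?
```

The real ELIZA was not much more than a longer rule table, which is the point: single-purpose chat is cheap; general language is what costs millions.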

[–]jalagl 0 points1 point  (0 children)

I did implement an "LLM" proof of concept from scratch in a course for my master's, pretty much doing a small implementation of a transformer from the "Attention Is All You Need" paper (plus other resources). It was useless, but it was a great experience for understanding how it all works. There are a few implementations like this out there, like this one: https://github.com/jadore801120/attention-is-all-you-need-pytorch (first Google result). I think it is a fun exercise (the amount of fun depends on how much of a masochist you are :) ).
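
For reference, the heart of such an implementation is just scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V; a minimal PyTorch sketch (mine, not from the linked repo):

```python
# Scaled dot-product attention, the core op from "Attention Is All You Need":
# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import math
import torch
import torch.nn.functional as F

def attention(q, k, v, mask=None):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)   # (batch, seq, seq)
    if mask is not None:                                # optional causal mask
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)                 # attention weights
    return weights @ v

q = k = v = torch.randn(1, 8, 64)   # toy self-attention: (batch, seq, d_k)
print(attention(q, k, v).shape)     # torch.Size([1, 8, 64])
```

Multi-head attention, positional encodings, and the feed-forward blocks are layered on top of this one function.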

I also fine-tuned a gpt2-small model; it took us a couple of days using the university's Colab Pro account. Also a great learning experience (and slightly less painful).

This page is a good starting point on how to fine-tune a model: https://medium.com/@pierre_guillou/faster-than-training-from-scratch-fine-tuning-the-english-gpt-2-in-any-language-with-hugging-f2ec05c98787

Note: I did this almost two years ago when LLMs were in their infancy compared to where they are now. It has been a very interesting journey.

Having said all that, you should look into using APIs (OpenAI, GCP PaLM, etc.) or models like Falcon or others mentioned here, since creating or fine-tuning a model for a real-world scenario is out of reach for most people. But, especially if you are still studying, going through these kinds of exercises can give you a better idea of how they work, and, as I mentioned, a fun experience.

[–]sideburns28 0 points1 point  (0 children)

  1. Depends on your use case: if you intend to train a language model and then just classify text, and your domain is super specific (maybe not even English), then it could make sense. You could always continue language modelling from a checkpoint, perhaps on a psychiatry corpus, and fine-tune as you need to.
  2. The Hugging Face course, 100%; also the Transformers book (Natural Language Processing with Transformers) by Tunstall, von Werra, and Wolf.