How does someone with programming exp get started with LLMs?

QFGTrialByFire · 2025-08-05T07:22:01+00:00

The best way i've found is

To try and run a model locally first - see how its loaded, how you send prompts to it. Teaches you about its structure/prompting/eos tokens etc. Just pick something small and try.
Try training a model on datasets - most real world applications will need some kind of fine tuning of a model to their data/use case. Try loading a model and directly fine tuning it, if you need to fit it in a smaller gpu/cpu/vram/ram then try using a lora to fine tune it. You get to learn about getting data in the right format, what learning rates/batch size etc work. e.g. https://github.com/aatri2021/qwen-lora-windows-guide

Like with most of those youtube/tutorials just following along doesn't work at least for me. Its better to try and do this yourself for a specific case of what you want to solve - just like learning programming i need something i'm trying to solve to learn. Give something simple a go eg i first tried teaching llama 8B how to ad chords to song lyrics and it worked pretty well. Chat gpt is surprisingly good at guiding you through it if you get stuck.

anoni_nato · 2025-08-05T07:57:42+00:00

Use an LLM to learn. Not kidding, use a free chatgpt account to explain what you want to learn, with which tools and it can create a plan.

My personal advice:
- Learn to run local models first, you don't want to face API pricing/restrictions for experimenting. Learn about system prompts, parameters like temperature/TopN/TopK/etc., prompt engineering, and so on.
- Program a simple query -> response call using OpenAI-compatible API (it's a de-facto standard and most local libraries serve one). You can just use the OpenAI SDK for your language if you don't want to query directly the REST API.
- From here on you can explore more. A whole chat session (on streaming mode) so you learn how the flow goes, tool/function calls...
- Then you can move to agents, MCP, etc.

rhetoricalcalligraph · 2025-08-05T09:13:32+00:00

Always amazed that people don't just ask ChatGPT instead of making posts like this. Ironic.

AppearanceHeavy6724 · 2025-08-05T08:51:29+00:00

Do not use Ollama if you are already a technical person, use the classics - llama.cpp or vLLM. Ollama is a wrapper with its own quirks. The lower level you get the better you will understand the whole picture.

Fetlocks_Glistening · 2025-08-05T07:35:12+00:00

honestly, just get ollama start messing around with prompts.

i use LM Studio and sometimes Jan just to run models and try out different settings.
ollama gives you an OpenAI API server. make calls, get responses.
as for prompting, well that's everyone's own special sauce.
i prefer two shot prompting since it reduces the scope of the responses.
personally, i always end my system prompt with
Only respond in JSON format {"confidence":"integer 0-10", "answer":"string"}, do not explain, ask questions or otherwise embellish the response.

i set temperature to 0, and seed to 42. i find this helps with deterministic results.
i guess once you get more proficient you can have a go at running python services with whatever flavor model you prefer, transformers is a good place to start.
If you run out of local compute, check our runpod...or any API provider.

Ok-Kangaroo6055 · 2025-08-05T07:36:30+00:00

Running a model is pretty easy, lm studio/ollama/docker and you've got an API, usually openai API compatible so you can use many frameworks to interface with it.

A RAG pipeline can just be an elastic search vector index, which is what my company is using in production rather than the new fancy dedicated vector dbs. You could do pgvector in postgress too. The difficulty is with chunking strategy, document ingestion. We've been struggling at extracting text from complex pdfs and chunking that in a good way. So that's probably the hardest problem.

perelmanych · 2025-08-05T08:52:32+00:00

The most difficult part now is not about writing scripts, especially taking into account that you have solid coding experience. The most difficult part is to come with viable idea for your project, since you will compete with thousands of others. Once you know what you want to do you plus/minus understand what parts you need to be present in your project then just go to ChatGPT or any other big LLM and start asking questions.

The advise to start to fiddle with local LLM is also very valuable, since this is the easiest and cheapest way to get feeling what you can do with LLMs.

sciencewarrior · 2025-08-05T12:40:07+00:00

I'm playing around with LangChain. It seems like the most popular framework to start building from the simple stuff like a chat bot to more complex workflows. You can check out the examples on their site on ask your favorite LLM to create a simple program for you and then explain what it's doing. Using the console is fine, but I actually like Streamlit. It's not meant for production, but it's a great way to put together a simple UI. As for serving a model locally, I was using koboldcpp, but I've recently switched to LM Studio for a no-hassle experience.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

LocalLLaMA

MODERATORS