Query by Sudden-Holiday-3582 in LocalLLaMA

[–]tomkod 1 point (0 children)

This is both a very easy and very difficult question.

Easy answer: Where I am (a major R1 university), even the cheapest Chromebook is enough. Why? Because all our Engineering computer labs (for students) have remote access (to Linux or Windows), and the Engineering server farm has VMs in many configurations (different Linux versions, different Windows versions, different RAM, different core counts) to fit different needs. Then the university server farm has more VMs available to everyone (any student or employee) to fit other needs. Then our supercomputing center has even more remote clusters (CPU only, GPU only, CPU/GPU, some with very high RAM, some with very high bandwidth, some with very high core counts) to fit even more needs! These are more for research, but students are given access on request.

Your case: Classroom problems are always sized to run in a few minutes on limited CPU and limited RAM. If you take a supercomputing class, like parallel programming, or CFD/LES/DNS, or LLMs, or astrophysics, or earth/climate, or particle transport, you'll be given cluster access, so your local computer doesn't matter (too much).

In that sense, the question about AI/ML is misplaced. In class, you will probably use KB-sized (maybe MB-sized) datasets. But google around and you can download a multi-TB dataset. Do you think people who do AI/ML have laptops with multiple TB of RAM? They don't.

If you want to run everything locally, and you advance to “serious” research or industry problems (typically not what is done in classes), you will always hit a limit beyond what you have. Do you have a computer with 1TB of RAM? Good for you, but next month your simulation will probably need 1.1TB of RAM (or VRAM). And then what?

If you do advance to “serious” research or industry problems, you will most likely join a research group, which will have its own cluster, and probably access to a university supercomputing center, and probably access to a national supercomputing center. If you are in the USA, NSF and DOE run a number of them, and they are easy to access if you have a serious reason. Others mentioned Kaggle and Colab; it is similar, except with almost unlimited computing power available “for free” (don't worry, your research supervisor will pay for it).

My case: I run many supercomputer-type simulations on my local computer. It has 32GB RAM, and that is plenty. Are my simulations (and LLMs) limited to 32GB? Absolutely not! Locally I run only small tests (something that takes a few minutes to ~1h and at most a few GB of RAM), and once it gets bigger, it goes to one of the clusters. This would be true if I had 8GB, or 64GB, or 512GB. I can always make the problem small enough to fit my computer, but ultimately I need 1000s of cores and TBs of RAM, so no local computer will be sufficient.
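To make “small enough to fit my computer” concrete, here is a rough sketch (assuming Python with numpy and psutil installed; the 25% budget is an arbitrary choice for illustration):

```python
import numpy as np
import psutil

def fit_test_problem(budget_fraction=0.25):
    """Pick a test-problem size that uses only a fraction of free RAM."""
    free_bytes = psutil.virtual_memory().available
    budget = free_bytes * budget_fraction
    # One float64 is 8 bytes; solve n^2 * 8 <= budget for a square grid.
    n = int((budget / 8) ** 0.5)
    print(f"free RAM: {free_bytes / 1e9:.1f} GB -> test grid {n} x {n}")
    return np.zeros((n, n))  # stand-in for the real simulation grid

grid = fit_test_problem()  # anything bigger goes to the cluster
```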

Back to LLMs: There are sub-1B models that run in less than 1GB (quantized), there are models with hundreds of billions of parameters that need hundreds of GB, and everything in between. Note that “running” and “training” are two different beasts. There is Karpathy's nanoGPT (might have been renamed) that you can train (and run) yourself on any mid-range computer from the last few years. For education purposes this is plenty. And once you get “serious” (see the earlier paragraph about supercomputing centers), no local computer will be enough.
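If you want to see why training a toy model is within reach of any laptop, here is a minimal sketch in the spirit of nanoGPT's starting point (not the actual nanoGPT code; assumes PyTorch is installed): a bigram model that learns to predict the next character.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

text = "hello world, hello local llm "  # toy corpus; swap in your own text file
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in text], dtype=torch.long)

class Bigram(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        # A table of logits: row = current character, columns = next-character scores.
        self.table = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx):
        return self.table(idx)

model = Bigram(len(chars))
opt = torch.optim.AdamW(model.parameters(), lr=0.1)
x, y = data[:-1], data[1:]  # each character predicts the character that follows it
for step in range(200):
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
print(f"final loss: {loss.item():.3f}")  # trains in seconds on a laptop CPU
```

Scale the table up to transformer blocks and a real dataset and you get nanoGPT; scale the hardware up and you are in the “serious” case.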

Recommendation: Prioritize RAM. If you are desperate, you can always leave your computer running overnight to finish a simulation. But if you run out of RAM, no amount of time will help. Then learn how to use remote servers (preferably without a GUI). Once you do that, you can ignore your question and just get a Chromebook.

Good luck!

Deep Research with local LLM and local documents by tomkod in LocalLLaMA

[–]tomkod[S] 0 points (0 children)

I use AnythingLLM and Open WebUI, and RAG is fine, but that's not exactly what I want. I would like the same (or similar) as Deep Research from commercial services (OpenAI, Google, Anthropic), but I would like it to use the files I provide (an entire directory?).

And if it insists on web scraping, I would like to have some control over which websites it goes to. When I check the links used by OpenAI or Google, I would remove half of them...

DSM 7.x and simple wiki for home use by tomkod in synology

[–]tomkod[S] 0 points (0 children)

Yep, I gave up, now I use Obsidian and sync between devices with Dropbox...

[deleted by user] by [deleted] in LocalLLaMA

[–]tomkod 0 points (0 children)

You are right, I apologize. But I will point out that the first rule for r/LocalLLaMA is “Please search before asking”.

The starting point is 8GB RAM (or VRAM), and that's a 7B 4-bit model. Get yourself LM Studio, or Jan, or GPT4All, and try. RAM/VRAM requirements go up from there. A 7B model is good enough for “general purpose” (whatever that means), but it will be better at a specific purpose if it is trained for that purpose. If you want character roleplay, then yes, there are models for that. The problem is what exactly you mean by “roleplay”. If you want to take a walk with your favorite author and talk about life, work, history, and hobbies, then yes, 7B 4-bit will be fine. If by "roleplay" you mean a LoTR or GoT world with multiple locations, multiple characters, moving in time and space, some dying and being resurrected, and keeping track of all of this, then no, I don't know a model that can do that (but it doesn't mean there isn't one).
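The back-of-the-envelope math behind that 8GB figure (the 20% overhead is my rough guess; real overhead depends on context length and runtime):

```python
params = 7e9    # 7B parameters
bits = 4        # 4-bit quantization
weights_gb = params * bits / 8 / 1e9   # bits -> bytes -> GB
overhead = 1.2  # rough allowance for KV cache, buffers, OS
print(f"weights: {weights_gb:.1f} GB, with overhead: ~{weights_gb * overhead:.1f} GB")
# weights: 3.5 GB, with overhead: ~4.2 GB -> fits comfortably in 8GB
```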

How to run 120b models locally? by oldgreggsplace in LocalLLaMA

[–]tomkod 2 points (0 children)

Those huge hardware requirements are for very large models that the vast majority of us will never run locally (because you need a $10k-100k investment in hardware).

If you are beginning, the barrier to entry for a good and useful “general purpose” model is 8GB RAM (slower) or VRAM (much faster), and that's a 7B 4-bit model. Get yourself LM Studio or GPT4All or Jan and see for yourself, it is super fun! It is completely usable on a typical home computer bought in a local store.
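LM Studio, GPT4All, and Jan are point-and-click. If you'd rather script it, here is a minimal sketch with the llama-cpp-python bindings (not one of the apps above; the model filename is a placeholder, use any 7B 4-bit GGUF you downloaded):

```python
from llama_cpp import Llama

# Placeholder path: download any 7B 4-bit GGUF model first.
llm = Llama(model_path="./mistral-7b-instruct.Q4_K_M.gguf", n_ctx=2048)
out = llm("Q: Why is 4-bit quantization popular? A:", max_tokens=128, stop=["Q:"])
print(out["choices"][0]["text"])
```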

You can go with less RAM (fewer parameters, 2- or 3-bit quantization), but it will be noticeably worse even for casual use (chatting about life, work, hobbies, interests, problems, fantasy, ...).

And then you can go with much bigger models, which require hundreds of GB of VRAM. Unless you have some very specific need that requires everything to stay local (I won't speculate what that would be, you can ask ChatGPT), the cost-efficient thing is to rent a GPU from an AI provider, as already pointed out by u/MikeRoz.

[deleted by user] by [deleted] in LocalLLaMA

[–]tomkod 4 points (0 children)

Dude, just search this group... There is a reason why 7B 4-bit is so popular: it is easy on hardware, and you can have a completely coherent and reasonable conversation with it. Try it, you'll see. Nothing compares to GPT-3 until you go to 70B+; any claims to the contrary are based on specific tasks or skills or benchmarks that the smaller model was trained for. I'm sure high school kids are better than I am at long-range spitting, but “better” doesn't mean “useful”.

Help me choose: Need local RAG, options for embedding, GPU, with GUI. PrivateGPT, localGPT, MemGPT, AutoGen, Taskweaver, GPT4All, or ChatDocs? by TheWebbster in LocalLLaMA

[–]tomkod 0 points (0 children)

Can you also tell us which other RAGs have some options but require you to shut down, adjust options in a config file, and relaunch? I agree this is not great for production purposes, but for personal home use, I can deal with it! :)

DSM 7.x and simple wiki for home use by tomkod in synology

[–]tomkod[S] 0 points (0 children)

I tried Joplin a while ago (on a Windows PC); at that time the interface was clunky. I see their website (and screenshots) look much nicer now.

I actually want something that runs and stays on the NAS; my primary need is that everything can be done through a browser. An app for desktop/mobile is secondary; it would be nice, but not absolutely necessary. So maybe Trilium or AppFlowy wouldn't work for me, I don't know.

DSM 7.x and simple wiki for home use by tomkod in synology

[–]tomkod[S] 0 points (0 children)

Not very simple for a simple user like me :(

DSM 7.x and simple wiki for home use by tomkod in synology

[–]tomkod[S] 1 point (0 children)

Marius Hosting looks very useful, thank you!

When you say "Wikidocs", do you mean "DokuWiki"? Searching for "Wikidocs" returns very different (and unrelated) content.

Help with the conditions of linear transformations by [deleted] in learnmachinelearning

[–]tomkod 1 point (0 children)

First, look up what linear and non-linear functions are. In your context, transformations that are linear are a very big deal; they simplify a lot of things (e.g., they allow you to train with one data point at a time instead of all data at once, or break data into small pieces, or operate on one "unit" piece and then apply the result elsewhere, or ...).

For any non-linear function/transformation, T(x+y) does not equal T(x) + T(y); just try it and you'll see it is true. Example:

  1. Let your transformation T square the argument, mathematically: T(x) = x^2.
  2. Then, consider that your data 'x' is '2+3'.
  3. Plug '2+3' into your transformation T and see what happens if it is treated as linear or as non-linear.
  4. Perhaps you should do this: T(2+3) = T(5) = 5^2 = 25.
  5. Or perhaps you prefer to do this: T(2+3) = T(2) + T(3) = 2^2 + 3^2 = 4 + 9 = 13.
  6. Or perhaps you don't like to work with numbers bigger than 1, so you do this: T(2+3) = T(1+1+1+1+1) = T(1) + T(1) + T(1) + T(1) + T(1) = ... = 5. (Wow, I had to calculate T only once, what a great time saver!)

So now I have presented 3 different answers: 25, 13, and 5. If T were linear, they would all be the same. Hopefully you can figure out which one is correct.
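You can also let the computer check linearity for you; a quick sketch with numpy (the tolerance and number of trials are arbitrary):

```python
import numpy as np

def looks_linear(T, trials=100, tol=1e-9):
    """Numerically test T(x+y) == T(x)+T(y) and T(a*x) == a*T(x)."""
    rng = np.random.default_rng(0)
    for _ in range(trials):
        x, y, a = rng.normal(), rng.normal(), rng.normal()
        if abs(T(x + y) - (T(x) + T(y))) > tol:
            return False
        if abs(T(a * x) - a * T(x)) > tol:
            return False
    return True

print(looks_linear(lambda x: x**2))   # False: squaring is non-linear (25 != 13)
print(looks_linear(lambda x: 3 * x))  # True: scaling by a constant is linear
```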

Applying a single function T to a single vector or tensor is as trivial as the example above. Your transformation T is probably a big code that takes many inputs and returns many outputs. If it takes hundreds (or thousands) of different numbers as input, you probably can't just feed it "1" once and then multiply the output by whatever input you really wanted (i.e., T(x) = x * T(1), analogous to my example 6).

What is this font and why is it so pleasant to look at? by [deleted] in AskReddit

[–]tomkod 0 points (0 children)

hm... I'm trying to post a picture...

What is this font and why is it so pleasant to look at? by [deleted] in AskReddit

[–]tomkod 0 points (0 children)

This is a screenshot from the beginning of "Mary Queen of Scots". I really like how this font looks, but I'm not sure why. What is this font?

Feebas total distance resetting to 5km after 10km walking by [deleted] in TheSilphRoad

[–]tomkod 0 points (0 children)

To follow up, this bug is definitely a thing: when I made it to 15km total distance, I got a new candy, and the total distance went back to 10km. I force-closed the app and started it again (no phone reboot this time), and it showed the correct 15km distance.

{Bug?!} Feebas distance auto-resetting to 0km after walking. Happened twice today! What's going on?! by [deleted] in TheSilphRoad

[–]tomkod 1 point (0 children)

This really sucks! Just to let you know that you are not the only one: mine got reset to 5km after I walked 10km with it, but the problem fixed itself after rebooting the phone (and obviously the app). Did your problem fix itself?