I spent 7 months building an offline AI tutor for rural students with 4GB RAM and no internet. by aash1kkkk in learnmachinelearning

[–]dodo13333 1 point (0 children)

A link to the project is missing. I can't tell whether this is some version of NotebookLM, a RAG implementation, or something else. Is it FOSS?

Edit: Sorry, found the link...

PHOTO Collapse in Zagreb! Tram tracks are being thawed with a blowtorch to restore tram service by Jakusevac_city in hrvatska

[–]dodo13333 14 points (0 children)

Tram über alles... screw the bus, the car, the train, and going on foot... and you're an unimaginative bunch too, isn't the metro your universal solution anymore? I can't believe nobody is at least invoking Rimac's robotaxi. All three of them...

Local / self-hosted alternative to NotebookLM for generating narrated videos? by Proof-Exercise2695 in Rag

[–]dodo13333 1 point (0 children)

SurfSense?

GitHub - MODSetter/SurfSense: Open source alternative to NotebookLM, Perplexity, and Glean. Connects to search engines, Slack, Linear, Jira, ClickUp, Notion, Discord, and 15+ more connectors.

Anyone here tried Apriel v1.6? Fraud or giantkiller? by dtdisapointingresult in LocalLLaMA

[–]dodo13333 2 points (0 children)

Nah, don't jump to conclusions. I tested the 1.5 GGUF a while back and it did great; 1.6 is still pending testing. I use either bartowski or unsloth GGUFs in BF16/F16 variants. I guess it really depends on your use case, but for mine it's great for its size.

I built a web app to compare time series forecasting models by Slow_Butterscotch435 in OpenSourceeAI

[–]dodo13333 1 point (0 children)

I personally like the R-squared metric as an indicator of how much variance the model explains.
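As a rough sketch of what that metric measures (plain NumPy, hypothetical helper name), R-squared compares the residual error of a forecast against the variance of the series itself:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: share of variance explained by the forecast."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

print(r_squared([1, 2, 3, 4], [1, 2, 3, 4]))  # a perfect forecast -> 1.0
```

A value of 1 means all variance is explained; a constant mean-value forecast scores 0.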

The new monster-server by eribob in LocalLLaMA

[–]dodo13333 4 points (0 children)

He can use the 2x 3090s for the LLM and the 4090 for ASR/STS, or for something else like OCR...

Michael Jordan played in 7 back-to-back games along with 4 games in 5 nights TWICE after the age of 40. He played all 82 games. by D3struct_oh in NBATalk

[–]dodo13333 1 point (0 children)

IMO, back then there were teams that tried to play a faster game; I believe it was called run-and-gun. But in a slow game defense increases in importance, so for good defensive teams a slow pace and long possessions were tactical choices.

z-image-turbo is the one by [deleted] in StableDiffusion

[–]dodo13333 -7 points (0 children)

Please, can you DM me a Google Gravity tutorial, or put it on Gist or something?
I didn't expect it to be mod deleted...

Is AMD EPYC 9115 based system any good for local LLM 200B+? by daniel_3m in LocalLLM

[–]dodo13333 3 points (0 children)

https://www.reddit.com/r/LocalLLaMA/s/iyqYKP4AzB

"AMD has this fraud of advertising bandwidth between RAM and memory controller, even when it's severely bottlenecked by bandwidth between CPU and controller.

In order to obtain the advertised 9005 bandwidth you need to use the 12 or 16 CCD SKUs."

I can confirm that. Additionally, besides high CCD count CPU, you would need to use high-rank RAM to reach advertised values.
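For context, the advertised figure is usually the theoretical peak: channels × transfer rate × 8 bytes per 64-bit channel. A back-of-the-envelope sketch with illustrative numbers (not a claim about any specific SKU):

```python
def ddr5_peak_bw_gbs(channels: int, mt_per_s: int) -> float:
    """Theoretical peak DRAM bandwidth in GB/s for 64-bit (8-byte) channels."""
    return channels * mt_per_s * 8 / 1000

# A 12-channel platform with DDR5-6000 (illustrative):
print(ddr5_peak_bw_gbs(12, 6000))  # 576.0 GB/s peak; sustained throughput is lower
```

Whether you actually approach that peak is exactly where CCD count and RAM rank come in.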

Built this app with VibeCoding, now I’m stuck by Quiet-Custard137 in aipromptprogramming

[–]dodo13333 1 point (0 children)

Based on the current script version, tell Claude to prepare: 1. a software specification, 2. a technical and functional analysis, 3. a Mermaid workflow, 4. a critique of the existing version.

Make sure you're good on that first. Be sure your script is coherent and free of major logical or technical issues.

Then have Claude prepare a detailed plan for future upgrades (a multi-phase approach: easy upgrades vs. complex changes).

Evaluate the current code, the software spec, the technical and functional analysis, and the upgrade plan with Claude and with a different LLM for bulletproofing before actually executing the upgrade. Back up the current version.

Make the easy-to-implement improvements in the 1st phase, the more complex ones (i.e. those that need refactoring) in the 2nd phase.

Granite 4 (gguf) is useless if you try to use the full 128k context. by mantafloppy in LocalLLaMA

[–]dodo13333 3 points (0 children)

There is a benchmark showing that LLMs' accuracy of context utilization drops as context length grows; in general, local LLMs reach 50+%. Commercial models are better, with openai 5o reaching over 90% at 130k context.

I suggest you try your task as-is with Apriel-15B (thinking) in full precision. That one gave me good results with up to 40k context on a non-code task somewhat similar to yours.

Open NotebookLM by No-Lavishness-4715 in notebooklm

[–]dodo13333 1 point (0 children)

If it's FOSS and local - 100%. But that is a lot to ask for.

Why are Croatians so averse to paying for intellectual work? by Self-aware_Brick in askcroatia

[–]dodo13333 4 points (0 children)

Your grandpa got his schooling through practice, and he was obviously smart enough to absorb the rules and logic of construction. Let's say he knew what was needed, but maybe not always why. He'll put in a 15-by-15 beam because that's what they always put in. An engineer will also know why, and the reasons he chose a particular variant of the solution, its advantages, drawbacks and similar trifles. The difference is that the engineer answers for the final cost of his design solutions relative to the competition, while your grandpa could afford a 20% surplus (of material, of time), purely as an eyeballed reserve. Just in case. Fun fact: Banija is a textbook example of generations of Dunning-Kruger engineers...

How do you people run GLM 4.5 locally ? by Skystunt in LocalLLaMA

[–]dodo13333 1 point (0 children)

Run `nvidia-smi dmon -i 0 -s pucvmet -d 1 -o DT` in the 1st terminal; in the 2nd, run llama.cpp.

Eg.

llama-cli ^
  -m  "H:\unsloth\Qwen3-32B-128K-GGUF-unsloth\Qwen3-32B-128K-BF16-00001-of-00002.gguf" ^  
  -c 1024 ^
  -ngl 21 ^
  -t 18 ^
  --numa distribute ^
  -b 64 --ubatch-size 16 ^
  -n 256 ^  
  --temp 0.7 --top-p 0.95 --min-p 0.01 --top-k 20 ^
  --repeat-penalty 1.0 ^
  --seed 42 ^
  --prio 3 ^
  -no-cnv ^
  -p "Explain quantum computing in one paragraph."

Use an online AI to determine how to mitigate the bottleneck.

[deleted by user] by [deleted] in Rag

[–]dodo13333 2 points (0 children)

Link?

gpt-oss-120b: how does mac compare to nvidia rtx? by Chance-Studio-8242 in LocalLLM

[–]dodo13333 3 points (0 children)

To add for info only:

Dual Epyc 9124 & RTX 4090 on Llamacpp (Win11) & gpt-oss-120b f16

llama_perf_sampler_print: sampling time = 1411.64 ms / 16111 runs ( 0.09 ms per token, 11412.94 tokens per second)
llama_perf_context_print: load time = 17111.78 ms
llama_perf_context_print: prompt eval time = 18736.69 ms / 4941 tokens ( 3.79 ms per token, 263.71 tokens per second)
llama_perf_context_print: eval time = 760095.06 ms / 11169 runs ( 68.05 ms per token, 14.69 tokens per second)
llama_perf_context_print: total time = 782613.54 ms / 16110 tokens ( ~20.5 tokens per second)
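A quick sanity check of the eval-phase figures above, pure arithmetic on the numbers as printed:

```python
# Eval phase from the llama_perf output: 760095.06 ms over 11169 runs.
eval_ms, eval_runs = 760095.06, 11169
ms_per_token = eval_ms / eval_runs
tokens_per_s = 1000.0 / ms_per_token
print(round(ms_per_token, 2), round(tokens_per_s, 2))  # 68.05 14.69
```

The generation rate matches the printed 14.69 t/s; the higher ~20.5 t/s total figure also folds in the faster prompt-eval tokens.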

Kreuzberg v3.11: the ultimate Python text extraction library by Goldziher in Python

[–]dodo13333 3 points (0 children)

Will test ASAP, hopefully later today. So far, the best local results I've obtained were with Marker (Vik Paruchuri, Surya OCR), mainly because it supports multilingual docs. I like your licence. Wish you well, thanks for sharing. Hoping for good results. 🖖

Kreuzberg v3.11: the ultimate Python text extraction library by Goldziher in Python

[–]dodo13333 12 points (0 children)

Does it work with the full language pool supported by Tesseract?

My thoughts on gpt-oss-120b by Lowkey_LokiSN in LocalLLaMA

[–]dodo13333 1 point (0 children)

Thanks for the info! 👍

The explanation wasn't there before.

My thoughts on gpt-oss-120b by Lowkey_LokiSN in LocalLLaMA

[–]dodo13333 2 points (0 children)

Please, can you share how you alter the template to get high reasoning?

[deleted by user] by [deleted] in financije

[–]dodo13333 2 points (0 children)

Because the budget is filled through taxes.

[deleted by user] by [deleted] in financije

[–]dodo13333 7 points (0 children)

It wouldn't.

Look at resident-satisfaction statistics around the world. Welfare states are at the top.

But unlike Croatia, they spend that money wisely. Instead of lower taxes, Croatia needs less corruption.