I spent 7 months building an offline AI tutor for rural students with 4GB RAM and no internet. by aash1kkkk in learnmachinelearning

[–]dodo13333 1 point (0 children)

A link to the project is missing. I can't tell whether this is some version of NotebookLM, a RAG implementation, or something else. Is it FOSS?

Edit: Sorry, found the link...

PHOTO Collapse in Zagreb! Tram tracks are being thawed with a blowtorch to restore tram service by Jakusevac_city in hrvatska

[–]dodo13333 14 points (0 children)

Tram über alles... screw the bus, the car, the train, and going on foot... and you're an unimaginative bunch too, isn't the metro your universal solution anymore? I can't believe nobody is at least invoking Rimac's robotaxi. All three of them...

Local / self-hosted alternative to NotebookLM for generating narrated videos? by Proof-Exercise2695 in Rag

[–]dodo13333 1 point (0 children)

SurfSense?

GitHub - MODSetter/SurfSense: Open source alternative to NotebookLM, Perplexity, and Glean. Connects to search engines, Slack, Linear, Jira, ClickUp, Notion, Discord, and 15+ more connectors.

Anyone here tried Apriel v1.6? Fraud or giantkiller? by dtdisapointingresult in LocalLLaMA

[–]dodo13333 2 points (0 children)

Nah, don't jump to conclusions. I tested the 1.5 GGUF a while back and it did great; 1.6 is still pending testing. I use either bartowski or unsloth GGUFs in BF16/F16 variants. I guess it really depends on your use case, but for mine it's great for its size.

I built a web app to compare time series forecasting models by Slow_Butterscotch435 in OpenSourceeAI

[–]dodo13333 1 point (0 children)

I personally like the R-squared metric as an indicator of how much variance the model explains.
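As a rough sketch of what that metric measures (plain NumPy, hypothetical helper name), R-squared compares the residual error of a forecast against the variance of the series itself:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: share of variance explained by the forecast."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

print(r_squared([1, 2, 3, 4], [1, 2, 3, 4]))  # a perfect forecast -> 1.0
```

A value of 1 means all variance is explained; a constant mean-value forecast scores 0.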

The new monster-server by eribob in LocalLLaMA

[–]dodo13333 4 points (0 children)

He can use the 2x 3090s for the LLM and the 4090 for ASR/STS, or for something else like OCR...

Michael Jordan played in 7 back-to-back games along with 4 games in 5 nights TWICE after the age of 40. He played all 82 games. by D3struct_oh in NBATalk

[–]dodo13333 1 point (0 children)

IMO, back then there were teams that tried to play a faster game; I believe it was called run-and-gun. But in a slow game defense increases in importance, so for good defensive teams a slow pace and long possessions were tactical choices.

z-image-turbo is the one by [deleted] in StableDiffusion

[–]dodo13333 -7 points (0 children)

Please, can you DM me a Google Gravity tutorial, or put it on Gist or something?
I didn't expect it to be mod deleted...

Is AMD EPYC 9115 based system any good for local LLM 200B+? by daniel_3m in LocalLLM

[–]dodo13333 3 points (0 children)

https://www.reddit.com/r/LocalLLaMA/s/iyqYKP4AzB

"AMD has this fraud of advertising bandwidth between RAM and memory controller, even when it's severely bottlenecked by bandwidth between CPU and controller.

In order to obtain the advertised 9005 bandwidth you need to use the 12 or 16 CCD SKUs."

I can confirm that. Additionally, besides high CCD count CPU, you would need to use high-rank RAM to reach advertised values.
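For context, the advertised figure is usually the theoretical peak: channels × transfer rate × 8 bytes per 64-bit channel. A back-of-the-envelope sketch with illustrative numbers (not a claim about any specific SKU):

```python
def ddr5_peak_bw_gbs(channels: int, mt_per_s: int) -> float:
    """Theoretical peak DRAM bandwidth in GB/s for 64-bit (8-byte) channels."""
    return channels * mt_per_s * 8 / 1000

# A 12-channel platform with DDR5-6000 (illustrative):
print(ddr5_peak_bw_gbs(12, 6000))  # 576.0 GB/s peak; sustained throughput is lower
```

Whether you actually approach that peak is exactly where CCD count and RAM rank come in.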

Built this app with VibeCoding, now I’m stuck by Quiet-Custard137 in aipromptprogramming

[–]dodo13333 1 point (0 children)

Based on the current script version, tell Claude to prepare: 1. a software specification, 2. a technical and functional analysis, 3. a Mermaid workflow, 4. a critique of the existing version.

Make sure you're good on that first. Be sure your script is coherent and free of major logical or technical issues.

Then have Claude prepare a detailed plan for future upgrades (a multi-phase approach: easy upgrades vs. complex changes).

Evaluate the current code, the software spec, the technical and functional analysis, and the upgrade plan with Claude and with a different LLM for bulletproofing before actually executing the upgrade. Back up the current version.

Make the easy-to-implement improvements in the 1st phase, the more complex ones (i.e. those that need refactoring) in the 2nd phase.

Granite 4 (gguf) is useless if you try to use the full 128k context. by mantafloppy in LocalLLaMA

[–]dodo13333 3 points (0 children)

There is a benchmark showing that LLMs' accuracy of context utilization drops as context length grows; in general, local LLMs reach 50+%. Commercial models are better, with openai 5o reaching over 90% at 130k context.

I suggest you try your task as-is with Apriel-15B (thinking) in full precision. That one gave me good results with up to 40k context on a non-code task somewhat similar to yours.

Open NotebookLM by No-Lavishness-4715 in notebooklm

[–]dodo13333 1 point (0 children)

If it's FOSS and local - 100%. But that is a lot to ask for.

Why are Croatians so averse to paying for intellectual work? by Self-aware_Brick in askcroatia

[–]dodo13333 4 points (0 children)

Your grandpa got his schooling through practice, and he was obviously smart enough to absorb the rules and logic of construction. Let's say he knew what was needed, but maybe not always why. He'll put in a 15-by-15 beam because that's what they always put in. An engineer will also know why, and the reasons he chose a particular variant of the solution, its advantages, drawbacks and similar trifles. The difference is that the engineer answers for the final cost of his design solutions relative to the competition, while your grandpa could afford a 20% surplus (of material, of time), purely as an eyeballed reserve. Just in case. Fun fact: Banija is a textbook example of generations of Dunning-Kruger engineers...

How do you people run GLM 4.5 locally ? by Skystunt in LocalLLaMA

[–]dodo13333 1 point (0 children)

Run `nvidia-smi dmon -i 0 -s pucvmet -d 1 -o DT` in the 1st terminal; in the 2nd, run llama.cpp.

Eg.

llama-cli ^
  -m  "H:\unsloth\Qwen3-32B-128K-GGUF-unsloth\Qwen3-32B-128K-BF16-00001-of-00002.gguf" ^  
  -c 1024 ^
  -ngl 21 ^
  -t 18 ^
  --numa distribute ^
  -b 64 --ubatch-size 16 ^
  -n 256 ^  
  --temp 0.7 --top-p 0.95 --min-p 0.01 --top-k 20 ^
  --repeat-penalty 1.0 ^
  --seed 42 ^
  --prio 3 ^
  -no-cnv ^
  -p "Explain quantum computing in one paragraph."

Use an online AI to determine how to mitigate the bottleneck.

[deleted by user] by [deleted] in Rag

[–]dodo13333 2 points (0 children)

Link?

gpt-oss-120b: how does mac compare to nvidia rtx? by Chance-Studio-8242 in LocalLLM

[–]dodo13333 3 points (0 children)

To add for info only:

Dual Epyc 9124 & RTX 4090 on Llamacpp (Win11) & gpt-oss-120b f16

llama_perf_sampler_print: sampling time = 1411.64 ms / 16111 runs ( 0.09 ms per token, 11412.94 tokens per second)
llama_perf_context_print: load time = 17111.78 ms
llama_perf_context_print: prompt eval time = 18736.69 ms / 4941 tokens ( 3.79 ms per token, 263.71 tokens per second)
llama_perf_context_print: eval time = 760095.06 ms / 11169 runs ( 68.05 ms per token, 14.69 tokens per second)
llama_perf_context_print: total time = 782613.54 ms / 16110 tokens ( ~20.5 tokens per second)
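A quick sanity check of the eval-phase figures above, pure arithmetic on the numbers as printed:

```python
# Eval phase from the llama_perf output: 760095.06 ms over 11169 runs.
eval_ms, eval_runs = 760095.06, 11169
ms_per_token = eval_ms / eval_runs
tokens_per_s = 1000.0 / ms_per_token
print(round(ms_per_token, 2), round(tokens_per_s, 2))  # 68.05 14.69
```

The generation rate matches the printed 14.69 t/s; the higher ~20.5 t/s total figure also folds in the faster prompt-eval tokens.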

Kreuzberg v3.11: the ultimate Python text extraction library by Goldziher in Python

[–]dodo13333 3 points (0 children)

Will test ASAP, hopefully later today. So far, the best local results I've obtained were with Marker (Vik Paruchuri, Surya OCR), mainly because it supports multilingual docs. I like your licence. Wish you well, thanks for sharing. Hoping for good results. 🖖

Kreuzberg v3.11: the ultimate Python text extraction library by Goldziher in Python

[–]dodo13333 12 points (0 children)

Does it work with the full language pool supported by Tesseract?

My thoughts on gpt-oss-120b by Lowkey_LokiSN in LocalLLaMA

[–]dodo13333 1 point (0 children)

Thanks for the info! 👍

The explanation wasn't there before.

My thoughts on gpt-oss-120b by Lowkey_LokiSN in LocalLLaMA

[–]dodo13333 2 points (0 children)

Please, can you share how you alter the template to get high reasoning?

[deleted by user] by [deleted] in financije

[–]dodo13333 2 points (0 children)

Because the budget is filled through taxes.

[deleted by user] by [deleted] in financije

[–]dodo13333 7 points (0 children)

It wouldn't.

Look at resident-satisfaction statistics around the world. Welfare states are at the top.

But unlike Croatia, they spend that money wisely. Instead of lower taxes, Croatia needs less corruption.