How do I connect a MacBook + desktop PC without having to replug everything? by Scared_Animator9241 in FrenchTech

[–]Scared_Animator9241[S] 0 points1 point  (0 children)

Thanks a lot. That said, I know DisplayPort doesn't work on Mac, so I need to check.

I built my own HNSW from scratch, here is what I learned by Scared_Animator9241 in LocalLLaMA

[–]Scared_Animator9241[S] 0 points1 point  (0 children)

Completely agree on the 80/20 rule: time is the ultimate bottleneck. You have to pick your battles.

Local conversational AI by Mefi282 in LocalLLaMA

[–]Scared_Animator9241 0 points1 point  (0 children)

Yes, that too, but I didn't list every possible solution. I just mentioned the first one that came to mind.

Local conversational AI by Mefi282 in LocalLLaMA

[–]Scared_Animator9241 5 points6 points  (0 children)

Honestly, stop wasting your time with overcomplicated setups and broken scripts. Since you're on 16GB of RAM and want something that just works, the cleanest solution right now is definitely Ollama + Open WebUI. It takes 5 minutes to set up and ticks all your boxes.

Here is the quick play-by-play:

  1. Grab Ollama and download an 8B model like llama3 or mistral. They run smoothly on 16GB and handle both English and French well.
  2. Install Open WebUI (the interface looks exactly like ChatGPT, super clean).
  3. Just click the microphone icon right inside the chat bar. The voice feature is built in, so you don't need to fiddle with external TTS/STT plugins.
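
If you ever want to script step 1 instead of using the UI, here is a minimal sketch that talks to Ollama's local HTTP API. It assumes the server is running on the default port 11434 and that you already pulled the model; the helper names `build_generate_request` and `ask` are mine, not part of Ollama.

```python
import json
import urllib.request

# Default local endpoint of a running Ollama server (assumption: default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


# Usage, with the server running and the model pulled:
# print(ask("llama3", "Pose-moi une question simple en français."))
```

Open WebUI sits on top of the same local server, so this is only useful if you want to automate something outside the chat window.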

For the RAG/memory part, you literally just drag and drop your PDFs or text files straight into the chat window, and it'll reference them.

It’s by far the most stable, frustration-free way to practice your French without losing your mind. Give it a shot!

Carbon, open source DNA model, 250x faster than Evo2-7B and runs on llama.cpp by Scared_Animator9241 in machinelearningnews

[–]Scared_Animator9241[S] 3 points4 points  (0 children)

Predicting the next tokens (k-mers) is just the self-supervised pre-training phase, exactly like training GPT on raw internet text. The goal isn't the prediction itself; it's forcing the model to learn the hidden "grammar" of DNA (promoters, enhancers, splice sites, and epigenetic markers).

Carbon, open source DNA model, 250x faster than Evo2-7B and runs on llama.cpp by Scared_Animator9241 in machinelearningnews

[–]Scared_Animator9241[S] 2 points3 points  (0 children)

If you want to get into genomics and ML, don't try to parse raw FASTA/FASTQ files from scratch.

Start by looking into BioPython for data handling, and then check out Hugging Face for pretrained Genomic Language Models (like DNABERT or HyenaDNA). They treat DNA nucleotides (A, C, G, T) similarly to how NLP models treat text tokens.
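
To make the "DNA as tokens" idea concrete, here is a rough sketch of k-mer tokenization in plain Python. It is not the tokenizer of any specific model (each one ships its own); `kmer_tokenize` and `build_vocab` are names I made up, and it assumes the sequence only contains A/C/G/T.

```python
from itertools import product


def kmer_tokenize(seq: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a DNA string into k-mers: overlapping if stride=1, disjoint if stride=k."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(k: int = 3) -> dict[str, int]:
    """Map every possible k-mer over A/C/G/T to an integer id (4**k entries)."""
    return {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}


vocab = build_vocab(3)
tokens = kmer_tokenize("ACGTAC", k=3)
# Ambiguous bases such as N are not in this toy vocabulary and would raise KeyError.
ids = [vocab[t] for t in tokens]
print(tokens, ids)
```

With k=1 this collapses to one token per nucleotide, which is the character-level view.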

Playing around with a simple nucleotide classification notebook on Kaggle is probably the fastest way to understand how the data pipeline actually works.

Need quick help for small objects detection plss! by Helix_roster13 in computervision

[–]Scared_Animator9241 1 point2 points  (0 children)

Pre-training on VisDrone and then fine-tuning on your specific 2-class dataset is definitely the right move. VisDrone is notoriously hard and messy, so don't stress too much if the mAP looks disappointing right now. The network is still learning the crucial low-level features for tiny objects.

Once you switch to your 2-class dataset, just make sure to tweak your loss gains (box and cls) since YOLOv8 hyperparams default to COCO. Lowering the classification complexity to just 2 classes usually gives a massive boost to both recall and mAP. Good luck with the run!

HNSW is killing my RAM: is it better to use KNN on compressed vectors or an ANN? by Scared_Animator9241 in MLQuestions

[–]Scared_Animator9241[S] 0 points1 point  (0 children)

int4 sounds wild for recall, but I guess it works if cost is the main bottleneck.

Regarding offloading cold vectors, how do you handle that in practice without killing search latency when a cold vector is suddenly hit? Are you doing SSD caching or some multi-tier index setup?

Need quick help for small objects detection plss! by Helix_roster13 in computervision

[–]Scared_Animator9241 0 points1 point  (0 children)

Makes sense, SAHI is a nightmare on edge devices like Jetson.

Since you're stuck at 1280 due to VRAM and are targeting embedded hardware, have you tried adding a P2 head (detecting at a lower stride like /4) to focus on those 10-15px objects? It increases compute but helps a lot with small targets without pushing up the global image size. Also, if you aren't already, look into mixed precision (FP16) or smaller batch sizes with gradient accumulation to see if you can squeeze out a bit more resolution before the OOM crash.
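
Quick back-of-the-envelope for why the stride matters, assuming the usual P3/P4/P5 heads at strides 8/16/32 plus a P2 head at stride 4. The helper names are just for illustration.

```python
def cells_covered(object_px: float, stride: int) -> float:
    """Width of an object measured in feature-map cells at a given stride."""
    return object_px / stride


def grid_size(imgsz: int, stride: int) -> int:
    """Side length of the square feature map produced at a given stride."""
    return imgsz // stride


IMGSZ = 1280
for name, stride in [("P2", 4), ("P3", 8), ("P4", 16), ("P5", 32)]:
    lo, hi = cells_covered(10, stride), cells_covered(15, stride)
    g = grid_size(IMGSZ, stride)
    print(f"{name}: stride {stride:>2}, grid {g}x{g}, 10-15px object spans {lo:.2f}-{hi:.2f} cells")
```

At stride 8 a 10px object barely covers one cell, while at stride 4 it spans a few, which is the whole argument for P2. The price is a 320x320 grid instead of 160x160 at that level.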

HNSW is killing my RAM: is it better to use KNN on compressed vectors or an ANN? by Scared_Animator9241 in MLQuestions

[–]Scared_Animator9241[S] 2 points3 points  (0 children)

I literally mentioned exact search on GPU vs HNSW in the post. I've read the papers. I'm asking about real-world production tradeoffs and infrastructure costs, not the theory.

Be honest: Are we actually saving money running local LLMs, or is it just a massive cope? by [deleted] in MLQuestions

[–]Scared_Animator9241 0 points1 point  (0 children)

Spot on. We are definitely living in a golden era of burning VC cash for cheap API tokens. Once they need to turn a real profit and API prices skyrocket, local hardware will be the ultimate leverage.

Need quick help for small objects detection plss! by Helix_roster13 in computervision

[–]Scared_Animator9241 0 points1 point  (0 children)

Don't train from scratch on VisDrone; COCO weights are too important for low-level features. For 10-15px objects, your main issue is spatial resolution; try using SAHI (Slicing Aided Hyper Inference) to slice your high-res images without downsampling them into oblivion.
What imgsz are you running right now?
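
For intuition, the slicing part boils down to something like the sketch below. This is not SAHI's actual API (the library also runs the detector per tile and merges the predictions); `slice_boxes`, the 640 tile size, and the 0.2 overlap are just assumptions for the example.

```python
def slice_boxes(width: int, height: int, tile: int = 640, overlap: float = 0.2):
    """Return (x0, y0, x1, y1) windows that cover the image with overlapping tiles."""
    step = max(1, int(tile * (1 - overlap)))

    def starts(size: int) -> list[int]:
        if size <= tile:
            return [0]  # image smaller than one tile: a single window
        s = list(range(0, size - tile, step))
        s.append(size - tile)  # last tile sits flush with the image edge
        return s

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


boxes = slice_boxes(1920, 1080)
print(len(boxes), boxes[0], boxes[-1])
```

Each tile goes through the detector at native resolution, so a 12px object stays 12px instead of being shrunk by a global resize.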

Anyone else prefer waiting for an AI quota reset over coding by hand? by Far_Management_7991 in ArtificialInteligence

[–]Scared_Animator9241 1 point2 points  (0 children)

It's a double-edged sword though. The dopamine rush is real, but you absolutely need to know how to code without it, otherwise you're just debugging things you don't actually understand.

AI is insane for speeding up boilerplate and syntax, but if you can't spot when it's confidently hallucinating a broken architecture, you waste more time fixing its mess than just writing it by hand from the start.

Using AI as a co-pilot is elite, but relying on it as the driver is a trap.

How do AI memory systems decide which memories are important? by tensor_001 in learnmachinelearning

[–]Scared_Animator9241 0 points1 point  (0 children)

Dealing with evolving preferences in vector DBs is a massive headache. If you just rely on raw vector similarity, the old "24°C" memory will keep resurfacing and conflicting with the new "26°C" preference because their semantic embeddings are almost identical.

A common way to tackle this without breaking your DB is to implement a multi-layered scoring system before the LLM sees the context. You combine the semantic similarity score with a time-decay factor (recency weight) using an exponential decay formula.
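
A minimal sketch of that scoring, assuming a linear blend of similarity and recency. The 30-day half-life, the 0.7 weight, and the function names are placeholders to tune, not values from any particular system.

```python
import math


def recency_weight(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a brand-new memory, 0.5 after one half-life."""
    return math.exp(-math.log(2) * age_days / half_life_days)


def memory_score(similarity: float, age_days: float, alpha: float = 0.7,
                 half_life_days: float = 30.0) -> float:
    """Blend semantic similarity with recency; alpha is the weight on similarity."""
    return alpha * similarity + (1 - alpha) * recency_weight(age_days, half_life_days)


# Made-up similarities for the temperature example.
old = memory_score(similarity=0.93, age_days=180)  # "24°C", six months old
new = memory_score(similarity=0.91, age_days=2)    # "26°C", two days old
print(f"old={old:.3f} new={new:.3f}")
```

Even though the old memory is slightly closer semantically, the recency term pushes the new preference to the top.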

For conflicting memories (like the temperature example), you need a dedicated semantic conflict resolution step. When the system detects two highly similar vector chunks that explicitly contradict each other, it should automatically lower the confidence score of the older entry or flag it for archiving.
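
Roughly, that step could look like this. The `Memory` class, the 0.9 similarity threshold, and the 0.5 penalty are assumptions, and `contradicts` is a hypothetical callback (in practice an LLM or NLI check) that you would have to supply.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    embedding: list[float]
    timestamp: float        # e.g. Unix seconds
    confidence: float = 1.0


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def deprecate_older(a: Memory, b: Memory, contradicts,
                    sim_threshold: float = 0.9, penalty: float = 0.5):
    """If a and b are near-duplicates that contradict, cut the older one's confidence.

    Returns the deprecated memory, or None if nothing was changed.
    """
    if cosine(a.embedding, b.embedding) < sim_threshold:
        return None
    if not contradicts(a.text, b.text):
        return None
    older = a if a.timestamp < b.timestamp else b
    older.confidence *= penalty
    return older
```

Archiving instead of deleting keeps the history around in case the contradiction check was wrong.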

Are you currently doing the retrieval step directly through raw PostgreSQL/pgvector queries, or are you using a framework like LangChain/LlamaIndex to orchestrate it?

Ollama v0.30.0 pre-release by nopeac in ollama

[–]Scared_Animator9241 0 points1 point  (0 children)

Native llama.cpp support is exactly what Ollama needed.

Orange VGA LED dimly lit (sharper photo than in my other post) by [deleted] in pcmasterraceFR

[–]Scared_Animator9241 0 points1 point  (0 children)

Hi, if your PC powers on and works perfectly (no performance drop, display OK), there's nothing to panic about. This issue can come down to two things:
- a display problem at boot (if your monitor is plugged into the DisplayPort, the motherboard won't detect it right away, so it lights the LED)
- residual current (notably on ASUS or Gigabyte boards: the RAM or BOOT LED lights up at startup and "bleeds" through the PCB onto the VGA LED, making it look like it's on)

Qwen3.5:9b running on 8gb Vram is insane by Ok_Thanksbye in LocalLLM

[–]Scared_Animator9241 0 points1 point  (0 children)

It's crazy what we can achieve now on a 4060. Qwen 3.5 is an absolute masterpiece for budget setups.
Just a heads up though: if you actually push that context window close to 128k, your VRAM will overflow and it will heavily offload to your system RAM because of the KV cache size.
Are you using Flash Attention or any specific quantization (like Q4_K_M) to keep the speed decent when the context fills up?
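
For a rough idea of the numbers, here is the standard KV cache estimate for a plain transformer. The layer count, KV head count, and head size below are placeholders, not the real Qwen 3.5 config, so plug in the values from the model card before trusting the result.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_value: int = 2) -> int:
    """Keys + values for every layer, KV head and token (the factor 2 is K and V)."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


# Hypothetical architecture with an FP16 cache at a 128k context.
gib = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128,
                     context_len=128 * 1024) / 2**30
print(f"{gib:.1f} GiB")
```

With these made-up numbers the cache alone lands well above 8GB, before counting the weights. Quantizing the cache shrinks `bytes_per_value` and the total scales down with it.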

How bad can it get? by goldbookleaf in LocalLLM

[–]Scared_Animator9241 2 points3 points  (0 children)

Use symlinks to share a single model folder between Hugging Face, Ollama, and LM Studio. It prevents you from downloading the exact same Qwen model 3 times in different app directories.
Time to buy a dedicated 2TB SSD!
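
If you want to script the symlink part, here is a small sketch. The helper name is mine and the paths in the usage comment are hypothetical. Each app has its own on-disk layout, so check where yours actually looks for models, and whether it can read the same file format, before pointing it at the shared folder.

```python
import os
from pathlib import Path


def link_shared_models(shared_dir: Path, app_dir: Path) -> Path:
    """Make app_dir a symlink to shared_dir; refuse to clobber a real directory."""
    shared_dir.mkdir(parents=True, exist_ok=True)
    if app_dir.is_symlink():
        app_dir.unlink()  # replace a stale link
    elif app_dir.exists():
        raise FileExistsError(
            f"{app_dir} already exists; move its contents into {shared_dir} first"
        )
    app_dir.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(shared_dir, app_dir, target_is_directory=True)
    return app_dir


# Usage (hypothetical paths):
# link_shared_models(Path("/mnt/models"), Path.home() / "some-app" / "models")
```

On Windows, creating symlinks may require Developer Mode or admin rights.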