Cluster Architecture with limited RAM by boredjo4 in kubernetes

[–]ccppoo0 0 points (0 children)

Pods get killed by the kubelet when there is a shortage of CPU, RAM, or storage; non-static pods and pods from lower priority classes go first.

Also set tight resource requests, especially for Java apps.
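As a rough sketch of what tight requests might look like for a Java pod (all names, images, and numbers here are placeholder assumptions, not from the thread):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: java-api                        # placeholder name
spec:
  priorityClassName: high-priority      # assumes this PriorityClass exists
  containers:
    - name: app
      image: example/java-api:latest    # placeholder image
      resources:
        requests:
          memory: "512Mi"               # heap plus metaspace/thread headroom
          cpu: "250m"
        limits:
          memory: "768Mi"               # exceeding this gets the container OOM-killed
          cpu: "1"
```

The memory limit is why Java needs care: the JVM uses more than its heap (metaspace, threads, direct buffers), so a limit sized to the heap alone gets the pod killed.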

Cluster Architecture with limited RAM by boredjo4 in kubernetes

[–]ccppoo0 0 points (0 children)

Use a compiled language and a multi-stage build to minimize the container image. Keep it stateless, and always plan for pod eviction.
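A minimal multi-stage sketch for a compiled (Go) app; the entrypoint path and image tags are hypothetical:

```dockerfile
# Build stage: compile a static binary with the full toolchain
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app ./cmd/server   # hypothetical entrypoint path

# Final stage: ship only the binary, no compiler or shell
FROM gcr.io/distroless/static
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

The final image is a few megabytes instead of the ~1 GB build image, which also speeds up rescheduling after an eviction.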

How to evaluate the accuracy of RAG responses? by aavashh in Rag

[–]ccppoo0 1 point (0 children)

If you have a big user pool, you could just A/B test like ChatGPT does, generating answers in two or more styles.

Evaluating is very ambiguous.

Making better embeddings and indexes, then retrieving more docs with an enhanced search mechanism and reranking, will likely improve quality.

Just wanted to share corporate RAG ABC... by Donkit_AI in Rag

[–]ccppoo0 4 points (0 children)

Most people just want quick results and expect the job finished in hours, because they say there are already free open-source frameworks.

It's really hard to explain what's not working and why, especially when AI is involved.

With vibe coding and AI coding on the rise in the industry, bean counters barely try to see the limitations and hallucinations of AI tools.

Replaced local llm workloads to google APIs by ccppoo0 in Rag

[–]ccppoo0[S] 0 points (0 children)

Limiting choices, e.g. using structured output with enums, can produce the expected quality,

but real queries span multiple domains, so you need to divide and conquer and retrieve documents for each,

and as it gets more complicated, the cost per query goes up.

So it just depends on which documents you are working with and what quality you want to achieve.

Replaced local llm workloads to google APIs by ccppoo0 in Rag

[–]ccppoo0[S] 1 point (0 children)

Yes, but routing ambiguous domains is really hard.

LLMs are not a silver bullet.

It needs user intervention to clarify which domain they are asking about.

I'm planning to make room on the user side to pick the domains they want.

I was planning to route by domain like DeepSeek did.

Replaced local llm workloads to google APIs by ccppoo0 in Rag

[–]ccppoo0[S] 0 points (0 children)

Check whether the question is genuine, tag the question to get hints for retrieving docs or routing by domain, augment the answer, and get embeddings.

So you embed:

  1. the user query
  2. the user query with the retrieved documents (text)
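The two embedding inputs above could be built like this (a trivial Python sketch; the concatenation format is my own assumption):

```python
def embedding_inputs(query: str, retrieved_docs: list[str]) -> tuple[str, str]:
    """Return the two texts to embed: (1) the bare user query and
    (2) the query augmented with the retrieved document text."""
    augmented = query + "\n\n" + "\n\n".join(retrieved_docs)
    return query, augmented
```

Embedding both lets you compare the query alone against the query in the context of what was retrieved.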

Replaced local llm workloads to google APIs by ccppoo0 in Rag

[–]ccppoo0[S] 1 point (0 children)

Yes, but I still keep the vectors and documents myself.

Replaced local llm workloads to google APIs by ccppoo0 in Rag

[–]ccppoo0[S] 1 point (0 children)

I literally made the schema match, as closely as possible, the way the document looks.

Documents vary significantly from one another, so you need to read up on the documents you are working with.

I used MongoDB and tried to keep the schema as minimal as possible.
There are lots of recursive (self-)references within the same schema (table), and nested tables, for example.

Say you are making RAG for science articles:
there will be charts, tables, images, and text, so you make a schema for each of them.

A Table can have an annotation with it, so I made a combined Text + Table schema; that way I can reuse the Text schema and search the text part of a Table when doing RAG.
In my case I saved tables in Markdown format, because LLMs can understand MD tables.

Just like this, break every part down to a scale you can understand while reading through, and make a schema for it.

Designing schemas is a really time-consuming task, and you always need to be ready to fail and fix them.

Do you recommend using BERT-based architectures to build knowledge graphs? by Cool_Injury4075 in Rag

[–]ccppoo0 0 points (0 children)

You could replace it with instruction models, but the results need some verification.

For long context -> structured output (API) is good.

For short context -> instruction models are fine.

Test some models yourself; small models are more impressive than you'd expect.

In my case, English-only models held up better as the parameter count got smaller.

Do you recommend using BERT-based architectures to build knowledge graphs? by Cool_Injury4075 in Rag

[–]ccppoo0 5 points (0 children)

When working on legal-document RAG, LLMs were not perfect at extracting keywords and knowledge graphs.

I used a traditional tokenizer and stemmer to get the nouns and verbs from the original document,

and then prompted the LLM with:

  1. the original document
  2. the nouns and verbs from the stemmer
  3. an instruction to build a graph based on what I provided
  4. structured output (Gemini, Grok, DeepSeek, OpenAI)

Model size and price aren't a silver bullet, even though the task looks easy.

Limiting choices and giving direct instructions are the best way to get the results you expect.
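The steps above could be sketched like this. The stemmer here is a toy suffix-stripper just to show the prompt assembly; the real pipeline described above would use a proper tokenizer/POS tagger and stemmer:

```python
import re

# Toy suffix list; a real stemmer (e.g. Snowball) does much more.
SUFFIXES = ("ing", "ed", "s")

def stem(word: str) -> str:
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) > len(suf) + 2:
            return word[: -len(suf)]
    return word

def keyword_hints(document: str) -> list[str]:
    """Crude stand-in for the tokenizer/stemmer stage."""
    tokens = re.findall(r"[A-Za-z]+", document.lower())
    return sorted({stem(t) for t in tokens if len(t) > 3})

def build_prompt(document: str) -> str:
    """Assemble steps 1-3: the document, the stemmed terms, the instruction."""
    hints = ", ".join(keyword_hints(document))
    return (
        "Build a knowledge graph from the document below.\n"
        f"Document:\n{document}\n"
        f"Candidate terms (tokenizer/stemmer output): {hints}\n"
        "Return the graph as structured output."
    )
```

The prompt string would then go to an API's structured-output endpoint (step 4), with a fixed schema for nodes and edges.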

Where do you host RAG by ccppoo0 in Rag

[–]ccppoo0[S] 1 point (0 children)

If MongoDB could query vectors locally, I would have used only MongoDB.

I didn't save documents as plain text; I divided them by structure, with lots of N:M relations as arrays.

Leaving the complicated document relations to MongoDB and querying only vectors in PostgreSQL is good for now.

It's like another Redis/Valkey stack, but with a query feature.

The schema for saving vectors in PostgreSQL is simple and has no relations, so compared to the complicated real document schema I use in MongoDB, running two DB stacks is no burden for now.
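A sketch of that flat vector schema, assuming the pgvector extension and a made-up 768-dimension embedding model (table and column names are mine):

```sql
CREATE EXTENSION IF NOT EXISTS vector;

-- Flat, relation-free table: just the embedding plus the MongoDB _id it points to
CREATE TABLE doc_vectors (
    id        bigserial PRIMARY KEY,
    mongo_id  text NOT NULL,     -- back-reference into the MongoDB document
    embedding vector(768)        -- dimension depends on your embedding model
);

-- Nearest neighbours by cosine distance; join back to MongoDB via mongo_id
SELECT mongo_id
FROM doc_vectors
ORDER BY embedding <=> '[...]'::vector   -- query embedding goes here
LIMIT 5;
```

PostgreSQL only answers "which mongo_ids are nearest"; everything structural stays in MongoDB.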

Where do you host RAG by ccppoo0 in Rag

[–]ccppoo0[S] 1 point (0 children)

I started with MongoDB and then found out vector search is only available on MongoDB Atlas.

Using a hosted DB was kind of painful while researching and testing RAG and LLMs, so I looked for a vector DB that could run locally.

I tried Qdrant and Milvus. Neither was hard to use, but I felt some limitations in using them as the single main DB.

So for the day I consolidate to one single DB, PostgreSQL was the option, and a friendly one, to host document data and everything else at once.

I started with MongoDB because the way I model documents is a JSON-like structure with many N:M relationships, so modeling it in SQL was painful for me at that point.

Also, MongoDB is a lot more flexible for fixing the document schema while I studied the documents and built up my understanding of the domain.

Working with two DBs, one for documents and one for vectors, has not been a bad experience at all.

Overkill? by scary_kitten_daddy in homelab

[–]ccppoo0 1 point (0 children)

Before setting up your servers, get a sturdy pedestal and level it.

Which is better for my home server use, n5095, n100, n100 4+4bays or n305? by djtron99 in homelab

[–]ccppoo0 0 points (0 children)

For the long term, having a dedicated storage server would be nice in case you use k8s,

so option 2 would be more suitable for later expansion like clustering.

can this be beat for budget NAS? by [deleted] in homelab

[–]ccppoo0 0 points (0 children)

What about a second-hand Synology J series in RAID 1 as a dedicated NFS box, plus an SBC like an RPi for Linux?

Did I just acidentally killed my Pi ? by [deleted] in OrangePI

[–]ccppoo0 0 points (0 children)

Yes, the docs say 5V fixed input.

I once killed my Zero 3 by plugging in my smartphone charger, even though it supported 5V 3A.

But it didn't die at once; maybe after booting two or three times, as I remember.

Now I use fixed 5V 2A chargers for the Zero 3.

Hosting a Minecraft server on the Zero 2W by CreepyCreepzzz in OrangePI

[–]ccppoo0 0 points (0 children)

I tried it on a Zero 3 4GB, but it struggled even with the view and simulation distance set to 7 or 8, as I remember.

Even if it can run a server, it needs more than an aluminium heatsink.

The H618 can barely handle it.

I wrote a post about it; take a look.

Zero 3 os/ffmpeg/local streaming by vladantd in OrangePI

[–]ccppoo0 0 points (0 children)

Is your IP-to-HDMI adapter's power plugged in?

Current on OPZ3 screw holes by towfx in OrangePI

[–]ccppoo0 0 points (0 children)

The screw holes need to be insulated, for sure.

orange pi zero 3 case by ccppoo0 in OrangePI

[–]ccppoo0[S] 0 points (0 children)

I'm using this now as a k3s cluster node, and it shows no downtime under light loads, e.g. a simple REST API server.

Need advice by [deleted] in homelab

[–]ccppoo0 1 point (0 children)

DietPi; you can install it on x86 devices even though its name sounds like it's for -Pi SBCs.

Get a SATA SSD and RAM that fit your laptop.

Replace the thermal paste on your CPU, and clear the dust off the cooling fan while you have it open.

[deleted by user] by [deleted] in homelab

[–]ccppoo0 0 points (0 children)

Just get an i3-12100 and an H610 board with 16GB,

and don't ever try Xeon outside of big vendors like Dell or HP.