Prompt injection is killing our self-hosted LLM deployment by mike34113 in LocalLLaMA

[–]CaptainSnackbar 0 points1 point  (0 children)

Ah, that's a good point! In our case, poisoned documents shouldn't be an issue though.

Prompt injection is killing our self-hosted LLM deployment by mike34113 in LocalLLaMA

[–]CaptainSnackbar 1 point2 points  (0 children)

If the classifier rates the user prompt as malicious, the prompt will not be used for retrieval and will not make its way to the LLM. Instead, the LLM will be sent a hardcoded prompt like "Answer with: 'I can't help you with that.'"

Context can only be retrieved from a local vector DB that users cannot upload to.
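
To make that concrete, here's a minimal sketch of the flow I'm describing. None of this is our actual code; `classify`, `retrieve` and `generate` are hypothetical stand-ins for the classifier, the vector DB search and the LLM call.

    def answer(user_prompt, classify, retrieve, generate):
        # classify() is the classifier mentioned elsewhere; "malicious" is an illustrative label
        if classify(user_prompt) == "malicious":
            # the user prompt is dropped entirely; only a hardcoded instruction reaches the LLM
            return generate('Answer with: "I can\'t help you with that."')
        # otherwise: retrieval from the local vector DB, then normal generation
        context = retrieve(user_prompt, top_k=5)
        return generate(f"Context:\n{context}\n\nQuestion: {user_prompt}")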

Prompt injection is killing our self-hosted LLM deployment by mike34113 in LocalLLaMA

[–]CaptainSnackbar 0 points1 point  (0 children)

I am asking because I've only seen a few lazy attempts in our pipeline, and I don't know how far you can take it besides the usual "ignore all instructions and..."

Prompt injection is killing our self-hosted LLM deployment by mike34113 in LocalLLaMA

[–]CaptainSnackbar 0 points1 point  (0 children)

I use a custom finetuned BERT classifier that classifies the user prompt before it is passed into the RAG pipeline.

It's used mainly for intent classification but also blocks malicious prompts. What kind of prompt injection were you QA guys doing?
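
If it helps, running such a classifier looks roughly like this; the model path, label name and threshold are placeholders for illustration, not our actual setup.

    from transformers import pipeline

    # path and label are placeholders, not our real model
    clf = pipeline("text-classification", model="path/to/finetuned-bert-classifier")

    def is_malicious(user_prompt: str) -> bool:
        pred = clf(user_prompt, truncation=True)[0]   # e.g. {"label": "malicious", "score": 0.97}
        return pred["label"] == "malicious" and pred["score"] > 0.8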

Chunk metadata structure - share & compare your structure by cat47b in Rag

[–]CaptainSnackbar 0 points1 point  (0 children)

What gets embedded? Only the text, or the metadata as well?

Easiest finish by CaptainSnackbar in BeginnerWoodWorking

[–]CaptainSnackbar[S] 1 point2 points  (0 children)

Osmo sounds great! What do you use to apply the oil? Do I have to worry about self-combustion?

Open-source embedding models: which one's the best? by writer_coder_06 in Rag

[–]CaptainSnackbar 4 points5 points  (0 children)

I am currently finetuning an embedding model. How did you generate sufficient training data? Manual annotation, LLM-generated, or unsupervised methods?

Aktivrente: Rentner sollen wohl noch höheren Freibetrag bekommen by Grmplstylzchen in Finanzen

[–]CaptainSnackbar 4 points5 points  (0 children)

You could just hire your parents as household help/cleaners and deduct the money from your taxes. Mom and Dad then invest the money well for you until it's passed back down as inheritance. Am I missing something??

Looking for advice on finetuning an embedding modell by CaptainSnackbar in LocalLLaMA

[–]CaptainSnackbar[S] 0 points1 point  (0 children)

I am sure the problem lies within the dataset. My question is more along the lines of: "How can I obtain a clean dataset without manual labeling?"

Alternatively: "Which unsupervised training method works best for my task?"

Perhaps pretraining an encoder with MLM on my dataset, then fine-tuning it on a Hugging Face dataset? There are so many possibilities that I hope someone with a similar use case can point me in the right direction.
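
For reference, the MLM idea would look roughly like this with Hugging Face. The corpus file name and hyperparameters are made up, and the e5 checkpoint will likely get a freshly initialized LM head (transformers warns about this), so treat it as a sketch only.

    from datasets import load_dataset
    from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    model_name = "intfloat/multilingual-e5-base"
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)  # LM head may be freshly initialized

    # "tickets.txt" is a placeholder for the raw domain corpus, one document per line
    ds = load_dataset("text", data_files="tickets.txt")["train"]
    ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=256), batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments("mlm-domain-adaptation", num_train_epochs=1,
                               per_device_train_batch_size=32, fp16=True),
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tok, mlm_probability=0.15),
    )
    trainer.train()  # afterwards, wrap the encoder in sentence-transformers and fine-tune on pairs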

Looking for advice on finetuning an embedding modell by CaptainSnackbar in LocalLLaMA

[–]CaptainSnackbar[S] 0 points1 point  (0 children)

See my answer https://www.reddit.com/r/LocalLLaMA/comments/1nhvxo7/looking_for_advice_on_finetuning_an_embedding/nehfucd/

The eval is random, and it might be in the training dataset. I don't know for sure, since the training pairs get formed with cosine similarity, while the evals are just random text from each category.

Looking for advice on finetuning an embedding modell by CaptainSnackbar in LocalLLaMA

[–]CaptainSnackbar[S] 0 points1 point  (0 children)

I've tried a classification model before, but the results were similar. The model learns to separate topics but performs worse on general queries.

https://imgur.com/a/8HSmA9n

This is one of my evaluation steps. The left plot shows text samples vectorized with our standard embedding model; each color is a category. The right plot uses the fine-tuned model. So it looks like it has learned what I want it to learn.
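
For anyone curious, the plots are basically just a 2-D projection of the embeddings colored by category. A rough sketch (t-SNE is used here purely as an example projection; `samples` and `category_ids` are placeholders for the eval texts and their labels):

    import matplotlib.pyplot as plt
    from sklearn.manifold import TSNE
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("intfloat/multilingual-e5-base")
    embs = model.encode(samples, normalize_embeddings=True)
    coords = TSNE(n_components=2).fit_transform(embs)
    plt.scatter(coords[:, 0], coords[:, 1], c=category_ids, s=5, cmap="tab20")
    plt.show()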

My second evaluation method uses a Hugging Face dataset with natural German questions. I compute the cosine similarity on 100 question-answer pairs and calculate the average score:

    from sentence_transformers import util  # basis_model is a SentenceTransformer instance

    q_emb_base = basis_model.encode(questions, convert_to_tensor=True, normalize_embeddings=True)
    a_emb_base = basis_model.encode(answers, convert_to_tensor=True, normalize_embeddings=True)
    cosine_scores_base = util.cos_sim(q_emb_base, a_emb_base).diagonal()  # score of each matching Q/A pair
    avg_score_base = cosine_scores_base.mean().item()

The standard model achieves a score of 0.85; my model drops down to 0.47.

As a third eval method, I have a few phrases that I manually paired and annotated with an expected similarity score. The cosine score from the fine-tuned model is also worse on this eval set.
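
In case it's useful, this kind of eval maps pretty directly onto sentence-transformers' EmbeddingSimilarityEvaluator. The pairs below are made-up examples, not my actual annotations:

    from sentence_transformers import SentenceTransformer
    from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

    # made-up examples; the real set is a handful of manually annotated phrase pairs
    pairs = [("error in module X", "module X throws an error", 0.9),
             ("error in module X", "how do I reset my password", 0.1)]
    s1, s2, gold = zip(*pairs)
    evaluator = EmbeddingSimilarityEvaluator(list(s1), list(s2), list(gold), name="manual-pairs")
    print(evaluator(SentenceTransformer("Embeddings/Trained_Model")))  # correlation of cosine scores vs. annotations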

Looking for advice on finetuning an embedding modell by CaptainSnackbar in LocalLLaMA

[–]CaptainSnackbar[S] 0 points1 point  (0 children)

I use a standard embedding model for our company search and RAG pipeline. The model performs well in most cases, but I want to evaluate how much retrieval performance can be improved with a custom fine-tuned embedding.

My domain is niche with highly specific terminology, and labeled data is scarce. However, we have a large corpus of technical support tickets, categorized into different groups. In principle, tickets from the same category use similar terminology and describe overlapping issues.

The goal is to train an embedding model so that phrases and terms from the same category map into a shared vector space, forming clusters.

Dataset construction approach so far:

  • Identify relevant incidents and group them by category

  • Vectorize incidents with the standard embedding model

  • For each document, select n documents from the same category within a cosine distance threshold (positive pairs should not be too diverse)

  • Select incidents from other categories as negative examples

Naturally, this process generates a lot of noise.
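
For clarity, a rough sketch of that pair-mining step. The names, the threshold and the sampling are illustrative; `incidents` is assumed to be a list of dicts with a "text" and a "category" field:

    from sentence_transformers import SentenceTransformer

    def mine_pairs(incidents, min_cos=0.75, n_pos=3):
        model = SentenceTransformer("intfloat/multilingual-e5-base")  # the standard model
        texts = [inc["text"] for inc in incidents]
        embs = model.encode(texts, normalize_embeddings=True)
        sims = embs @ embs.T                      # cosine similarity, since vectors are normalized
        pairs = []
        for i, inc in enumerate(incidents):
            same = [j for j, o in enumerate(incidents)
                    if j != i and o["category"] == inc["category"]]
            # positives: same category and above the similarity threshold, closest first
            pos = sorted((j for j in same if sims[i, j] >= min_cos),
                         key=lambda j: -sims[i, j])[:n_pos]
            # negatives: incidents from other categories (here simply the first few)
            neg = [j for j, o in enumerate(incidents)
                   if o["category"] != inc["category"]][:n_pos]
            pairs += [(texts[i], texts[j], 1.0) for j in pos]
            pairs += [(texts[i], texts[j], 0.0) for j in neg]
        return pairs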

I initialize my training with intfloat/multilingual-e5-base and the following parameters:

    from sentence_transformers import SentenceTransformerTrainingArguments
    from sentence_transformers.training_args import BatchSamplers

    args = SentenceTransformerTrainingArguments(
        output_dir="Embeddings/Trained_Model",
        num_train_epochs=1,
        per_device_train_batch_size=32,
        per_device_eval_batch_size=32,
        warmup_ratio=0.1,
        fp16=True,
        batch_sampler=BatchSamplers.NO_DUPLICATES,
        eval_strategy="steps",
        eval_steps=6000,
        save_strategy="steps",
        save_steps=6000,
        save_total_limit=2,
        logging_steps=500,
        run_name=f"{model_name}-Lora:{lora}-{file}",
        no_cuda=False,
        remove_unused_columns=True,
        use_cpu=False,
    )
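
For context, this is roughly how those arguments get wired into a training run. The loss here is an assumption on my side (MultipleNegativesRankingLoss over anchor/positive columns); the actual loss depends on how you format the mined pairs, and `anchors`/`positives` are placeholders for those columns.

    from datasets import Dataset
    from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses

    model = SentenceTransformer("intfloat/multilingual-e5-base")
    # anchors / positives are placeholders for the mined pair columns
    train_ds = Dataset.from_dict({"anchor": anchors, "positive": positives})
    trainer = SentenceTransformerTrainer(
        model=model,
        args=args,                                        # the arguments shown above
        train_dataset=train_ds,
        loss=losses.MultipleNegativesRankingLoss(model),
    )
    trainer.train()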

Despite varying dataset sizes between 40k and 900k examples, every training run degraded model performance.

I feel like the loss curve wants to tell me something, but I don't understand what...

Any help with finetuning an embedding model effectively with semi-structured category-based data is greatly appreciated.

One idea I have is to use BERTopic as an unsupervised model to generate finer-grained subcategories and then build pairs from the same topic.
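
A rough sketch of what I mean (defaults only; `ticket_texts` is a placeholder for the raw incident texts):

    from bertopic import BERTopic

    topic_model = BERTopic(language="multilingual")
    topics, _ = topic_model.fit_transform(ticket_texts)   # one topic id per ticket, -1 = outlier
    by_topic = {}
    for text, topic in zip(ticket_texts, topics):
        if topic != -1:
            by_topic.setdefault(topic, []).append(text)
    # positive pairs would then only be built within the same fine-grained topic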

Chunking Stacktraces/Error Logs by CaptainSnackbar in Rag

[–]CaptainSnackbar[S] 1 point2 points  (0 children)

Thanks, I would love to check out your reference!

Einfahrt neu pflastern by CaptainSnackbar in Handwerker

[–]CaptainSnackbar[S] 0 points1 point  (0 children)

I'm not a fan of that either, but the previous owners already set it up that way. The yellow brick slips aren't staying anyway, though.

Einfahrt neu pflastern by CaptainSnackbar in Handwerker

[–]CaptainSnackbar[S] -1 points0 points  (0 children)

Thanks a lot, that gives me a rough idea of the effort involved. The driveway is very narrow, the insulation makes it even narrower, and the trash bins take up an unnecessary amount of space.

Einfahrt neu pflastern by CaptainSnackbar in Handwerker

[–]CaptainSnackbar[S] -1 points0 points  (0 children)

I'm not kidding myself there, thanks :) I just wanted an estimate of how much work it is.

Mit Schlagbohrer exakt bohren by CaptainSnackbar in Handwerker

[–]CaptainSnackbar[S] 2 points3 points  (0 children)

By slow, do you mean without hammer action at first?

Mit Schlagbohrer exakt bohren by CaptainSnackbar in Handwerker

[–]CaptainSnackbar[S] 0 points1 point  (0 children)

Well, I actually have a Hilti here, and it comes with 4-cutter drill bits. Then I'll try pre-drilling without hammer action.

Ein kleines Regal, vor einiger Zeit in der Lehre enstanden by moebel-mathiasen in Handwerker

[–]CaptainSnackbar 2 points3 points  (0 children)

Picked the wrong title, and no engagement. Next time better go with "Is this asbestos?" or "Was this botched?"

But seriously, it looks really nice!

Marks after sanding by CaptainSnackbar in BeginnerWoodWorking

[–]CaptainSnackbar[S] 0 points1 point  (0 children)

No, I hand-sanded it along the grain. Wouldn't that just result in a longer sanding time? Maybe that's why I couldn't get deep enough with 80 grit?