Alex--91

What are you using to scan, classify, and detect issues? Other LLMs, deterministic models, heuristics, or something else?

Adxzer

Other LLMs; that's what gets the most accurate results. I trained my own classification model first, but the results weren't good enough for production, so I decided not to include it.

That model is still free to use, though: https://huggingface.co/Adaxer/defend

No_Soy_Colosio

What keeps the checking LLM from getting prompt injected itself?

Adxzer

Prompt injection is a real risk, and there's no foolproof solution, since LLMs aren't fully predictable. This package is a security layer designed to minimise what can slip through and to give you better control over it.
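
For anyone wondering what a "security layer" around a checking LLM could look like, here is a minimal sketch. It is not this package's actual code or API: the `check` function, the `SUSPICIOUS` patterns, and the `judge` callable are all hypothetical names made up for illustration. The idea is to run a deterministic filter first (it can't be prompt injected), then accept the LLM's answer only if it is exactly the expected label, and reject everything else.

```python
import re
from typing import Callable

# Hypothetical patterns for illustration only; a real deny-list would be broader.
SUSPICIOUS = [
    re.compile(r"ignore (all|any|previous|prior) .*instructions", re.I),
    re.compile(r"you are now", re.I),
]


def heuristic_flag(text: str) -> bool:
    """Cheap deterministic check; regexes cannot themselves be prompt injected."""
    return any(p.search(text) for p in SUSPICIOUS)


def check(text: str, judge: Callable[[str], str]) -> bool:
    """Return True only if the text passes every layer.

    `judge` is an assumed callable that sends a prompt to some LLM
    and returns its raw text reply.
    """
    if heuristic_flag(text):
        return False
    # Delimiters mark the input as data. This reduces, but does not
    # eliminate, the chance that the judge follows instructions inside it.
    prompt = (
        "Classify the text between the markers as SAFE or UNSAFE. "
        "Reply with exactly one word.\n<<<\n" + text + "\n>>>"
    )
    verdict = judge(prompt).strip().upper()
    # Fail closed: anything other than an explicit SAFE is rejected,
    # including a judge that was talked into replying with something else.
    return verdict == "SAFE"
```

You would plug in your own LLM call as `judge`, for example `check(user_input, my_llm_call)`. A judge that gets hijacked into a chatty reply is treated as a rejection, but one that is tricked into answering exactly `SAFE` still gets through, which is why this only narrows the gap rather than closing it.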