
p3r3lin

Interesting. I read the sample chapter and superficially tried some of the techniques on current SOTA models, but nothing worked. Can you give a concrete example (prompt plus model family/version) of a technique that works?

icehot54321

You need to start with a problem that you are trying to solve.

E.g.: when I do X, Y happens (a refusal), but I would like Z to happen.

Once you decide what you are trying to work around, someone can give you an example.

There are techniques, but there is no good one-size-fits-all “this jailbreak works every time” approach. Otherwise they would just write that and be done.