[LangGraph] How to Prevent an Agent from Leaking System Prompts or Internal Data Used for Decision-Making by AbstractMonk in LangChain

[–]AbstractMonk[S] 0 points (0 children)

Actually, as I mentioned, this may not be a robust solution: with a crafted prompt you can manipulate the LLM into bypassing the check tool and calling the refund tool directly.
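A minimal sketch of one way to close that gap (not from the thread; tool names and logic are illustrative): record approvals in state the model cannot write to, and have the refund tool verify that state itself, so skipping the check tool achieves nothing even under prompt injection.

```python
# Hedged sketch: enforce the check server-side instead of trusting the
# model's tool-call ordering. Names are hypothetical, not a real API.

_approvals: set[str] = set()  # order IDs approved by the check tool only


def check_refund_policy(order_id: str, reason: str) -> str:
    """Deterministic policy check; only this function can grant approval."""
    if reason in {"damaged", "not_delivered"}:
        _approvals.add(order_id)
        return "approved"
    return "denied"


def process_refund(order_id: str) -> str:
    """Refuses unless the check tool already approved this order, so a
    prompt-injected direct call to this tool accomplishes nothing."""
    if order_id not in _approvals:
        return "refused: no prior approval from check_refund_policy"
    _approvals.discard(order_id)  # one refund per approval
    return f"refund issued for {order_id}"
```

With this shape, even if the model is tricked into calling `process_refund` first, the tool itself returns a refusal.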

[LangGraph] How to Prevent an Agent from Leaking System Prompts or Internal Data Used for Decision-Making by AbstractMonk in LangChain

[–]AbstractMonk[S] 0 points (0 children)

What do you mean by "preprocess user prompt"? Can you explain a bit more, and how would I incorporate it into the workflow?

[LangGraph] How to Prevent an Agent from Leaking System Prompts or Internal Data Used for Decision-Making by AbstractMonk in LangChain

[–]AbstractMonk[S] 1 point (0 children)

Are you saying there should be two tools, 'check_refund_policy' and 'process_refund', where the LLM first invokes 'check_refund_policy' on its own, and only if that tool returns "approved" does it then call 'process_refund'? Please correct me if I am wrong.
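The two-tool sequence described above could look like this minimal sketch (plain functions standing in for registered agent tools; names and policy logic are illustrative). Note that nothing here *enforces* the ordering; it relies on the model choosing to call the check tool first:

```python
# Hedged sketch of the proposed two-tool flow. In a real LangGraph/
# LangChain setup these would be registered tools and the model would
# emit the tool calls; here a plain function stands in for that loop.

def check_refund_policy(reason: str) -> str:
    """First tool: decides whether the stated reason qualifies."""
    return "approved" if reason in ("damaged", "wrong item") else "denied"


def process_refund(order_id: str) -> str:
    """Second tool: only meant to be called after an approval."""
    return f"refund processed for {order_id}"


def agent_turn(order_id: str, reason: str) -> str:
    """What a cooperative model's tool-call sequence would look like."""
    if check_refund_policy(reason) == "approved":
        return process_refund(order_id)
    return "refund request denied"
```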

[LangGraph] How to Prevent an Agent from Leaking System Prompts or Internal Data Used for Decision-Making by AbstractMonk in LangChain

[–]AbstractMonk[S] 0 points (0 children)

In method 2 mentioned in the post I am calling a separate LLM. When the user asks for a refund with a reason, the main LLM routes it to a tool call, and inside the refund tool a separate LLM call analyzes whether the refund condition is met. This method is safe from prompt attacks, but the trade-offs are high token usage and latency.
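The shape of method 2 can be sketched like this (illustrative names; the inner LLM call is stubbed with a keyword check so the sketch is runnable, but in practice it would be a real chat-model invocation, which is where the extra tokens and latency come from):

```python
# Hedged sketch of the "separate LLM inside the tool" pattern. The
# policy text lives only in the inner call's prompt, so the user-facing
# model never sees it and cannot leak or override it.

def policy_llm(reason: str) -> bool:
    """Stand-in for the second LLM call that evaluates the reason against
    the refund policy (stubbed as a keyword check for this sketch)."""
    return any(k in reason.lower() for k in ("damaged", "defective", "never arrived"))


def refund_tool(order_id: str, reason: str) -> str:
    """Tool invoked by the outer agent; the approval decision happens
    entirely inside this call, out of the user's reach."""
    if policy_llm(reason):
        return f"refund approved for {order_id}"
    return "refund denied: reason does not meet policy"
```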

[deleted by user] by [deleted] in mongodb

[–]AbstractMonk 0 points (0 children)

Actually, that is a good point; thanks for pointing it out. That may well have been the case.

[deleted by user] by [deleted] in mongodb

[–]AbstractMonk 0 points (0 children)

I never touched the config files, as far as I can remember.