NVIDIA Technical BlogNovember 15, 2023

Best Practices for Securing LLM-Enabled Applications

View Original

Information leaks can occur when private data used to train or run a language model (LLM) can be inferred or extracted by an attacker.
Prompt injection attacks can be either direct or indirect. Indirect prompt injection relies on the LLM having access to an external data source that an attacker can manipulate to insert malicious content.
Trust boundaries are important, as once an attacker is able to inject their input into the LLM, they can exert significant influence or control over its output.
Explicit user authorization should be requested when a plug-in operates on a sensitive system, and OAuth2 or other secure methods should be used for authorization delegation.
Model inversion and prompt extraction attacks can be used by attackers to access private or sensitive information from the LLM. Avoid sharing unauthorized information in the LLM prompt template and consider using a RAG architecture instead of training on sensitive data.
To reduce the risk of sensitive training data extraction, it is best not to train on it. If necessary, a RAG architecture can provide secure access to sensitive documents without training on them directly.
Logging of prompts and responses can lead to service-side information leaks, so it is important to educate users about not introducing proprietary or sensitive information and to control access to logged data.
When using RAG to improve LLM responses, track user authorization for document retrieval and ensure appropriate access controls are in place for logged responses.