Back to all blog posts

How do you manage prompt injection and data leakage risks with LLMs?

June 3, 2025
Prompt Injection Mitigation
  • Input Sanitization:

    All user inputs are sanitized using strict allowlists and escaped to remove special characters or tokens that could manipulate prompt context.

  • Prompt Templates:

    We use well-structured prompt templates with static instruction blocks and clearly bounded user input areas. This ensures user content cannot override system behavior.

  • System/User Segregation:

    Clear separation of system-level instructions from user-level content using delimiters or structured JSON formats to reduce the injection surface.

  • Content Filtering:

    We scan both inputs and outputs using keyword-based and AI-assisted filters to block or redact sensitive terms, jailbreak attempts, or malicious payloads.

Data Leakage Protection
  • Minimal Context Principle:

    Only relevant, non-sensitive data is passed to LLMs. PII and business-sensitive inputs are explicitly excluded.

  • Access Control:

    Access to prompt construction logic, logs, and model responses is restricted and audited, especially in multi-tenant or enterprise setups.

  • API Key Security:

    LLM API keys are stored in encrypted secret managers (e.g. Vault, AWS Secrets Manager) and rotated regularly.

  • Output Monitoring:

    Responses from the LLM are post-processed and filtered before delivery to end users to catch hallucinations or confidential data leaks.

    • Rule-based & keyword filters catch profanity, sensitive terms, and format issues.

    • AI moderation tools (e.g. OpenAI Moderation) scan for toxicity or unsafe content.

  • Hosted vs On-Prem Decisioning:

    For high-risk data contexts, we prefer on-prem or private-hosted models, avoiding external LLM APIs entirely.