HipoCap uses a multi-stage analysis pipeline to detect and block prompt injection attacks, including indirect prompt injection. This guide explains how each stage works and how to use them effectively.Documentation Index
Fetch the complete documentation index at: https://docs.hipocap.com/llms.txt
Use this file to discover all available pages before exploring further.
What is Prompt Injection?
Prompt injection is an attack where malicious instructions are embedded in content that an LLM processes. This can cause the LLM to:- Execute unauthorized function calls
- Leak sensitive information
- Bypass safety controls
- Perform unintended actions
Multi-Stage Analysis Pipeline
HipoCap uses three stages of analysis to detect prompt injection. Each stage catches different types of attacks, and you can enable them based on your security needs.Stage 1: Input Analysis (Prompt Guard)
Purpose: Detect malicious patterns in function inputs before execution. How it works:- Uses specialized models to analyze function arguments and user queries
- Fast, rule-based detection with low latency
- Checks for suspicious patterns and keywords
- Direct injection attempts in function inputs
- Suspicious patterns in user queries
- Malicious instructions embedded in arguments
Stage 2: LLM Analysis
Purpose: Analyze function results for threat indicators and attack patterns. How it works:- Uses structured LLM analysis with threat detection
- Analyzes the actual content returned by functions
- Detects sophisticated attack patterns
- Threat indicators (S1-S14 categories)
- Technical indicators (instruction_injection, contextual_blending, function_call_attempt)
- Attack patterns and function call attempts embedded in content
Stage 3: Quarantine Analysis
Purpose: Simulate infection by sending content to a quarantine LLM, then analyze the output. How it works:- Sends function result to quarantine LLM (simulates what would happen if malicious content reached your main LLM)
- Analyzes the quarantine LLM’s output for hidden instructions
- Hidden instructions that only trigger when processed by an LLM
- Contextual blending attacks
- Function call attempts that emerge after LLM processing
Attack Vectors Protected
1. Instruction Injection
Direct commands to override system behavior. Example:2. Contextual Blending
Malicious instructions hidden in legitimate content. Example:3. Function Call Attempts
Attempts to trigger unauthorized function calls. Example:4. Hidden Instructions
Instructions encoded or obfuscated in content. Example:Analysis Modes
Quick Analysis
Faster analysis with simplified output:final_decision- “ALLOWED” or “BLOCKED”final_score- Risk score (0.0-1.0)safe_to_use- Boolean indicating if safeblocked_at- Stage where blocking occurred (if any)reason- Reason for decision
Full Analysis
Comprehensive analysis with detailed threat information:threat_indicators- Complete S1-S14 breakdowndetected_patterns- Detailed pattern analysisfunction_call_attempts- Complete function call detectionpolicy_violations- Policy rule violationsseverity- Detailed severity assessment
Function Call Detection
HipoCap specifically detects function call attempts embedded in content: Detected patterns:- Direct commands: “search the web”, “send email”, “execute command”
- Polite requests: “please search”, “can you search”, “would you search”
- Embedded instructions: “search for confidential information”, “look up this data”
Decision Making
Based on the analysis, HipoCap makes one of two decisions (returned asfinal_decision):
ALLOWED
- No threats detected
- All policy rules passed
- Safe to execute
safe_to_use: true
BLOCKED
- Threat detected (S1-S14 category)
- Policy violation
- Function call attempt detected
- High severity risk
- RBAC permission denied
- Function chaining violation
safe_to_use: falseblocked_atindicates which stage blocked it
Complete Example
Here’s a complete example showing all three stages:Best Practices
- Enable All Stages for Critical Functions - Use all three stages for sensitive operations
- Use Quick Mode for Low Latency - Enable quick analysis when speed is critical
- Configure Policies - Set up governance policies to define blocking rules
- Monitor and Review - Regularly review blocked attempts to tune policies
- Combine with RBAC - Use role-based access control alongside analysis
Next Steps
- Threat Categories - Detailed S1-S14 reference
- Setting up the Shield - Configuration guide
- Keyword Detection - Configure keyword detection
