
Audit Logs

Tracking, Monitoring & Compliance Logging for LLM Applications

Why Audit Logging for LLMs?

Audit logs, when kept in immutable or append-only storage, provide a tamper-evident record of all activities in your LLM application. They are essential for security investigations, compliance requirements, debugging, and understanding how users interact with your AI system.

"For AI systems, audit trails must capture not just who did what, but also the context of AI decisions: what data was used, what model version was active, and what the AI's reasoning process was."

  • Security - Incident investigation
  • Compliance - Regulatory requirements
  • Debugging - Issue troubleshooting
  • Analytics - Usage insights

What to Log in LLM Applications

| Event Category   | What to Capture                                | Why It Matters                       |
|------------------|------------------------------------------------|--------------------------------------|
| Authentication   | Login attempts, token issuance, session starts | Security, access patterns            |
| LLM Requests     | User ID, model used, prompt hash, token count  | Usage tracking, cost allocation      |
| RAG Queries      | Documents retrieved, relevance scores          | Data access audit, quality analysis  |
| Tool Calls       | Tool name, parameters, success/failure         | Agent behavior audit                 |
| Guardrail Events | Blocked content, triggered rules, severity     | Policy enforcement, attack detection |
| Errors           | Error type, stack trace, request context       | Debugging, reliability               |
| Admin Actions    | Config changes, permission updates             | Change management                    |

Recommended Log Structure

// Structured audit log entry (comments are for illustration; strip them to get valid JSON)
{
    "timestamp": "2024-01-15T10:30:45.123Z",
    "event_id": "evt_abc123xyz",
    "event_type": "llm.completion",
    
    // Who
    "user_id": "user_456",
    "session_id": "sess_789",
    "ip_address": "192.168.1.100",
    "user_agent": "Mozilla/5.0...",
    
    // What
    "action": "generate",
    "model": "gpt-4-turbo",
    "prompt_hash": "sha256:abc123...",  // Hash, not raw text
    "tokens_input": 150,
    "tokens_output": 280,
    
    // Context
    "rag_documents": ["doc_123", "doc_456"],
    "tools_called": ["search_database"],
    
    // Result
    "status": "success",
    "latency_ms": 1250,
    "guardrails_triggered": [],
    
    // Metadata
    "environment": "production",
    "app_version": "2.1.0",
    "trace_id": "trace_xyz789"
}

What NOT to Log

Sensitive Data to Exclude

  • Raw prompts/responses - May contain PII, use hashes instead
  • Passwords & credentials - Never log authentication secrets
  • API keys & tokens - Mask or exclude entirely
  • PII (names, SSN, emails) - Anonymize or hash
  • Financial data - Credit cards, bank accounts
  • Health information (PHI) - HIPAA considerations
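
An added example (not code from the original page) of enforcing these exclusions before an event reaches the log backend: credential fields are dropped, raw prompt and response text is replaced with a keyed HMAC-SHA256 so that short or guessable prompts cannot be recovered by hashing candidate strings, and PII patterns in the remaining strings are masked. The field names, regular expressions and demo key are assumptions made for this example.

```python
import hashlib
import hmac
import re

# Assumption: in production this key is loaded from a secrets manager.
HASH_KEY = b"constructed-demo-key"

# Hypothetical field names chosen for this example.
DROP_KEYS = {"password", "api_key", "authorization", "token"}
HASH_KEYS = {"prompt", "response"}

# Constructed patterns; tune them to the data your application handles.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b\d(?:[ -]?\d){12,15}\b"), "[CARD]"),
]

def keyed_hash(text: str) -> str:
    """HMAC-SHA256 of the text. Equal inputs still correlate across log
    entries, but the digest cannot be reproduced without the key."""
    digest = hmac.new(HASH_KEY, text.encode("utf-8"), hashlib.sha256).hexdigest()
    return "hmac-sha256:" + digest

def mask(text: str) -> str:
    """Replace e-mail addresses, SSNs and card-like numbers with labels."""
    for pattern, label in PATTERNS:
        text = pattern.sub(label, text)
    return text

def sanitize(value):
    """Recursively sanitize an event (dicts, lists, strings) before logging."""
    if isinstance(value, dict):
        out = {}
        for k, v in value.items():
            if k.lower() in DROP_KEYS:
                continue                           # secrets are never logged
            if k.lower() in HASH_KEYS and isinstance(v, str):
                out[k + "_hash"] = keyed_hash(v)   # raw text never stored
            else:
                out[k] = sanitize(v)
        return out
    if isinstance(value, list):
        return [sanitize(v) for v in value]
    return mask(value) if isinstance(value, str) else value
```
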

Implementation Example

# LLM Audit Logger
import hashlib
from datetime import datetime, timezone
from typing import List, Optional


def _utc_now() -> str:
    """Current UTC time in ISO 8601 form with a trailing 'Z'."""
    # datetime.utcnow() is deprecated; use an aware UTC datetime instead
    return datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")


class LLMAuditLogger:
    def __init__(self, log_backend):
        # log_backend is any object that exposes write(event: dict)
        self.backend = log_backend

    def _hash_content(self, content: str) -> str:
        """Hash sensitive content so the raw text never reaches the log"""
        # Full digest with the "sha256:" prefix, matching the recommended log structure
        return "sha256:" + hashlib.sha256(content.encode("utf-8")).hexdigest()

    def log_llm_request(
        self,
        user_id: str,
        session_id: str,
        model: str,
        prompt: str,
        response: str,
        tokens_in: int,
        tokens_out: int,
        latency_ms: float,
        rag_docs: Optional[List[str]] = None,
        tools_called: Optional[List[str]] = None,
        guardrails_triggered: Optional[List[str]] = None
    ):
        """Log an LLM request with proper sanitization"""

        event = {
            "timestamp": _utc_now(),
            "event_type": "llm.completion",
            "user_id": user_id,
            "session_id": session_id,
            "model": model,
            # Only hashes of the prompt and response are recorded
            "prompt_hash": self._hash_content(prompt),
            "response_hash": self._hash_content(response),
            "tokens_input": tokens_in,
            "tokens_output": tokens_out,
            "latency_ms": latency_ms,
            "rag_documents": rag_docs or [],
            "tools_called": tools_called or [],
            "guardrails_triggered": guardrails_triggered or [],
            "status": "success"
        }

        self.backend.write(event)

    def log_guardrail_block(self, user_id: str, rule: str, content_hash: str):
        """Log when guardrails block content"""
        # The caller passes a hash of the blocked content, never the content itself
        event = {
            "timestamp": _utc_now(),
            "event_type": "guardrail.block",
            "user_id": user_id,
            "rule_triggered": rule,
            "content_hash": content_hash
        }
        self.backend.write(event)
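
The logger above accepts any backend that exposes a write(event) method. As an added example (not part of the original page), the backend below makes the record tamper-evident by chaining each entry to the previous one with HMAC-SHA256, and the verifier reports the first record that was modified, removed or reordered. The class and function names, the in-memory list standing in for append-only storage, and the key handling are assumptions.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # "prev" value used for the first record

def record_mac(key: bytes, seq: int, prev: str, event: dict) -> str:
    """MAC over sequence number, previous MAC and event (canonical JSON)."""
    body = json.dumps({"seq": seq, "prev": prev, "event": event},
                      sort_keys=True, separators=(",", ":"))
    return hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()

class HashChainBackend:
    """Append-only backend; every record is bound to its predecessor."""

    def __init__(self, key: bytes):
        self._key = key    # Assumption: held by the log service, not the app
        self.records = []  # Stand-in for append-only (WORM) storage

    def head(self) -> str:
        """Latest MAC. Anchor it outside the log to detect tail truncation."""
        return self.records[-1]["mac"] if self.records else GENESIS

    def write(self, event: dict) -> None:
        event = json.loads(json.dumps(event))  # snapshot; rejects non-JSON values
        seq, prev = len(self.records), self.head()
        self.records.append({"seq": seq, "prev": prev, "event": event,
                             "mac": record_mac(self._key, seq, prev, event)})

def verify_chain(records, key: bytes, expected_head=None):
    """Return (ok, index of the first bad record or None)."""
    prev = GENESIS
    for i, rec in enumerate(records):
        expected = record_mac(key, i, prev, rec["event"])
        if (rec["seq"] != i or rec["prev"] != prev
                or not hmac.compare_digest(rec["mac"], expected)):
            return False, i
        prev = rec["mac"]
    # Without an externally stored head, dropping the newest records is invisible.
    if expected_head is not None and not hmac.compare_digest(prev, expected_head):
        return False, len(records)
    return True, None
```

Verification needs the same key that wrote the chain, so keep that key away from anyone who can edit the stored records.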

Compliance Requirements

GDPR (EU)

  • Document data processing activities
  • Log consent and data subject requests
  • Retain logs for breach investigation

HIPAA (Healthcare)

  • Audit all PHI access
  • 6-year log retention
  • Tamper-evident storage

SOC 2 / SOX

  • Access control logging
  • Change management audit trail
  • Regular log reviews

EU AI Act

  • Log AI system decisions
  • Model version tracking
  • Traceability for high-risk AI

Logging Tools & Infrastructure

Best Practices

Do This

  • Use structured logging (JSON)
  • Include trace IDs for correlation
  • Hash sensitive content before logging
  • Set appropriate retention policies
  • Use immutable/append-only storage
  • Implement log access controls

Avoid This

  • Logging raw prompts and responses
  • Storing logs without encryption
  • Allowing log deletion/modification
  • Ignoring log storage costs
  • Lacking log retention policies
  • Skipping log review processes
