// design decisions

Architectural transparency.

Credibility requires honesty about what a system does and does not do. The trade-offs listed below are deliberate design decisions, not gaps.

Fail-closed default

Decision: The proxy blocks tool calls when OPA is unreachable.

Why: For governed agents, security takes precedence over availability. A missed governance evaluation is worse than a paused agent.

Implication: An OPA outage stops all governed agent activity until OPA recovers.

Mitigation: OPA health monitoring with sub-second alerts. Per-tenant override to fail-open for non-critical agent classes. OPA sidecar architecture minimizes network dependency.
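
A minimal sketch of the fail-closed rule with a per-tenant fail-open override. The names authorize, query_policy, and PolicyEngineUnavailable are hypothetical and chosen for illustration; this is not Behavry's actual code, and the policy query is passed in as a plain function so the example runs without an OPA instance.

```python
from typing import Callable, Mapping

# Hypothetical type: any function that asks the policy engine for a decision.
PolicyQuery = Callable[[Mapping[str, object]], bool]


class PolicyEngineUnavailable(Exception):
    """Raised by the query function when the policy engine cannot be reached."""


def authorize(tool_call: Mapping[str, object],
              query_policy: PolicyQuery,
              fail_open: bool = False) -> bool:
    """Return True if the tool call may proceed."""
    try:
        return bool(query_policy(tool_call))
    except PolicyEngineUnavailable:
        # Fail-closed by default: no evaluation means no execution.
        # fail_open models the per-tenant override for non-critical agents.
        return fail_open


def unreachable(_call: Mapping[str, object]) -> bool:
    """Stand-in for a policy engine that is down."""
    raise PolicyEngineUnavailable("OPA did not respond")


call = {"agent": "billing-bot", "tool": "shell.exec"}
print(authorize(call, lambda c: True))                # True: policy allowed it
print(authorize(call, unreachable))                   # False: blocked on outage
print(authorize(call, unreachable, fail_open=True))   # True: tenant override
```

Only an unreachable engine triggers the override. An explicit deny from the policy still blocks the call under either setting.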

Inline proxy latency

Decision: Every tool call passes through the proxy before execution.

Why: Pre-execution enforcement is the architectural foundation. Post-hoc detection cannot prevent actions.

Implication: Adds 5-30 ms of latency per tool call. OPA evaluation itself is sub-millisecond; most of the overhead is the network hop.

Mitigation: Connection pooling, local OPA sidecar, and async audit logging minimize overhead. For context, a typical LLM inference call takes 500ms-5s. The governance layer is negligible relative to agent processing time.
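
To put those ranges in proportion, a back-of-the-envelope calculation using only the figures quoted above. It assumes exactly one governed tool call per inference call, which is an illustrative assumption and not a measured workload.

```python
def overhead_share(proxy_ms: float, inference_ms: float) -> float:
    """Fraction of one step's wall-clock time spent in the governance proxy."""
    return proxy_ms / (proxy_ms + inference_ms)


# Best case: fastest proxy hop paired with the slowest inference call.
best = overhead_share(5, 5000)
# Worst case: slowest proxy hop paired with the fastest inference call.
worst = overhead_share(30, 500)

print(f"{best:.2%} to {worst:.2%}")  # 0.10% to 5.66%
```

An agent that issues many tool calls per inference call would see a proportionally larger share.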

stdio backend limitations

Decision: Support both Streamable HTTP and stdio MCP transports.

Why: Many local development tools (filesystem, shell) use stdio. Excluding them would leave a governance gap.

Implication: stdio MCP backends are process-local and cannot be load-balanced or health-checked like HTTP backends.

Mitigation: HTTP backends are recommended for production workloads. stdio is supported for development and for edge cases where local process binding is required.

Hash chain is append-only, not a distributed ledger

Decision: SHA-256 chain with previous_hash linkage, not blockchain or distributed consensus.

Why: Distributed consensus adds latency with no governance benefit for single-tenant audit chains. The threat model is tampering, not Byzantine fault tolerance.

Implication: Chain integrity depends on database access controls. A compromised database could theoretically rewrite history.

Mitigation: Chain verification endpoint validates integrity on demand. Nightly background verification task. SIEM export provides an independent backup outside Behavry's control. External audit anchoring (S3 WORM bucket) on roadmap.
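
A minimal sketch of a SHA-256 chain with previous_hash linkage and an on-demand verifier. The record layout, the all-zero genesis value, and the JSON canonicalization are assumptions made for illustration; the source does not specify Behavry's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder previous_hash for the first entry


def entry_hash(previous_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Append a new entry linked to the current tail of the chain."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "previous_hash": previous_hash,
        "payload": payload,
        "hash": entry_hash(previous_hash, payload),
    })


def verify(chain: list) -> int:
    """Return the index of the first broken entry, or -1 if the chain is intact."""
    expected_previous = GENESIS
    for i, entry in enumerate(chain):
        if entry["previous_hash"] != expected_previous:
            return i
        if entry["hash"] != entry_hash(entry["previous_hash"], entry["payload"]):
            return i
        expected_previous = entry["hash"]
    return -1


chain = []
append(chain, {"tool": "fs.read", "decision": "allow"})
append(chain, {"tool": "shell.exec", "decision": "deny"})
append(chain, {"tool": "fs.write", "decision": "allow"})
print(verify(chain))  # -1: intact

chain[1]["payload"]["decision"] = "allow"  # simulate an in-place edit
print(verify(chain))  # 1: first entry whose hash no longer matches
```

This catches in-place edits, but an attacker with full write access could recompute every later hash and pass verification. That is the limitation stated in the implication above, and it is why a copy held outside the database matters.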

Behavioral baselines require warm-up

Decision: Anomaly detection compares against rolling behavioral baselines that accumulate over time.

Why: Static thresholds produce false positives. Behavioral baselines adapt to each agent's actual usage pattern.

Implication: New agents have no baseline for the first N hours. Anomaly detection is less effective during this period.

Mitigation: Configurable warm-up period. Manual baseline seeding for known agent profiles. Conservative alerting thresholds during warm-up. Policy-based enforcement (allow/deny/intercept) works from the first tool call regardless of baseline status.
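
A minimal sketch of a rolling baseline with a warm-up period, manual seeding, and a more conservative threshold while warming up. The z-score test, the class name RollingBaseline, and every default value are assumptions made for illustration; the source does not describe the actual detection method.

```python
from collections import deque
from statistics import mean, pstdev


class RollingBaseline:
    """Rolling window of a per-agent metric, e.g. tool calls per minute."""

    def __init__(self, window: int = 100, warmup: int = 20,
                 threshold: float = 3.0, warmup_threshold: float = 6.0):
        self.samples = deque(maxlen=window)
        self.warmup = warmup                      # samples needed to warm up
        self.threshold = threshold                # z-score limit once warm
        self.warmup_threshold = warmup_threshold  # looser limit during warm-up

    def seed(self, values) -> None:
        """Manually seed the baseline from a known agent profile."""
        self.samples.extend(values)

    def warmed_up(self) -> bool:
        return len(self.samples) >= self.warmup

    def observe(self, value: float) -> bool:
        """Score value against the baseline, then record it. True = anomalous."""
        anomalous = False
        if len(self.samples) >= 2:
            mu = mean(self.samples)
            sigma = pstdev(self.samples)
            limit = self.threshold if self.warmed_up() else self.warmup_threshold
            # A perfectly flat baseline gives sigma == 0; this sketch skips scoring.
            if sigma > 0:
                anomalous = abs(value - mu) / sigma > limit
        self.samples.append(value)
        return anomalous


cold = RollingBaseline(warmup=5)   # still warming up after three samples
cold.seed([10, 12, 11])
print(cold.observe(15))            # False: z is about 4.9, under the warm-up limit of 6

warm = RollingBaseline(warmup=3)   # same data, but counted as warmed up
warm.seed([10, 12, 11])
print(warm.observe(15))            # True: z is about 4.9, over the normal limit of 3
```

The same observation is flagged or not depending only on warm-up state, which is the trade-off described above: fewer false positives early, at the cost of weaker detection.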

DLP uses pattern matching, not semantic analysis

Decision: 26 regex patterns across 7 categories. Deterministic, with negligible added latency and no external API dependency.

Why: Pattern matching is fast, auditable, and does not introduce a dependency on an external ML service in the critical path.

Implication: Sophisticated exfiltration that avoids pattern signatures may not be caught by DLP alone.

Mitigation: Inbound rules engine adds semantic matching for tool responses. Cross-session fragment reassembly detects credential splitting across sequential requests. Community Policy Library extends pattern coverage through collective intelligence. Behavioral anomaly detection catches volume-based exfiltration that individual pattern matches miss.
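
A minimal sketch of deterministic pattern-based scanning. The three patterns and two category names are illustrative assumptions and are not the product's actual 26 patterns or 7 categories. The second call shows the limitation named above: a credential split across two requests passes a per-request scan.

```python
import re

# Hypothetical pattern table keyed by (category, pattern name).
PATTERNS = {
    ("credentials", "aws_access_key_id"):
        re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    ("credentials", "private_key_header"):
        re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    ("pii", "us_ssn"):
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan(text: str) -> list:
    """Return the (category, name) of every pattern that matches text."""
    return [key for key, pattern in PATTERNS.items() if pattern.search(text)]


# AKIAIOSFODNN7EXAMPLE is the example key ID used in AWS documentation.
print(scan("key=AKIAIOSFODNN7EXAMPLE"))
# [('credentials', 'aws_access_key_id')]

# The same key split across two requests: neither half matches on its own.
print(scan("first half: AKIAIOSF"), scan("second half: ODNN7EXAMPLE"))
# [] []
```

The same input always yields the same matches, which keeps decisions auditable. Catching the split case requires state across requests, as in the fragment reassembly mentioned above.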

Single-region deployment (current)

Decision: Single AWS EC2 with Docker Compose for the current deployment.

Why: Seed-stage operational simplicity. One region, one stack, fast iteration.

Implication: No geographic redundancy. Single point of failure at the infrastructure level.

Mitigation: Docker Compose architecture is portable to any cloud or on-prem environment. Multi-region and multi-cloud deployment on roadmap. BYOC and Self-Hosted deployment models allow customers to run the stack in their own redundant infrastructure today.

Context Gate defaults to hidden for unclassified tools

Decision: Unrecognized tools from newly connected MCP servers are hidden by default until an admin classifies them.

Why: Least privilege for cognition. An agent should not access tools that have not been evaluated and classified.

Implication: Agents cannot see or call tools from a newly connected MCP server until an admin classifies them, so every new server needs an admin step before its tools are usable.

Mitigation: Admin notification on new tool detection. One-click approve in the dashboard. Bulk classification for servers with many tools. Auto-classification rules based on tool naming patterns.
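
A minimal sketch of hidden-by-default tool filtering with name-based auto-classification. The rule format, the tool names, and the "visible"/"hidden" labels are hypothetical and chosen for illustration only.

```python
from fnmatch import fnmatchcase

# Hypothetical auto-classification rules: (glob on tool name, visibility).
AUTO_RULES = [
    ("*.read_*", "visible"),
    ("*.delete_*", "hidden"),
]


def classify(tool_name: str, admin_decisions: dict, auto_rules=AUTO_RULES) -> str:
    """Explicit admin decisions win, then naming rules, then hidden by default."""
    if tool_name in admin_decisions:
        return admin_decisions[tool_name]
    for pattern, visibility in auto_rules:
        if fnmatchcase(tool_name, pattern):
            return visibility
    return "hidden"  # unclassified tools stay invisible to agents


def visible_tools(tool_names: list, admin_decisions: dict) -> list:
    """Return only the tools an agent is allowed to see."""
    return [t for t in tool_names if classify(t, admin_decisions) == "visible"]


tools = ["fs.read_file", "fs.delete_file", "crm.export_contacts"]
print(visible_tools(tools, {}))
# ['fs.read_file']
print(visible_tools(tools, {"crm.export_contacts": "visible"}))
# ['fs.read_file', 'crm.export_contacts']
```

The final return in classify is the design decision itself: any tool no rule or admin has addressed falls through to hidden.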