althor Consulting
Production infrastructure for AI agent systems

The guardrails, orchestration, and audit between a prompt and a production deployment.

Most AI work stops at the prompt. I build the infrastructure underneath — credential brokers, tool access layers, approval gating, audit trails, and the operational discipline that makes agents deployable in environments where getting it wrong costs real money. Consulting for enterprise teams that need more than a prompt library.

Multi-model extraction pipeline

agent framework · consensus voting · learning loops

Production document-extraction system built on a configurable multi-stage agent pipeline with field-level consensus voting across multiple extraction models. Deterministic validators, Raw → Suggested → Final audit layering so nothing is silently overwritten, and human-in-the-loop correction that feeds structured learning loops.

Outcome: Eliminated a full-time contract position (~$60–70k annual). Throughput went from 4 entries per hour with manual processing (15 min per scan) to 120 per hour (two per minute) — a ~30× improvement.
upload → render → segment → extract → consensus → validate → research → persist
                                        ↑                     ↑
                                   agreement score       address/format rules

data layering:   Raw (model outputs) → Suggested (consensus) → Final (human-confirmed)
                 └──── no silent overwrites. every layer is auditable ────┘
  • Field-level agreement scoring across multiple extraction models
  • Deterministic validators — address verification, format rules
  • Raw → Suggested → Final layers prevent silent overwrites
  • Correction events feed few-shot examples for continuous improvement
  • Auto-approve gate with configurable confidence threshold
  • Per-document audit trail of every model run and decision
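The field-level consensus step can be sketched as follows — a minimal illustration, assuming each model emits a flat field → value map; the real pipeline's types, normalization, and thresholds are not shown here.

```typescript
// Sketch of field-level consensus voting across multiple extraction
// models. Shapes are illustrative assumptions, not the real schema.
type ModelOutput = Record<string, string>; // field name → extracted value

interface Suggestion {
  value: string;     // the winning value for this field
  agreement: number; // fraction of models that agreed on it
}

// For each field, pick the most common value across model outputs and
// record how strongly the models agreed on it.
function consensus(outputs: ModelOutput[]): Record<string, Suggestion> {
  const fields = new Set(outputs.flatMap((o) => Object.keys(o)));
  const result: Record<string, Suggestion> = {};
  for (const field of fields) {
    const votes = new Map<string, number>();
    for (const o of outputs) {
      const v = o[field];
      if (v !== undefined) votes.set(v, (votes.get(v) ?? 0) + 1);
    }
    const [value, count] = [...votes.entries()].sort((a, b) => b[1] - a[1])[0];
    result[field] = { value, agreement: count / outputs.length };
  }
  return result;
}
```

The `agreement` fraction is what a confidence-threshold auto-approve gate would compare against before promoting a value from Raw to Suggested.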
TypeScript · Next.js · Azure Functions · Durable Orchestration · Static Web Apps · Azure SQL · Blob Storage · Entra ID · Bicep IaC · Key Vault RBAC · Zod · Playwright E2E · Vitest · Sentry (PII-scrubbed)
Role: Architect & builder · Domain: Regulated nonprofit · Status: Production

Enterprise AI governance platform

use-case registry · RBAC · audit trail · Teams + SharePoint embed

AI-adoption governance system spanning submission, review, tool approval, and policy enforcement across a global workforce. Four-tier security-group RBAC, full audit trail for governance-council review, and native Microsoft Teams + SharePoint embedding so the tool meets users where they already work.

Users → Azure Static Web Apps → Azure Functions → Cosmos DB
                                      ├─ Azure Communication Services (notify)
                                      └─ Azure Key Vault (secrets)

RBAC tiers:   End Users  →  AI Champions  →  AI Council  →  System Admins
             (submit)      (review)         (approve)        (operate)
  • Microsoft Entra ID SSO with security-group RBAC
  • Four-tier role model — end users, champions, council, admins
  • Native Teams app + SharePoint web-part embedding
  • Use-case submission workflow with council review gates
  • PR-preview deployments per branch via GitHub Actions
  • Key Vault secret references, no secrets in source
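The four-tier role model above reduces to an ordered comparison once security groups are mapped to tiers. A minimal sketch, assuming the `groups` claim of a validated Entra ID token; the group IDs are hypothetical placeholders:

```typescript
// Sketch of a four-tier, security-group-backed RBAC check. Tier names
// come from the diagram above; group IDs and claim shapes are assumed.
const TIERS = ["end-user", "champion", "council", "admin"] as const;
type Tier = (typeof TIERS)[number];

// Hypothetical mapping from Entra ID security-group IDs to tiers.
const GROUP_TO_TIER: Record<string, Tier> = {
  "grp-end-users": "end-user",
  "grp-ai-champions": "champion",
  "grp-ai-council": "council",
  "grp-system-admins": "admin",
};

// Highest tier granted by the groups claim on a validated token.
function tierOf(groups: string[]): Tier {
  let best = 0;
  for (const g of groups) {
    const t = GROUP_TO_TIER[g];
    if (t) best = Math.max(best, TIERS.indexOf(t));
  }
  return TIERS[best];
}

// Tiers are ordered, so authorization is an index comparison.
function allows(have: Tier, need: Tier): boolean {
  return TIERS.indexOf(have) >= TIERS.indexOf(need);
}
```

Keeping the tiers ordered means every review gate ("council and above") is one comparison rather than a per-route permission matrix.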
React 18 · TypeScript · Vite · Tailwind CSS · React Query · MSAL.js · Azure Functions · Azure Static Web Apps · Cosmos DB · Entra ID · Bicep IaC · GitHub Actions · Playwright
Scale: ~600 employees · 20 countries · Volume: ~100 submissions / month · Status: Production

Spire

AI-native infrastructure control plane · Rust · single-binary deploy

Operator surface for running agents against real infrastructure without handing them master keys. Policy-gated approval flows for critical actions, per-object audit log, MCP-based tool access with read-only guards and per-server auth boundaries, staged remediation with confidence-based auto-apply thresholds and explicit override for high-risk operations.

Agent → MCP tool layer (scoped, read-only guards) → Policy check (is this a protected resource?) → Confidence gate (≥ threshold → auto-apply · < threshold → queue approval) → Staged remediation (dry-run → apply → verify → audit) → Audit event · reversible · reviewable
  • Policy-gated approval flow for critical services
  • Confidence-based auto-apply thresholds
  • MCP tool access with per-server scopes — no master keys
  • Per-object audit log covers every action + override
  • Read-only query guards reject anything outside allowlist
  • Three-second cancellable countdown on auto-applied actions
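The read-only guard idea can be sketched in a few lines — shown here in TypeScript for brevity, though Spire itself is Rust, and with illustrative table names; a production guard would parse SQL properly rather than pattern-match:

```typescript
// Sketch of a read-only query guard: anything that is not a plain
// SELECT against an allowlisted table is rejected before it reaches
// the database. Table names are illustrative assumptions.
const ALLOWED_TABLES = new Set(["services", "containers", "events"]);

function guardQuery(sql: string): boolean {
  // Must start as a SELECT and name a FROM table we can check.
  const m = /^\s*select\b[\s\S]*?\bfrom\s+([a-z_]+)/i.exec(sql);
  if (!m) return false;
  // Reject statements that smuggle writes in after the SELECT.
  if (/\b(insert|update|delete|drop|alter|attach|pragma)\b/i.test(sql)) return false;
  return ALLOWED_TABLES.has(m[1].toLowerCase());
}
```

The point is structural: the agent's query path simply has no code branch that can write, so "the model was confused" cannot become "the database was mutated".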
Rust · Axum · sqlx · SQLite · bollard · tokio · React · Vite · Tailwind · Motion · xyflow · Model Context Protocol · rust-embed
Role: Sole designer & maintainer · Shape: Single-binary Docker image · Status: Active development

Making agent deployments pass security review

A five-layer architecture distilled from shipping three agent systems in regulated environments

Most agent projects hit the same wall. The demo looks great. The pilot works. Then it meets security review, and every layer of the system turns out to be wrong: one service account with god-mode, secrets in environment variables, no audit trail, no way to tell if the agent did something it shouldn’t have. The project either ships with a risk exception that auto-expires, or dies quietly on a compliance spreadsheet.

The pattern that actually ships is a five-layer separation of concerns. Each layer exists because it maps directly to a compliance control an enterprise InfoSec team will ask about. Skip any layer and the review fails.

  1. Identity: Every agent action is attributed to a specific actor — a machine account for autonomous runs, an on-behalf-of delegation for user-triggered actions. “The agent did it” is not an answer. Entra ID, Okta, Google Workspace all support workload identities; use them. The audit log has to answer who.
  2. Credential broker: Agents never hold long-lived secrets. They request short-lived, scoped tokens from a broker (Vault, Key Vault, cloud-native secret manager) on each call. The broker knows which tools the agent is authorized to call, mints a narrow token, and logs the request. Compromising the agent’s runtime doesn’t compromise the kingdom.
  3. Scoped tool access: Tools expose the smallest surface the agent needs. Read-only views of production databases. Action-specific API wrappers instead of admin SDKs. Model Context Protocol servers with per-server auth boundaries. When the agent goes off-rails — and sometimes it will — the blast radius is bounded by design, not by hope.
  4. Policy + approval gating: Every action goes through a policy check before execution. Safe, reversible operations (read a record, propose an update) auto-execute. Irreversible or high-risk operations (issue a refund, modify production config, delete data) queue for human approval. Confidence thresholds decide the line. Humans own the risky calls; the pattern is explicit about which calls those are.
  5. Audit layer: Every suggestion, every approval, every override, every failure produces a structured event. Logs are not enough — logs are for debugging. An audit layer is structured, queryable, retained, and reviewable. Compliance, incident response, and post-mortem all run against the same surface. This is the layer that turns a legal conversation into a technical one.
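Layer 4 in particular is small enough to sketch. A minimal illustration, assuming a simple action model with a reversibility flag and a model-confidence score — a real policy engine carries far more context (resource type, blast radius, actor, environment):

```typescript
// Sketch of layer 4 (policy + approval gating). The action shape and
// threshold are illustrative assumptions.
type Verdict = "auto-apply" | "queue-approval";

interface Action {
  kind: string;       // e.g. "read-record", "issue-refund"
  reversible: boolean;
  confidence: number; // model confidence in [0, 1]
}

// Irreversible actions always wait for a human; reversible ones
// auto-apply only above a confidence threshold.
function gate(action: Action, threshold = 0.9): Verdict {
  if (!action.reversible) return "queue-approval";
  return action.confidence >= threshold ? "auto-apply" : "queue-approval";
}
```

Note the asymmetry: confidence never overrides reversibility. A 99%-confident refund still queues, because the cost of a wrong auto-apply is not symmetric with the cost of a short approval delay.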

None of these layers are exotic. They are how authentication, secrets management, least-privilege access, change control, and audit logging have worked in enterprise infrastructure for decades. The novelty of agent systems is not that they require new controls — it is that they require the same controls applied to a new kind of actor. Treating the agent as a first-class identity with scoped access, reviewable actions, and complete audit is what moves a project from demo to deployment.

The five layers are the minimum. Everything else — orchestration frameworks, prompt engineering, model selection — is downstream of this foundation.

Architecture before code

Auth boundaries, tool scopes, audit surface, and failure modes get modeled before the first function is written. Agent systems that skip this get rewritten later under worse conditions.

Scoped tools, never master keys

Agents get short-lived, scoped credentials per call. No service accounts with god-mode access hiding in a prompt. Credentials live in a broker; tools expose narrow surface areas with explicit allowlists.
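The broker hand-off described above can be sketched like this — a toy illustration of the shape, not a real broker client; the function names and token fields are assumptions:

```typescript
// Sketch of a credential broker hand-off: the agent asks for a
// narrow, short-lived token per call instead of holding a standing
// secret. The API shown is hypothetical.
interface ScopedToken {
  scope: string;     // the one tool/action this token is good for
  expiresAt: number; // epoch ms; short-lived by construction
}

function mintToken(agentId: string, scope: string, ttlMs = 60_000): ScopedToken {
  // A real broker would authenticate agentId against its allowlist
  // and log the request before minting anything.
  return { scope, expiresAt: Date.now() + ttlMs };
}

function isValid(t: ScopedToken, requestedScope: string): boolean {
  return t.scope === requestedScope && Date.now() < t.expiresAt;
}
```

Because validity is checked per scope and per call, a leaked token is useful for one narrow action for about a minute — not for the kingdom.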

Audit everything that matters

Every decision, every suggestion, every override produces a structured event. Compliance review, debugging, and post-incident work all run against the same audit surface — not reconstructed from logs after the fact.
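The difference between a log line and an audit event is structure. A minimal sketch of what "structured, queryable" means in practice — field names here are illustrative, not a real schema:

```typescript
// Sketch of a structured audit event: typed, queryable fields rather
// than free-text log lines. The shape is an illustrative assumption.
interface AuditEvent {
  at: string;     // ISO-8601 timestamp
  actor: string;  // machine account or delegated user identity
  action: string; // what was attempted
  decision: "suggested" | "approved" | "overridden" | "failed";
  target: string; // the object acted on
  detail?: Record<string, unknown>;
}

// Stamp the event at write time so every record is self-describing.
function audit(e: Omit<AuditEvent, "at">): AuditEvent {
  return { at: new Date().toISOString(), ...e };
}
```

With events in this shape, "show every override on customer-facing records last quarter" is a query, not a grep-and-reconstruct exercise.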

Humans own the risky calls

Confidence thresholds decide what auto-runs and what queues for approval. Safe, reversible actions execute immediately. Anything touching production state, payments, or customer-facing data waits for a human.

  • Salesforce and CRM workflow automation with agent handlers
  • Agent orchestration on existing enterprise stacks
  • Production ops for internal AI platforms — LiteLLM, Langfuse, vLLM
  • MCP server design and deployment at team scale
  • Security and audit posture review for agent-driven workflows
Fixed-scope engagements; advisory retainers on ongoing work
Email: contact@althor.dev
LinkedIn: Samuel S. · linkedin.com
Response: Within two business days
Based: Maryland, United States — remote-first