althor · AI infrastructure consulting

The guardrails, orchestration, and audit between a prompt and a production deployment.

Most AI work stops at the prompt. I build the infrastructure underneath — credential brokers, tool access layers, approval gating, audit trails, and the operational discipline that makes agents deployable in environments where getting it wrong costs real money. Consulting for enterprise teams that need more than a prompt library.

Multi-model extraction pipeline

agent framework · consensus voting · learning loops

Production document-extraction system built on a configurable multi-stage agent pipeline with field-level consensus voting across multiple extraction models. Deterministic validators, Raw → Suggested → Final audit layering so nothing is silently overwritten, and human-in-the-loop correction that feeds structured learning loops.

Outcome: Eliminated a full-time contract position (~$60–70k annually). Throughput rose from 4 entries per hour done manually (15 minutes per scan) to 120 per hour (two per minute), a ~30× improvement.

upload → render → segment → extract → consensus → validate → research → persist
                                        ↑                     ↑
                                   agreement score       address/format rules

data layering:   Raw (model outputs)  →  Suggested (consensus)  →  Final (human-confirmed)
                 └── no silent overwrites. every layer is auditable ──┘

Field-level agreement scoring across multiple extraction models
Deterministic validators — address verification, format rules
Raw → Suggested → Final layers prevent silent overwrites
Correction events feed few-shot examples for continuous improvement
Auto-approve gate with configurable confidence threshold
Per-document audit trail of every model run and decision

TypeScript · Next.js · Azure Functions · Durable Orchestration · Static Web Apps · Azure SQL · Blob Storage · Entra ID · Bicep IaC · Key Vault · RBAC · Zod · Playwright E2E · Vitest · Sentry (PII-scrubbed)

Role: Architect & builder
Domain: Regulated nonprofit
Status: Production
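
The consensus step can be sketched roughly as follows. This is a minimal illustration, not the production code: the type names, the light normalisation, the majority-vote rule, and the 0.8 threshold are all assumptions made for the example. It shows how a Suggested layer is derived from Raw model outputs without mutating them, and how an agreement score can drive the auto-approve gate.

```typescript
// Hypothetical shapes; names and the default threshold are illustrative only.
type RawOutput = { model: string; fields: Record<string, string> };
type Tally = { value: string; count: number };
type Suggested = {
  value: string;
  agreement: number; // share of all models that returned the winning value
  autoApprove: boolean;
};

function consensus(raw: RawOutput[], threshold = 0.8): Record<string, Suggested> {
  // field name -> normalised value -> tally of votes
  const tallies = new Map<string, Map<string, Tally>>();

  for (const output of raw) {
    for (const field of Object.keys(output.fields)) {
      const value = output.fields[field].trim();
      const key = value.toLowerCase(); // "austin " and "Austin" are one vote
      const byValue = tallies.get(field) ?? new Map<string, Tally>();
      const tally = byValue.get(key) ?? { value, count: 0 };
      tally.count += 1;
      byValue.set(key, tally);
      tallies.set(field, byValue);
    }
  }

  const suggested: Record<string, Suggested> = {};
  tallies.forEach((byValue, field) => {
    // Stable sort: on a tie, the value seen first wins (and cannot auto-approve
    // at thresholds above 0.5, because its agreement is at most 0.5).
    const winner = Array.from(byValue.values()).sort((a, b) => b.count - a.count)[0];
    // A model that omitted the field counts against agreement.
    const agreement = winner.count / raw.length;
    suggested[field] = {
      value: winner.value,
      agreement,
      autoApprove: agreement >= threshold,
    };
  });

  return suggested; // Raw is untouched; Suggested is a separate layer
}
```

Fields that fall below the threshold would stay in Suggested until a human confirms them into Final.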

Enterprise AI governance platform

use-case registry · RBAC · audit trail · Teams + SharePoint embed

AI-adoption governance system spanning submission, review, tool approval, and policy enforcement across a global workforce. Four-tier security-group RBAC, full audit trail for governance-council review, and native Microsoft Teams + SharePoint embedding so the tool meets users where they already work.

Users → Azure Static Web Apps → Azure Functions → Cosmos DB
                                    ↓
                           Azure Communication Services (notify)
                                    ↓
                                Azure Key Vault (secrets)

RBAC tiers:   End Users  →  AI Champions  →  AI Council  →  System Admins
             (submit)      (review)         (approve)        (operate)

Microsoft Entra ID SSO with security-group RBAC
Four-tier role model — end users, champions, council, admins
Native Teams app + SharePoint web-part embedding
Use-case submission workflow with council review gates
PR-preview deployments per branch via GitHub Actions
Key Vault secret references, no secrets in source

React 18 · TypeScript · Vite · Tailwind CSS · React Query · MSAL.js · Azure Functions · Azure Static Web Apps · Cosmos DB · Entra ID · Bicep IaC · GitHub Actions · Playwright

Scale: ~600 employees · 20 countries
Volume: ~100 submissions / month
Status: Production
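
A four-tier check of this kind might look like the sketch below. The group IDs are placeholders, and the assumption that tiers are cumulative (a council member can also review and submit) is mine; the description above does not say whether the real system works that way.

```typescript
// Tier order follows the diagram above: lowest privilege first.
const TIERS = ["endUser", "champion", "council", "admin"] as const;
type Tier = (typeof TIERS)[number];
type GovernanceAction = "submit" | "review" | "approve" | "operate";

const REQUIRED_TIER: Record<GovernanceAction, Tier> = {
  submit: "endUser",
  review: "champion",
  approve: "council",
  operate: "admin",
};

// Hypothetical mapping from security-group IDs (as found in a token's
// group claims) to tiers. Real IDs would be directory object IDs.
const GROUP_TO_TIER: Record<string, Tier> = {
  "group-id-end-users": "endUser",
  "group-id-champions": "champion",
  "group-id-council": "council",
  "group-id-admins": "admin",
};

// Highest tier granted by any of the caller's groups, or -1 if none match.
function highestTierIndex(groupIds: string[]): number {
  let best = -1;
  for (const id of groupIds) {
    const tier = GROUP_TO_TIER[id];
    if (tier !== undefined) best = Math.max(best, TIERS.indexOf(tier));
  }
  return best;
}

// Assumption: tiers are cumulative, so a higher tier implies the lower ones.
function can(groupIds: string[], action: GovernanceAction): boolean {
  return highestTierIndex(groupIds) >= TIERS.indexOf(REQUIRED_TIER[action]);
}
```

Unknown groups grant nothing, so a caller with no mapped group is denied every action, including submit.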

Spire

AI-native infrastructure control plane · Rust · single-binary deploy

Operator surface for running agents against real infrastructure without handing them master keys. Policy-gated approval flows for critical actions, per-object audit log, MCP-based tool access with read-only guards and per-server auth boundaries, staged remediation with confidence-based auto-apply thresholds and explicit override for high-risk operations.

Agent → MCP tool layer (scoped, read-only guards)
           ↓
  Policy check (is this a protected resource?)
           ↓
  Confidence gate (≥ threshold → auto-apply · < threshold → queue approval)
           ↓
  Staged remediation (dry-run → apply → verify → audit)
           ↓
       Audit event · reversible · reviewable

Policy-gated approval flow for critical services
Confidence-based auto-apply thresholds
MCP tool access with per-server scopes — no master keys
Per-object audit log covers every action + override
Read-only query guards reject anything outside allowlist
Three-second cancellable countdown on auto-applied actions

Rust · Axum · sqlx · SQLite · bollard · tokio · React · Vite · Tailwind · Motion · xyflow · Model Context Protocol · rust-embed

Role: Sole designer & maintainer
Shape: Single-binary Docker image
Status: Active development
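
The staged remediation sequence (dry-run → apply → verify → audit) can be illustrated with a small sketch. Spire itself is written in Rust; this example is in TypeScript only for consistency with the other examples on this page, and the Remediation interface, the boolean step results, and the rollback-on-failure behaviour are assumptions made for illustration.

```typescript
type Stage = "dry-run" | "apply" | "verify" | "rollback";
type StageRecord = { stage: Stage; ok: boolean; detail: string };
type Outcome = "applied" | "rejected" | "rolled-back";

// Hypothetical contract: each step reports success as a boolean.
interface Remediation {
  describe: string;
  dryRun(): boolean;
  apply(): boolean;
  verify(): boolean;
  rollback(): boolean;
}

function runStaged(r: Remediation, audit: StageRecord[]): Outcome {
  // Every stage result is recorded, success or failure.
  const record = (stage: Stage, ok: boolean): boolean => {
    audit.push({ stage, ok, detail: r.describe });
    return ok;
  };

  // A failed dry-run stops before anything has changed.
  if (!record("dry-run", r.dryRun())) return "rejected";

  // A failed apply or a failed verify triggers a recorded rollback.
  if (!record("apply", r.apply()) || !record("verify", r.verify())) {
    record("rollback", r.rollback());
    return "rolled-back";
  }
  return "applied";
}
```

Because the audit trail is written inside the runner rather than by each remediation, a remediation cannot skip its own audit record.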

Making agent deployments pass security review

A five-layer architecture distilled from shipping three agent systems in regulated environments

Most agent projects hit the same wall. The demo looks great. The pilot works. Then it meets security review, and every layer of the system turns out to be wrong: one service account with god-mode, secrets in environment variables, no audit trail, no way to tell if the agent did something it shouldn’t have. The project either ships with a risk exception that auto-expires, or dies quietly on a compliance spreadsheet.

The pattern that actually ships is a five-layer separation of concerns. Each layer exists because it maps directly to a compliance control an enterprise InfoSec team will ask about. Skip any layer and the review fails.

Identity. Every agent action is attributed to a specific actor: a machine account for autonomous runs, an on-behalf-of delegation for user-triggered actions. “The agent did it” is not an answer. Entra ID, Okta, and Google Workspace all support workload identities; use them. The audit log has to answer who.

Credential broker. Agents never hold long-lived secrets. They request short-lived, scoped tokens from a broker (Vault, Key Vault, cloud-native secret manager) on each call. The broker knows which tools the agent is authorized to call, mints a narrow token, and logs the request. Compromising the agent’s runtime doesn’t compromise the kingdom.
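
As a rough sketch of the broker pattern: the code below is an in-memory stand-in, not a client for any real secret manager. The createBroker function, its option names, and the injected mint callback are all hypothetical; in practice mint would be a call to whatever secret manager issues the scoped token.

```typescript
type Lease = { token: string; agentId: string; tool: string; expiresAt: number };
type BrokerLogEntry = { agentId: string; tool: string; granted: boolean; at: number };

function createBroker(opts: {
  allowed: Record<string, string[]>; // agentId -> tools it may call
  mint: (agentId: string, tool: string) => string; // stand-in for the real issuer
  ttlMs: number; // lease lifetime
  now: () => number; // injectable clock, in milliseconds
}) {
  const log: BrokerLogEntry[] = [];

  // Every request is logged, whether or not it is granted.
  function request(agentId: string, tool: string): Lease | null {
    const at = opts.now();
    const granted = (opts.allowed[agentId] ?? []).includes(tool);
    log.push({ agentId, tool, granted, at });
    if (!granted) return null;
    return { token: opts.mint(agentId, tool), agentId, tool, expiresAt: at + opts.ttlMs };
  }

  // A lease is only good for the tool it was minted for, and only until expiry.
  function isValid(lease: Lease, tool: string): boolean {
    return lease.tool === tool && opts.now() < lease.expiresAt;
  }

  return { request, isValid, log };
}
```

The agent holds a lease for one tool and for the TTL only, so a leaked lease is useless for other tools and worthless once it has expired.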

Scoped tool access. Tools expose the smallest surface the agent needs. Read-only views of production databases. Action-specific API wrappers instead of admin SDKs. Model Context Protocol servers with per-server auth boundaries. When the agent goes off the rails (and sometimes it will), the blast radius is bounded by design, not by hope.
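
One way to bound the surface is to let the agent call queries by name only, as in the sketch below. The query names, SQL text, and resolveQuery helper are invented for illustration. The design choice is that the agent never supplies SQL, so there is nothing to parse or sanitise; it can only pick an entry and fill its parameters.

```typescript
type QuerySpec = { sql: string; params: string[] };
type Arg = string | number;

// Hypothetical allowlist: fixed, parameterised, read-only statements.
const READ_ONLY_QUERIES: Record<string, QuerySpec> = {
  orderById: {
    sql: "SELECT id, status FROM orders WHERE id = ?",
    params: ["id"],
  },
  ordersByCustomer: {
    sql: "SELECT id, status FROM orders WHERE customer_id = ? LIMIT ?",
    params: ["customerId", "limit"],
  },
};

function resolveQuery(
  name: string,
  args: Record<string, Arg>,
): { sql: string; values: Arg[] } {
  // Own-property check so inherited names like "toString" are rejected too.
  if (!Object.prototype.hasOwnProperty.call(READ_ONLY_QUERIES, name)) {
    throw new Error(`query "${name}" is not on the allowlist`);
  }
  const spec = READ_ONLY_QUERIES[name];

  // Arguments must match the declared parameters exactly: none missing, none extra.
  const supplied = Object.keys(args).sort().join(",");
  const expected = spec.params.slice().sort().join(",");
  if (supplied !== expected) {
    throw new Error(`query "${name}" expects [${expected}], got [${supplied}]`);
  }

  // Values are returned in declared order for a parameterised driver call.
  return { sql: spec.sql, values: spec.params.map((p) => args[p]) };
}
```

The database role behind these queries should still be read-only; the allowlist narrows what the agent can ask for, and the role limits what any query can do.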

Policy + approval gating. Every action goes through a policy check before execution. Safe, reversible operations (read a record, propose an update) auto-execute. Irreversible or high-risk operations (issue a refund, modify production config, delete data) queue for human approval. Confidence thresholds decide the line. Humans own the risky calls; the pattern is explicit about which calls those are.
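
The gating rule can be expressed as a small pure function. The sketch below is one plausible ordering of the checks, assuming a two-value risk class, a list of protected resources, and a single confidence threshold; the type names and field names are hypothetical.

```typescript
type Risk = "reversible" | "irreversible";
type ProposedAction = {
  name: string;
  resource: string;
  risk: Risk;
  confidence: number; // 0..1, as reported by the agent or a scorer
};
type GatePolicy = { protectedResources: string[]; minConfidence: number };
type GateDecision = "auto-execute" | "queue-for-approval";

function decide(action: ProposedAction, policy: GatePolicy): GateDecision {
  // Irreversible operations always go to a human, whatever the confidence.
  if (action.risk === "irreversible") return "queue-for-approval";

  // Protected resources always go to a human, even for reversible changes.
  if (policy.protectedResources.includes(action.resource)) return "queue-for-approval";

  // Otherwise the confidence threshold decides.
  return action.confidence >= policy.minConfidence ? "auto-execute" : "queue-for-approval";
}
```

Keeping the decision pure makes the policy easy to unit test and easy to show to a reviewer: the same inputs always produce the same decision.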

Audit layer. Every suggestion, every approval, every override, every failure produces a structured event. Logs are not enough; logs are for debugging. An audit layer is structured, queryable, retained, and reviewable. Compliance, incident response, and post-mortem all run against the same surface. This is the layer that turns a legal conversation into a technical one.
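
To make the difference from a log line concrete, here is a minimal in-memory sketch of a structured, queryable audit surface. The event fields, the createAuditLog helper, and the actor naming scheme are illustrative assumptions; a real deployment would back this with durable, retention-managed storage.

```typescript
type AuditKind = "suggestion" | "approval" | "override" | "failure";

type AuditEvent = {
  id: number;
  at: string; // ISO 8601 timestamp
  actor: string; // who acted, e.g. "agent:extractor" or "user:reviewer"
  onBehalfOf?: string; // set when the agent acts under a user delegation
  kind: AuditKind;
  target: string; // the object the event is about
  detail: Record<string, unknown>;
};

function createAuditLog(now: () => Date = () => new Date()) {
  const events: AuditEvent[] = [];

  // Append-only: events get an id and timestamp here and are frozen once stored.
  function append(e: Omit<AuditEvent, "id" | "at">): AuditEvent {
    const stored = Object.freeze({ ...e, id: events.length + 1, at: now().toISOString() });
    events.push(stored);
    return stored;
  }

  // Query by any combination of actor, kind, and target.
  function query(filter: Partial<Pick<AuditEvent, "actor" | "kind" | "target">>): AuditEvent[] {
    return events.filter(
      (e) =>
        (filter.actor === undefined || e.actor === filter.actor) &&
        (filter.kind === undefined || e.kind === filter.kind) &&
        (filter.target === undefined || e.target === filter.target),
    );
  }

  return { append, query };
}
```

With events shaped this way, "who changed this record, and who approved it?" becomes a filter on target and kind rather than a search through free-text logs.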

None of these layers are exotic. They are how authentication, secrets management, least-privilege access, change control, and audit logging have worked in enterprise infrastructure for decades. The novelty of agent systems is not that they require new controls — it is that they require the same controls applied to a new kind of actor. Treating the agent as a first-class identity with scoped access, reviewable actions, and complete audit is what moves a project from demo to deployment.

The five layers are the minimum. Everything else — orchestration frameworks, prompt engineering, model selection — is downstream of this foundation.
