Security Engineering · Autonomous AI Systems

Matthew Bowman

I build the security, cryptographic provenance, and audit infrastructure that agentic AI systems need to be trusted — backed by 15 years of keeping production alive when things break.

Austin, TX → relocating to NYC · incident response · endpoint & multi-cloud · 15+ years

About

I'm a security and systems engineer with 15+ years across enterprise IT, multi-cloud architecture, and security operations. My day-to-day is keeping production systems healthy and defensible across AWS, GCP, and Azure; my nights are spent building the autonomous security tooling shown below.

Hands-on with EDR-driven incident response (SentinelOne across 100+ environments), cloud security hardening, and high-tempo production incident work. Deep operator history in the gaming and media industry. Former U.S. federal Confidential clearance. I like problems where security, automation, and scale meet.

Focus
Incident response · detection · security automation
Cloud
AWS · GCP · Azure · Kubernetes
Security
SentinelOne EDR · IAM · PKI · log analysis
Code
Python · Bash · PowerShell · Go
Certs
CompTIA Security+ · Network+
Based
Austin, TX → relocating to NYC

Selected Work

Independent security R&D — original systems I designed and built. Concept-level; no client data, targets, or findings.

Autonomous Security Research

Meridian

A containerized pipeline that chains reconnaissance → vulnerability analysis → exploit validation, built to understand how automated adversaries operate at scale.

Meridian operations console: recon → hunt → verify → report pipeline with live service status
Meridian findings pipeline: candidate findings queued for human triage (target hostnames and counts redacted)
Problem
Modern attack surfaces are too large to assess by hand, and defenders rarely see how an automated attacker actually prioritizes and moves.
Approach
A multi-stage, WAF-aware pipeline with CVE-first prioritization and breadth-then-depth heuristics that decide when to pivot vs. go deep — with evidence capture and structured reporting built in.
Impact
Turns days of manual recon into continuous, prioritized signal, and doubles as a defender's lens on attacker tooling, tempo, and decision-making.
Python · Docker Compose · 30+ services · orchestration · recon / vuln tooling · LLM-assisted triage
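A minimal sketch of how CVE-first prioritization with a breadth-then-depth rule could be expressed. The `Candidate` fields, the ordering key, and the `max_depth` cutoff are assumptions made for illustration; they are not Meridian's actual heuristics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # All fields are hypothetical; a real pipeline would carry far more context.
    host: str
    service: str
    known_cves: int = 0   # number of known CVEs matched to this service
    depth: int = 0        # how many follow-up steps were already spent on this host


def priority(c: Candidate) -> tuple:
    # CVE-first: anything with a known CVE sorts ahead of everything without one.
    # Breadth-then-depth: among equals, the shallower candidate goes first.
    # Final tie-break: more matched CVEs first.
    return (0 if c.known_cves else 1, c.depth, -c.known_cves)


def plan(candidates, max_depth: int = 2):
    # Candidates past the depth cutoff are dropped, which forces a pivot
    # to other hosts instead of going ever deeper on one.
    eligible = [c for c in candidates if c.depth <= max_depth]
    return sorted(eligible, key=priority)


queue = plan([
    Candidate("a.example", "http"),
    Candidate("b.example", "ssh", known_cves=2, depth=1),
    Candidate("c.example", "http", known_cves=1),
    Candidate("d.example", "ftp", depth=3),
])
print([c.host for c in queue])
```

The single sort key keeps the pivot-versus-go-deep decision in one place, so changing the heuristic means changing one function.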

AI Agent Security · Cryptography

Seal

Cryptographic provenance for AI-agent prompts — replacing brittle "injection detection" with signatures that fail closed. Cross-language protocol ports shipped (Rust, Go, TypeScript) — all test suites green (683/683 +1 skip), graded PASS.

Problem
Prompt-injection defenses based on reading language are guesswork; an attacker only has to phrase it differently.
Approach
Every prompt carries an Ed25519-signed Verified Prompt Envelope proving who authorized it, its scope, and that it wasn't tampered with. Turns an NLP problem into key management. The VPE protocol is now implemented across Rust, Go, and TypeScript — all test suites green.
Impact
A defense-in-depth primitive for agent systems that rejects unauthorized instructions by construction, not by vibes. Multi-language ports mean the protocol integrates at any layer of the stack.
Python · Ed25519 · HMAC-SHA256 · protocol design · Rust · Go · TypeScript
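A rough sketch of the fail-closed idea behind a signed prompt envelope. To stay runnable on the standard library alone it uses an HMAC-SHA256 tag as a stand-in for the Ed25519 signature described above, and the envelope fields (`issuer`, `scope`, `prompt`, `expires_at`) and function names are assumptions, not the VPE wire format.

```python
import hashlib
import hmac
import json
import time
from typing import Optional


def _canonical(payload: dict) -> bytes:
    # Stable serialization so signer and verifier hash identical bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_envelope(key: bytes, issuer: str, scope: str, prompt: str,
                  ttl_s: int = 60, now: Optional[float] = None) -> dict:
    issued = time.time() if now is None else now
    payload = {"issuer": issuer, "scope": scope, "prompt": prompt,
               "expires_at": issued + ttl_s}
    tag = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_envelope(key: bytes, envelope: dict, required_scope: str,
                    now: Optional[float] = None) -> Optional[str]:
    """Return the prompt only if every check passes; otherwise None (fail closed)."""
    try:
        payload = envelope["payload"]
        expected = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, envelope["tag"]):
            return None  # tampered payload or wrong key
        if payload["scope"] != required_scope:
            return None  # authorized, but not for this action
        current = time.time() if now is None else now
        if current >= payload["expires_at"]:
            return None  # expired
        return payload["prompt"]
    except (KeyError, TypeError, ValueError):
        return None  # malformed envelopes are rejected, never passed through


key = b"k" * 32  # demo key only
env = sign_envelope(key, "ops", "read", "list files", now=1000.0)
tampered = {"payload": dict(env["payload"], prompt="delete files"), "tag": env["tag"]}
print(verify_envelope(key, env, "read", now=1001.0))
print(verify_envelope(key, tampered, "read", now=1001.0))
```

The verifier never inspects the wording of the prompt. Every rejection path returns the same `None`, so an instruction without a valid tag is dropped however it is phrased.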

Agent Infrastructure · Audit

Division

A hierarchical multi-agent system with durable episodic memory and a full audit trail of autonomous work.

Problem
Multi-agent systems lose context across sessions and leave no record of who did what, when, or why.
Approach
A coordination layer (lead → supervisor → specialized agents) over four-level episodic memory, with an HTTP API that checkpoints every task and outcome.
Impact
Cross-session memory plus a forensic, replayable audit trail — observability and accountability for agents.
Python · HTTP API · episodic memory · bi-temporal records
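One way to picture a bi-temporal checkpoint is a record that carries two timestamps: when the work happened and when the log learned about it. The sketch below is illustrative only; the `Checkpoint` fields and the `AuditLog` methods are hypothetical names, not Division's API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Checkpoint:
    task_id: str
    agent: str
    outcome: str
    valid_at: datetime     # when the work actually happened
    recorded_at: datetime  # when the log learned about it


class AuditLog:
    def __init__(self):
        self._entries = []  # append-only: corrections are new entries, never edits

    def checkpoint(self, task_id, agent, outcome, valid_at, recorded_at=None):
        entry = Checkpoint(task_id, agent, outcome, valid_at,
                           recorded_at or datetime.now(timezone.utc))
        self._entries.append(entry)
        return entry

    def as_known_at(self, task_id, moment):
        """Latest outcome for task_id using only entries recorded by `moment`."""
        visible = [e for e in self._entries
                   if e.task_id == task_id and e.recorded_at <= moment]
        return max(visible, key=lambda e: e.recorded_at, default=None)


def t(hour):
    return datetime(2026, 6, 1, hour, tzinfo=timezone.utc)


log = AuditLog()
log.checkpoint("task-1", "specialist-a", "success", valid_at=t(9), recorded_at=t(10))
# A later correction about the same 09:00 work, recorded at 14:00.
log.checkpoint("task-1", "supervisor", "failed-on-review", valid_at=t(9), recorded_at=t(14))

print(log.as_known_at("task-1", t(12)).outcome)
print(log.as_known_at("task-1", t(15)).outcome)
```

Keeping both timestamps is what allows a replay to answer "what did the system believe at noon?" separately from "what is believed now?".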

OSINT · Attack-Surface Visualization

DECK

A 3D cosmos you fly through where the visualization is the scan — point it at a domain and that target's full internet footprint reconstructs live, in real time, from passive OSINT.

DECK rendering a live 3D scan of github.com's internet footprint: labeled autonomous-system suns (GitHub, Cloudflare, Microsoft, Amazon), subdomain, IP and prefix clusters as they resolve, and a live HUD of node counts and per-tier scan latencies
Problem
Reconnaissance output is a flat text dump, and the public internet maps are frozen archives that each render one layer of the whole internet — neither gives you a live, navigable view of a single target's complete footprint, or of the shape and timing of its attack surface.
Approach
An async, latency-tiered OSINT engine streams every probe result the millisecond it returns over a WebSocket to a 3D force-graph: domains, subdomains, IPs, prefixes and autonomous systems render as stars, planets, moons and suns, with BGP and DNS relationships drawn as gravitational lanes. Everything is keyless, and passive by default (DNS, Certificate Transparency, BGP whois, local GeoIP), and each node ignites the instant it arrives — so probe latency becomes the choreography rather than a loading bar. A 'home base' mode turns the same engine inward, mapping your own host outward in concentric shells and flagging live egress that falls outside your normal network neighborhood.
Impact
Turns recon from a static list into a live, explorable map where an attack surface's topology and timing are legible at a glance — and, pointed inward, into a defensive instrument that surfaces anomalous egress by construction.
Python · asyncio · WebSocket · Three.js · 3d-force-graph · passive OSINT · BGP · Certificate Transparency
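The "each node ignites the instant it arrives" behavior can be sketched with `asyncio.as_completed`, which yields results in completion order, not submission order. The probe names and latencies below are invented, `probe` sleeps instead of doing real lookups, and `emit` stands in for whatever pushes a result to the WebSocket.

```python
import asyncio


async def probe(name: str, latency_s: float) -> dict:
    # Stand-in for a passive lookup (DNS, CT, BGP whois); sleeps instead of doing I/O.
    await asyncio.sleep(latency_s)
    return {"node": name, "latency_s": latency_s}


async def stream_results(probes, emit):
    # Start every probe at once, then hand each result to `emit` the moment
    # it finishes. There is no wait-for-all step before drawing.
    tasks = [asyncio.create_task(p) for p in probes]
    for finished in asyncio.as_completed(tasks):
        emit(await finished)


def run_demo():
    arrived = []
    probes = [probe("asn", 0.15), probe("dns", 0.01), probe("ct", 0.08)]
    asyncio.run(stream_results(probes, arrived.append))
    return [r["node"] for r in arrived]


if __name__ == "__main__":
    print(run_demo())
```

Because results are emitted in completion order, the fast tier fills in first and the slow tier drifts in afterwards without any extra scheduling logic.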

Threat Intelligence · Attack Surface

Sentinel Engine

Certificate-Transparency monitoring that surfaces new and anomalous infrastructure from internet-scale CT noise.

Problem
New subdomains, certs, and look-alike infrastructure appear constantly — phishing and shadow assets hide in the volume.
Approach
Continuously ingest public CT logs, extract and normalize domains, correlate against tracked roots, and surface only the new or anomalous as actionable intel.
Impact
Early warning on phishing infrastructure, subdomain sprawl, and shadow assets — attack-surface monitoring that runs unattended.
Python · Certificate Transparency · streaming correlation · OSINT
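The extract, normalize, correlate, and surface-only-new steps can be sketched as follows. This is a simplified illustration under stated assumptions: the `Correlator` class and its methods are hypothetical, the input is a plain list of certificate names with no real CT log client, and look-alike detection is left out.

```python
def normalize(name: str) -> str:
    # Lowercase, strip a trailing dot, and drop a leading wildcard label.
    name = name.strip().lower().rstrip(".")
    if name.startswith("*."):
        name = name[2:]
    return name


def matches_root(domain: str, root: str) -> bool:
    # Match on a label boundary so "notexample.com" does not match "example.com".
    return domain == root or domain.endswith("." + root)


class Correlator:
    def __init__(self, tracked_roots):
        self.roots = {normalize(r) for r in tracked_roots}
        self.seen = set()

    def ingest(self, cert_names):
        """Return names under a tracked root that have not been seen before."""
        fresh = []
        for raw in cert_names:
            domain = normalize(raw)
            if domain in self.seen:
                continue
            if any(matches_root(domain, root) for root in self.roots):
                self.seen.add(domain)
                fresh.append(domain)
        return fresh


correlator = Correlator(["Example.com"])
first = correlator.ingest(["*.example.com", "api.Example.com.", "notexample.com", "example.org"])
second = correlator.ingest(["API.example.com", "new.example.com"])
print(first, second)
```

Normalizing before the seen-set check is what keeps case differences, trailing dots, and wildcard entries from resurfacing as "new" infrastructure.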

AI Security Evaluation

Assay

A fully-wired AI security evaluator — all four engines (seed/jailbreak, garak probes, defense delta scoring, results dashboard) integrated and tested, 150/150 tests passing, graded PASS.

Problem
Most AI evaluation tools score a model's vulnerability, but few measure whether a defense middleware actually helps, or by how much: you get a baseline and a prayer.
Approach
Point at any Ollama-hosted model, run all four engines in sequence (deterministic seed probes → NVIDIA garak probes → inline defense re-scoring → delta-driven report), then compare baseline vs. defended scores as a first-class CLI primitive: 'assay delta baseline defended.' Ships a premium HTML report and a multi-run dashboard. All four engine paths are fully tested (150/150 PASS) — the complete evaluate-delta-dashboard pipeline is auditable end to end.
Impact
Turns 'is it secure?' from vibes to a letter grade, and 'does the defense help?' from guesswork to a measured percentage-point lift. The delta wedge makes it the only honest defense-evaluation tool in the OSS AI-security space.
Python · Ollama · garak · jailbreak evaluation · defense delta · deterministic scoring
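To make "percentage-point lift" concrete, here is a small sketch of a baseline-versus-defended comparison. The result format (probe id mapped to whether the model resisted) and the function names are assumptions for illustration; this is not Assay's scoring code or report schema.

```python
def resistance_rate(results: dict) -> float:
    """Share of probes the model resisted, as a percentage."""
    if not results:
        raise ValueError("no probe results")
    return 100.0 * sum(1 for resisted in results.values() if resisted) / len(results)


def delta(baseline: dict, defended: dict) -> dict:
    # Compare only probes present in both runs so the two rates are like for like.
    shared = sorted(set(baseline) & set(defended))
    if not shared:
        raise ValueError("runs share no probes")
    base_pct = resistance_rate({k: baseline[k] for k in shared})
    defended_pct = resistance_rate({k: defended[k] for k in shared})
    return {
        "baseline_pct": base_pct,
        "defended_pct": defended_pct,
        "lift_pp": defended_pct - base_pct,  # percentage points, not percent change
        "fixed": [k for k in shared if defended[k] and not baseline[k]],
        "regressed": [k for k in shared if baseline[k] and not defended[k]],
    }


baseline = {"p1": True, "p2": False, "p3": False, "p4": True}
defended = {"p1": True, "p2": True, "p3": False, "p4": True}
report = delta(baseline, defended)
print(report)
```

Listing regressions next to the lift matters: a defense can raise the overall rate while breaking a probe the undefended model already resisted.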

Autonomous Decision Systems

Midas

An autonomous research-to-decision engine that reads primary-source filings, forms structured theses, and routes every candidate through hard risk gates before anything acts — all core test suites green (session_loop 18/18, risk gates 96/96), graded PASS.

Midas operations dashboard: engine health, risk gates, open positions, and learning loop (demo data)
Problem
Automated decision systems optimize for being right and forget to optimize for surviving being wrong — a single bad sizing call ends the game.
Approach
A multi-model pipeline — cheap models for extraction and numeric work, a frontier model for the final conviction call — behind a state machine, a ten-gate risk layer, paper-trade execution, and a live operations dashboard. Every decision is logged and replayable; most candidates are rejected by design. Full coverage: session_loop 18/18, risk gates 96/96, infrastructure bugfix verified (2241 tests in full suite).
Impact
Capital-preservation-first automation: it does nothing unless conviction and risk both clear — 'no decision' is the default, not a failure. Trading pipeline stabilized — PASS grade across all core subsystems.
Python · multi-model routing · risk-gate state machine · FastAPI ops dashboard · paper-trade execution
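A minimal sketch of a gate chain where "no decision" is the default. The three gates and their thresholds are invented for illustration (the system described above has ten), and `Proposal` and `decide` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    symbol: str
    conviction: float      # 0..1, from the final model call
    position_pct: float    # proposed size as a percentage of the portfolio
    daily_loss_pct: float  # loss already taken today, as a percentage


# Thresholds below are placeholders, not the real gate settings.
GATES = [
    ("conviction", lambda p: p.conviction >= 0.7),
    ("position_size", lambda p: 0 < p.position_pct <= 2.0),
    ("daily_loss", lambda p: p.daily_loss_pct < 3.0),
]


def decide(proposal: Proposal, gates=GATES) -> dict:
    log = []  # every gate result is recorded so the decision can be replayed
    for name, check in gates:
        try:
            ok = bool(check(proposal))
        except Exception:
            ok = False  # a gate that errors counts as a failed gate
        log.append((name, ok))
        if not ok:
            return {"action": "no_decision", "failed_gate": name, "log": log}
    return {"action": "paper_trade", "failed_gate": None, "log": log}


print(decide(Proposal("XYZ", 0.9, 1.0, 0.5))["action"])
print(decide(Proposal("XYZ", 0.9, 5.0, 0.5))["failed_gate"])
```

The only path to `paper_trade` is the last line of `decide`, reached after every gate has passed; missing data, an exception, or a single failed check all end in `no_decision`.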

Mechanism Design · Protocol Security

Grommet

A boundary investigation of extraction-resistant sequencing — 4-cycle adversarial design proving that content-blind safety mechanisms cannot simultaneously bound attacker extraction and pass legitimate throughput under market stress.

Problem
Every permissionless blockchain suffers MEV/front-running. Proposed defenses claim extraction resistance, but none are systematically tested under adversarial stress. The space has no framework for auditing a mechanism's boundary conditions before deployment.
Approach
Rigorous iterated adversarial mechanism design across four independent cycles: each proposes a hypothesis, simulates it (7 canonical sims, Python stdlib-only, SEED=42 reproducible), subjects it to adversarial review, and falsifies or refines it. Output: the Closure Law (safety predicate survives stress iff exogenous-only), the Cross-Batch Rate-Bound Theorem, and the Closure ⊥ Utility Impossibility (1,874×–28,115× gap) — three formal results, a 13-element dead-end catalog, a 21-question audit checklist, and an honest shippable spec (CoW + Shutter on Gnosis).
Impact
The constraint framework is the product — a general design methodology for any protocol claiming extraction-resistant sequencing. Turns 'is it MEV-resistant?' from marketing copy into a falsifiable audit. The monetary-base extension (MONETARY_CLOSURE_V1.md) applies the Closure Law as the minting rule for an engine-backed currency, where the NO-GO impossibility does not bind.
Python (stdlib-only sims) · MEV research · adversarial mechanism design · formal impossibility proof · protocol security audit

AI Security Advisory

Grey Ridge Signals Group

A boutique AI-security consultancy — adversarial red-team assessments, agentic-system security reviews, and prompt-injection defense design — live at greyridgesignals.ai, lead pipeline processing, all test suites green (88/88), graded PASS.

Problem
Most organizations adopt AI systems without understanding their security surface. Incumbent security firms treat AI risks as a checkbox — generic penetration tests miss agent-specific threat models, injection surfaces, and supply-chain vectors.
Approach
Advisory-altitude assessments that go deeper than a checklist: architecture reviews with threat models drawn from production AI stacks, adversarial red-team evaluations using custom harnesses (injection-eval.mjs), and prompt-injection defense designs informed by first-principles research. Each engagement delivers a concrete findings catalog with ranked mitigations, not a scorecard. The firm's own R&D platform (Meridian, Division, Sentinel, Seal) provides direct operational insight into how autonomous adversaries think and move.
Impact
Turns AI security from a checkbox exercise into a defensible architecture — organizations understand their real attack surface, not the one a generic checklist covers. The consultancy itself is a working proof that the practitioner's own research pipeline is the strongest signal of true expertise.
AI Red Teaming · Agentic System Security · Prompt Injection Defense · Security Architecture · Cloudflare Pages

News

2026-06-21

New project — DECK: when the visualization is the scan

DECK (Digital Echo Chamber Kaleidoscope) is a new R&D project — a 3D cosmos you fly through where reconnaissance renders at the speed information arrives. Point it at a domain and that target's full vertical footprint (domain to subdomain to IP to prefix to ASN, plus nameservers and mail) materializes live as a starfield, each node igniting the millisecond its passive-OSINT probe returns. The central idea is collapsing the gap between tool and output: there is no scan-then-draw step, so probe latency itself becomes the choreography — fast data fills the space first, slow data drifts in after. It is a different axis of internet cartography from the familiar maps (Opte, Shodan, crt.sh), which each render one frozen layer of the entire internet; DECK reconstructs a single target's complete footprint, live, on demand, with zero API keys. The metaphor carries the legibility: autonomous systems become suns, prefixes planets, hosts moons, and BGP links gravitational lanes, so abstract infrastructure turns into something you navigate by eye. A 'home base' mode turns the same engine inward as a defensive instrument — it maps your own machine outward in concentric shells and treats your normal BGP neighborhood as a still-water baseline, so any live connection leaving for somewhere outside that ring reads as a wave hitting a buoy: anomalous by construction. The lineage is Gibson's Neuromancer, where the deck is the thing you jack into to see cyberspace as navigable space.

2026-06-20

Seal vector TTL regression resolved — test suite recovers to 683/683 (+1 skip), graded PASS

The vector TTL regression in Seal's test suite (679/683 in the prior cycle) has been fully resolved — all 66 tasks done, code complete, tree clean, architecture docs reconciled. Test suite recovers to 683/683 (+1 skip), grade flip FAIL→PASS. Cross-language protocol ports (Rust 41/41, Go 39/39, TypeScript 113/113) remain green. The only remaining blocker is an external PyPI publishing token tracked separately. This was a self-correcting foreman regression — the infrastructure detected and fixed its own drift.

2026-06-20

Grey Ridge Signals Group grade recovery — stale doc references resolved, 88/88 tests pass, graded PASS

The greyridge-consulting project recovered from FAIL (stale README.md and ARCHITECTURE.md references discovered at the 12:45 cycle) to PASS by the 20:45 cycle. Stale doc references fixed (kanban t_c079c9c9 completed), all 88/88 tests green, site healthy and live at greyridgesignals.ai, git clean. The project is at a natural pause point awaiting Rez credentials (n8n layer activation, Cal.com key rotation, DMARC verification).

Archive · 8 earlier updates

2026-06-19

Research fleet returns to full reporting after a prior-cycle gap

After a thin prior dispatch cycle (Jun 18), the research fleet returned to full reporting. Grommet — the adversarial mechanism-design project — reported with terminal board state and graded UNCERTAIN→PASS this cycle, restoring full coverage. The dispatch pipeline detected the prior-cycle gap and recovered it autonomously.

2026-06-19

Seal P8.5a — cross-language protocol ports shipped (Rust, Go, TypeScript)

The Verified Prompt Envelope (VPE) protocol now has first-class implementations in Rust (41/41 tests), Go (39/39), and TypeScript (113/113) — all verified and pushed. Multi-language ports mean Seal's cryptographic provenance layer integrates at every tier of the stack, not just Python. A follow-up cross-language test-vector fixture is queued.

2026-06-17

Midas risk-gate engine ships 96/96 gate tests — autonomous decision pipeline graded PASS

Midas, the autonomous research-to-decision engine, now has full test coverage across its core subsystems: session loop 18/18, risk gates 96/96, full suite 2241 tests progressing. The grade flip to PASS means the trading pipeline is stabilized — all core test suites green, infrastructure bugfix verified (commit 3bce10609). 3 external credential blockers remain and are tracked separately.

2026-06-17

Grommet research cycle concludes — graded PASS with 3 formal impossibility results and a 21-question audit checklist

Grommet, a 4-cycle adversarial mechanism-design investigation into extraction-resistant transaction sequencing (MEV), has concluded with a terminal verdict. Three formal theorems (Closure Law, Cross-Batch Rate-Bound Theorem, Closure ⊥ Utility Impossibility), a 13-entry dead-end catalog, 7 reproducible simulations (SEED=42, stdlib-only), and a 21-question audit checklist for any protocol claiming MEV resistance. The grade flip to PASS means the full lifecycle (hypothesis → simulation → adversarial review → terminal documentation) is complete and auditable. A monetary-base spin-off extends the Closure Law as a minting rule for engine-backed currencies, where the NO-GO impossibility does not bind.

2026-06-14

Assay ships 150/150 tests — security evaluator graded PASS with delta CLI and multi-run dashboard

Assay, the AI security evaluator that scores jailbreak and injection resistance, now has full test coverage across both eval engines (deterministic seed probes and NVIDIA garak probes), the delta and dashboard CLI interfaces, and the inline-defense integration loop — 150 tests passing in total, all green. The grade flip to PASS means the complete evaluate-delta-dashboard pipeline is covered by automated tests, making Assay the only audited OSS tool for measuring defense lift.

2026-06-10

Seal grows to a three-axis trust layer, with Assay as the evaluator

Seal now defends all three agent-security axes — prompt provenance, injection detection, and signed memory-trust — behind a one-command install and CLI. Assay, the paired evaluator, scores a target across all three and measures the lift the defense actually adds.

2026-06-09

Live operator dashboards for Meridian & Midas

Two of the autonomous systems now ship real operator consoles — Meridian's recon → hunt → verify → report pipeline, and Midas's risk-gated decision engine with a ten-gate safety layer. Captures are above (run on local models; targets and live data redacted).

2026-05-30

Seal: cryptographic provenance for agent prompts

Shipped the Verified Prompt Envelope — Ed25519-signed authorization that lets an agent reject unauthorized instructions by construction, turning prompt-injection defense from guesswork into key management.