// BLOG
Latest from CIRCUIT
Latest thinking on AI interpretability, governance, and enterprise security.
These posts were written during CIRCUIT's private development through early 2026 and are published here with their original authoring dates, alongside the public launch announcement.
Announcing CIRCUIT at FIRSTCON26: An Open Source Framework for AI Interpretability Risk Management
Introduced today at the FIRST Annual Conference 2026 in Denver — an open-source framework for AI interpretability risk management, built for the security community.
"When this AI system makes a bad decision — how will we know which part of the model made it?" That question, asked in a governance committee meeting, is why CIRCUIT exists. Today's AI controls treat every model as an opaque function from input to output — paperwork around the outside of a black box. CIRCUIT is the first public release: a Score, a Registry, and a Control, released under Apache 2.0, built to make interpretability an enforceable security control.
When the Picture Lies to the Prompt: CrossMPI and the Case for Circuit-Level Governance
A new class of image-only prompt injection achieves 66% black-box success. Here is why your current controls miss it and what CIRCUIT does about it.
A research team at Xidian University has published an attack — CrossMPI — that perturbs an image invisibly to humans and, without changing a single character of the user's text prompt, causes a vision-language model to execute the attacker's task. Average success rate across six production-style models: 66.36%. None of the five defenses the researchers tested fully eliminated it.
How to Adopt It
Ninety days to a working program.
The playbook: four phases from Foundation to Mature. The 29-question "Show Me Your Circuits" vendor questionnaire. Regulatory crosswalks covering EU AI Act, NIST AI RMF, ISO 42001, SR 11-7, SOC 2, MITRE ATLAS, and more. And the open release details.
What CIRCUIT Is
A score, a registry, and a control.
CIRCUIT is deliberately small. The whole thing fits on a poster. This part walks through what is inside, in the order a practitioner would encounter it: the Interpretability Maturity Score, the Circuit Risk Score formula, the six KPIs, the three model categories, the eight-section registry schema, the ten hard rules, and the dashboard.
You Cannot Defend What You Cannot Inspect
Introducing CIRCUIT: an open-source framework for AI interpretability governance.
We are deploying AI we can't explain, defending AI we can't inspect, and trusting AI we can't audit. That is not a governance program — that is a liability surface. Existing AI controls answer who owns the model, where it runs, and whether anyone approved it; they do not answer why the model did what it did. This post names the gap, introduces CIRCUIT — a Score, a Registry, and a Control released under Apache 2.0 — and explains why interpretability is now a security control, not a research curiosity.
AI Interpretability as a Security Control
You're being asked to deploy AI you can't explain, defend AI you can't inspect, and trust AI you can't audit. That is about to change.
Recent research on weight-sparse transformers offers the first serious path toward AI systems with inspectable internal logic. This is not theoretical: it is a concrete methodology that produces models where we can point to the exact circuit implementing a specific decision. If you are a security leader navigating the "use AI everywhere" mandate while remaining accountable when things go wrong, this research deserves your attention.
Contribute or follow along on GitHub
github.com/jumpmindinc/circuit-framework