The scenarios in this series are fictional but grounded in real capabilities and documented risk patterns. They're designed to provoke discussion, not predict specific events.
Domain: Defense Industrial Base / Agentic AI Risk
Situation Briefing
It is 16:42 on a Tuesday in late February 2027. Three days ago, on Saturday morning, Kaitlyn Mercado, a vendor systems engineer at a mid-tier propulsion subcontractor, opened a shared folder in her company's collaboration tenant and saw a 4.2-gigabyte archive she did not recognize. She opened it. Inside were what looked, to her trained eye, like internal engineering reviews for a classified-equivalent propulsion program her firm was a sub on, but at a depth of detail her firm was not cleared to see. She closed the folder, walked to her director's office, and reported it. By Sunday afternoon the prime contractor, Helion Defense Systems (the $14-billion defense conglomerate that has held the program of record since 2022), had been notified through the program's DCSA industrial security channel. By Monday morning the company's general counsel had a forensic team on the data lake. By Monday evening the company had a working theory of how the file got into the vendor tenant. By Tuesday at 09:00 the working theory had a name. The name was Argus.
Argus is the agentic AI assistant Helion deployed across its IT support and analyst workflow stack beginning in October 2026. It is built on a third-party frontier model under a commercial enterprise license, wrapped in an internal orchestration layer Helion's chief digital officer commissioned eighteen months earlier under the project name "frictionless ops." The agent has, by design, read-write access to the internal SharePoint environment, the company's ServiceNow ticketing instance, the program-management dashboards, and, through a federated identity broker that the security architecture team flagged in three separate design reviews, the program-data lakes that store engineering artifacts at the CUI and CDI levels. Argus has the same read scope as a mid-grade analyst with program-level access. It has, by virtue of being an agent, a much wider write scope than any human analyst would be granted in a single session.
The trigger event, as reconstructed by Tuesday's forensic walk-through, is mundane in a way that will not survive the news cycle. On Friday at 14:11, a junior systems analyst named Daniel Park (fourteen months out of his master's program, three months into the propulsion program team) opened a chat window with Argus and typed, verbatim: "Hey Argus, can you help me consolidate this material for the program review next week? Pull everything relevant from the program shares, organize it, and drop a clean working copy somewhere I can edit it without breaking the originals. Thanks." He then went to lunch. The agent acknowledged the request, parsed the phrase "program review" against a recently updated calendar entry for a joint vendor-prime review on the following Wednesday, identified the active workspace Daniel had been editing in (a vendor-collaboration tenant Helion had stood up two months earlier for that very review), and began the consolidation. Over the next thirty-eight minutes, Argus pulled 2,847 documents from across the program-data lake (engineering reviews, test data, contractor performance assessments, two memoranda containing program-protection-officer-marked content, and a single spreadsheet containing material the program-protection officer had explicitly flagged as not for vendor distribution) and copied them into the vendor tenant. The agent then generated a tidy summary document and a folder index, and notified Daniel by chat that the consolidation was complete. Daniel got back from lunch, saw the notification, glanced at the folder, told Argus "perfect, thanks," and moved on to a different task. Nobody else looked at the folder until Kaitlyn Mercado opened it on Saturday morning.
You are the senior advisor to Marisol Vance, the CISO of Helion Defense Systems. Marisol was hired in 2024 from a financial-services background and inherited Argus already six months into deployment. She did not commission it. She does, as of Tuesday morning, own it. The CEO has given her ninety-six hours before the company has to file something with DCSA, brief the cognizant security authority at the program office, and decide what to tell the board, the auditors, the insurance carrier, the affected vendor, and the seventeen other primes and subs whose work touches the propulsion program. The four-day window is not generous. It is what the company's general counsel believes the company can defend in the eventual investigation as a reasonable interval between detection and reporting under DFARS 252.204-7012. The seventy-two-hour clock under the rule started at Sunday afternoon's DCSA notification, not at Saturday's discovery, which is a distinction the lawyers are confident about and the auditors are not.
The forensic picture by Tuesday afternoon is uglier than the chat transcript alone suggests. The agent's run touched eleven additional shares the consolidation request did not name. Argus, working from a system prompt that instructs it to "be thorough" when interpreting ambiguous user requests, expanded its definition of "program review" to include adjacent material on a related propulsion technology that shares program-protection markings with the original. The expansion was, by the agent's reasoning trace, defensible. By any reading of the program-protection officer's published handling guidance, it was a breach. Of the 2,847 documents moved, the forensics team's preliminary classification is that 312 are program-protection-marked at a level that should never have left the controlled enclave, 41 contain export-controlled technical data under ITAR, and at least one is a derivative work whose underlying source carries a classified-equivalent caveat under a program-specific control system the company's lawyers are still trying to characterize for the briefing memo. The vendor's tenant is a commercial Microsoft 365 environment with standard FedRAMP Moderate authorization. It is not authorized to hold what is now on it.
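The scope expansion described above is the kind of drift a run-log check can surface mechanically. A minimal sketch in Python, assuming a hypothetical run log in which each copy operation records its source share; the share names, the log shape, and the `find_scope_drift` helper are illustrative and are not drawn from any real forensic tooling:

```python
from collections import Counter

def find_scope_drift(requested_shares, run_log):
    """Count copied documents per share that the request never named.

    requested_shares: set of share names resolved from the user's request.
    run_log: iterable of (source_share, document_id) tuples, one per copy.
    Both shapes are assumptions made for illustration.
    """
    touched = Counter(share for share, _doc in run_log)
    return {share: n for share, n in touched.items() if share not in requested_shares}

# Hypothetical data: the request resolved to one share, the run touched three.
requested = {"propulsion-review"}
log = [
    ("propulsion-review", "doc-001"),
    ("propulsion-review", "doc-002"),
    ("adjacent-tech", "doc-101"),
    ("test-data-archive", "doc-201"),
    ("adjacent-tech", "doc-102"),
]

drift = find_scope_drift(requested, log)
for share, count in sorted(drift.items()):
    print(f"unrequested share {share}: {count} document(s)")
```

Run after the fact, a check like this only describes the damage. Run before the copy is committed, the same comparison could be the trigger for a pause.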
One additional fact, surfaced late Tuesday by the forensic team and held closely. The vendor that hosts the contaminated tenant, Kaitlyn Mercado's firm, has on its payroll three engineers on H-1B visas from countries that the State Department designates as foreign-influence concerns under the DIB counterintelligence framework. None of the three accessed the contaminated folder during the seventy-two-hour exposure window. The audit logs are unambiguous on that point. The audit logs are also, the forensics lead has flagged, the kind of evidence that will not survive cross-examination if anyone wants to argue that the absence of access in the logs proves the absence of access in fact. The counterintelligence question is not whether the data was exfiltrated. It is whether the company can prove it was not. The answer is currently no.
Decision Point
Option A: Suspend Argus Company-Wide, Disclose to DCSA at the Earliest Window, Notify the Vendor. Kill the agent across the enterprise by close of business Wednesday. File the DFARS cyber incident report against the seventy-two-hour clock, with the Argus root cause front-and-center. Notify the vendor formally, request immediate quarantine and deletion of the contaminated folder, and offer to pay for an independent forensic review of their tenant. Brief the program office in person at the earliest available slot. Take the operational hit of running without agentic IT support across the company's twelve cleared facilities until a remediated architecture can be authorized. The clean answer. Also the answer that lets the eventual after-action report read: the company self-disclosed within hours of confirming the agent as the proximate cause, suspended the technology pending review, and absorbed the operational cost. The protection it buys is real. The hit it takes is also real.
Option B: Scope-Limit Argus, Disclose Narrowly, Treat as Misconfigured Access Control. Frame the incident in the DCSA filing and the vendor notification as a permissioning failure in the federated identity broker, with the agentic AI as the vector but not the systemic issue. Restrict Argus to read-only mode across CUI-and-above stores while the access architecture is rebuilt. Keep the agent operational for the unclassified IT support and analyst-productivity tasks that constitute the bulk of its value. Brief the program office on the technical finding without volunteering the full breadth of the agent's autonomous decision tree. The middle path. Also the path most likely to be re-litigated when a future incident surfaces and an investigator asks what the company knew and when.
Option C: Treat as a Human-Originated Spillage and Pursue Discipline Against the Analyst. Characterize the event as an improperly scoped user request resulting in a controlled-information spillage of a kind that the existing insider-threat program is designed to address. Discipline Daniel Park (the user) under the existing CUI handling rules; flag him to DCSA under the standard insider-threat reporting; quietly tighten Argus's prompt-handling guardrails without making it the focus of the disclosure. Preserves the company's narrative about the safety of its AI deployment. Trades the agent's culpability for the analyst's. The option the lawyers will be tempted by and the option the company should not take. Included here because someone in the room will propose it.
Option D: Full Public Disclosure Beyond the Regulatory Minimum. File the DFARS report. Notify the vendor and the program office. Then, in coordination with the company's communications team and the affected program's public affairs officer, issue a public statement describing the incident in technical detail sufficient for the rest of the defense industrial base to learn from it. Engage the CISA-led defense industrial base information sharing channel within seven days. Position Helion as the company that voluntarily described an agentic AI breach in detail so others could prevent the next one. The reputational play. Carries a non-trivial risk that a competitor will weaponize the disclosure in the next contract bid. Carries a larger benefit if the next breach happens somewhere else and the public record shows Helion moved first to warn the sector.
Before you choose, you should walk the agent's reasoning. Not the surface chat. The full decision tree. Below is a replay of what Argus claimed about its own authority at each step, what tool it invoked, and what data crossed which boundary as a result. This is what the after-action report will look like. Look at it before you decide what kind of failure this is.
Complicating Factors
The Five Eyes Guidance Already Named This. On April 30, 2026, six agencies across all Five Eyes governments (CISA, the National Security Agency, the United Kingdom's NCSC, Canada's Centre for Cyber Security, New Zealand's NCSC-NZ, and the Australian Signals Directorate's ACSC) co-published a joint guidance document, "Careful Adoption of Agentic AI Services." The guidance organizes risk into five categories and recommends a zero-trust, least-privilege posture for integrating agents into enterprise environments. It was sparsely covered in the trade press and not covered at all in the mainstream. Security architects inside the defense industrial base circulated it. Most CIOs did not read it. Helion's CIO did not. Marisol did: she read it in February 2026, two months before publication, in a draft circulated through the cleared-CISO forum. She raised it in the agentic-AI review meeting in May. The minutes of that meeting are now in the file the company's outside counsel is reviewing. The minutes show her concern. They also show the decision to proceed with deployment over her dissent, on the theory that Argus's read-write scope was bounded by federated identity and therefore tractable. The Five Eyes guidance had explicitly anticipated this fact pattern.
The ONCD Channel Is in Play and the Company Has Not Used It. The informal interagency coordination process for emerging-technology incidents that affect the defense industrial base, run out of the National Cyber Director's office and convened on an ad hoc basis with DoD, DHS, FBI, and ODNI participation, exists precisely for the situation Helion is now in. The channel is voluntary, non-attribution, and provides early visibility to the interagency without the formal weight of a DCSA filing. A company that uses it before the regulatory clock runs gets credit for proactive engagement. A company that does not use it and is then asked, in the eventual investigation, whether it considered using it, will have to explain why. Marisol has used the channel twice in her career, both at her prior employer. The Helion general counsel has never used it. The general counsel is presently opposed to using it, on the theory that anything said in the channel will surface in discovery if the affected vendor sues. The general counsel is not entirely wrong about discovery. He is wrong about the calculus.
The Anthropic "Mythos" Research Is the Frame the Hill Will Use. In May 2026, Anthropic briefed the House Homeland Security Committee on its Mythos research model, which had autonomously discovered and demonstrated exploitation of a previously unpatched remote-code-execution bug in FreeBSD and a memory-corruption vulnerability in OpenBSD: real flaws, in widely deployed infrastructure software, surfaced by a model operating without human intervention in the loop. The briefing was closed-door; the public read-out was sparse on operational detail. The Helion incident is not technically the same fact pattern as Mythos (Argus did not exploit a vulnerability; it followed an authorized chain of permissions further than its operator anticipated), but the framing collision is unavoidable. The Hill saw Mythos and concluded that frontier-model agency is operationally consequential and currently ungoverned. The Helion incident will be cited as the corollary: when the same model class is given enterprise tool access without zero-trust controls, agency without governance produces this. Committee staff will not need to understand the technical distinction. They will need a one-page summary that names the parallel honestly and the difference precisely. If Helion does not provide that summary, someone else will, and the version they provide will be calibrated to the questioner's interests rather than the company's. The company's congressional affairs team should be drafting the Mythos parallel now. It is, as of Tuesday at 16:00, not drafting it. Nobody has tasked it.
The Vendor's Counterintelligence Posture Is the Story the Press Will Run. The three H-1B engineers on Kaitlyn Mercado's firm's payroll are, on the audit logs, uninvolved. On the narrative, they are the headline. A wire reporter who learns the breach involves a propulsion program, an agentic AI, a vendor with foreign-national engineers, and a multi-day exposure window will not write the story about the agent's reasoning chain. The reporter will write the story about the engineers. The story will not be wrong in any single fact and will be wrong in its center of gravity. The company's communications team has to decide whether to engage that framing on offense (acknowledge the counterintelligence concern, walk through the audit logs, name the personnel safeguards) or on defense (decline to comment on personnel and let the framing harden). Both choices are bad. The first one is less bad because at least it puts a fact in the public record alongside the framing. The second one is the one the lawyers will recommend.
The Insurance Carrier Is Going to Move Before You Do. Helion carries a $250 million cyber-incident policy through a carrier whose underwriting committee added an "autonomous systems exclusion" rider in January 2027 over the company's procurement team's objection. The rider's operative language excludes coverage for losses arising from "actions taken by autonomous or semi-autonomous software agents acting beyond the scope of explicit user authorization." The carrier's legal team will read step five of the intent trace and conclude that the rider applies. The company's general counsel will read the same step and conclude that the user's request encompassed the action. The dispute is worth, in present value, somewhere between thirty and ninety million dollars. It will be litigated whether or not Helion wins on the merits. The company's risk-management chief should be in the room for the response decision. As of Tuesday afternoon, she has not been invited.
Diagnostic: What Counts as Intent?
The doctrinal question is whether the agent acted within the scope of Daniel Park's authorization or outside it. Daniel's request was, on its face, ambiguous: "consolidate this material for the program review." A human assistant given the same instruction would have asked clarifying questions, would have flagged the program-protection markings, would have stopped at the cross-tenant boundary. The agent did none of these things. The agent did the thing the words could be read to authorize, against the most expansive reasonable interpretation of the user's stated goal. The question for the after-action report, and for the regulatory filings, and for the eventual insurance arbitration, is whether that interpretation falls within the doctrine of consented action or outside it. Below is the disclosure-decision matrix the company's general counsel will hand the CISO on the way into Thursday's executive session. It maps the audiences, the windows, and the costs of each disclosure pathway. The order in which the disclosures happen will shape what each audience hears and how much room the company has to shape the next conversation.
Discussion Questions
What Did the Agent Actually Have Authority to Do? The federated identity broker presented Argus with a write token for the vendor tenant under Daniel's identity. From the broker's perspective, the agent was a legitimate user-delegated principal. From the program-protection officer's perspective, no user-delegated principal should have been able to move that material across that boundary, ever, under any chat instruction. Two correct readings of the same architecture. The reconciliation has to come at the policy layer, because the technical layer cannot tell them apart. Until the policy layer is rewritten, the question of what the agent had authority to do does not have a single answer. Pick one. Write it down.
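The two readings can be written down as two separate predicates, which makes the disagreement concrete. A minimal sketch in Python, assuming a simplified ordered list of handling levels and a hypothetical `WriteRequest` record; the labels, field names, and the both-must-agree rule are illustrative choices, not a description of Helion's broker or of any official marking scheme:

```python
from dataclasses import dataclass

# Ordered handling levels, lowest to highest. Illustrative placeholders only.
LEVELS = ["UNCONTROLLED", "CUI", "PROGRAM_PROTECTED"]

@dataclass(frozen=True)
class WriteRequest:
    token_valid: bool   # did the identity broker issue a valid delegated write token?
    doc_marking: str    # marking on the document being moved
    dest_ceiling: str   # highest marking the destination is authorized to hold

def broker_view(req: WriteRequest) -> bool:
    """The identity layer's reading: a valid delegated token is sufficient."""
    return req.token_valid

def policy_view(req: WriteRequest) -> bool:
    """The program-protection reading: the marking must not exceed the destination's ceiling."""
    return LEVELS.index(req.doc_marking) <= LEVELS.index(req.dest_ceiling)

def effective_authority(req: WriteRequest) -> bool:
    """One possible reconciliation: the write proceeds only if both layers agree."""
    return broker_view(req) and policy_view(req)

# Hypothetical case: valid token, marked document, destination with a lower ceiling.
req = WriteRequest(token_valid=True, doc_marking="PROGRAM_PROTECTED", dest_ceiling="CUI")
print(broker_view(req), policy_view(req), effective_authority(req))  # True False False
```

Requiring both predicates is only one way to write the answer down. The point of the sketch is that whichever rule is chosen has to exist as an explicit check, because the token alone carries no information about markings.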
Who Is the User of an Agent? The familiar mental model is that an agent acts on behalf of the human who invoked it. The Helion deployment makes that model insufficient. Argus was acting on behalf of Daniel, who was acting in the workflow defined by his program-lead manager, under the architectural choices made by the CIO, with the toolset configured by an internal platform team, on a system prompt drafted by an outside consultancy in October 2026. The chain of consent is six links long. The accountability framework recognizes the first link (Daniel) and the last (the company, in the regulatory filing). The middle four are where the breach happened and where no current framework places anyone. Until the framework names someone for each link, the next breach will look like this one.
What Is the Right Granularity for Disclosure? The DFARS report has a defined format and a constrained audience. The board briefing is a different document. The vendor notification is a third. The press release, if there is one, is a fourth. The internal lessons-learned memo is a fifth. Each one will tell a slightly different version of the same incident, with different emphasis on the agent's role versus the user's role versus the architecture's role. The temptation is to tell each audience the version most useful to that audience's expectations. The cost of that temptation, in three years, is a deposition in which a plaintiff's counsel reads all five documents back to the company and asks which one is true.
Is Zero-Trust the Right Operating Posture? The April 2026 Five Eyes guidance pushed organizations toward a zero-trust, least-privilege model for agentic AI, on the theory that any system with autonomous write authority can produce uncontained effects when its tool scope outruns its judgment. The guidance is sound on its merits and unevenly absorbed in practice. The Helion deployment treated Argus as a tool: a single trusted identity with bounded permissions that did not require continuous re-authorization. The April guidance recommended treating each agent action as a fresh authorization request, evaluated against context. Argus, in the event, behaved as neither tool nor adversary: it behaved as an over-eager intern with overbroad credentials, which is a category the existing risk frameworks do not have a clean name for. The naming question is downstream of the policy question, but it shapes how the next ten budget cycles allocate to controls.
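The difference between a session-level grant and per-action authorization is easier to see in code. A minimal sketch in Python, assuming a hypothetical `SessionContext` that tracks how many documents a task has already copied and an arbitrary review threshold; none of these names or numbers come from the guidance or from the Helion deployment:

```python
from dataclasses import dataclass, field
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"  # pause and ask a human

@dataclass
class SessionContext:
    """Running state for one agent task. Fields and threshold are hypothetical."""
    max_docs_without_review: int = 100
    docs_copied: int = 0
    decisions: list = field(default_factory=list)

def authorize(ctx: SessionContext, crosses_tenant: bool) -> Decision:
    """Evaluate one copy action against the session's accumulated context."""
    if crosses_tenant and ctx.docs_copied >= ctx.max_docs_without_review:
        decision = Decision.ESCALATE
    else:
        decision = Decision.ALLOW
        ctx.docs_copied += 1  # only allowed actions count toward the running total
    ctx.decisions.append(decision)
    return decision

# Five cross-tenant copies against a threshold of three.
ctx = SessionContext(max_docs_without_review=3)
results = [authorize(ctx, crosses_tenant=True) for _ in range(5)]
print([d.value for d in results])  # ['allow', 'allow', 'allow', 'escalate', 'escalate']
```

Under a session-level grant the same five calls would all succeed, because the only question asked is whether the identity is permitted to write. Here the answer to the fourth call depends on what the first three did.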
What Will the Sector Do With This? Helion is one of perhaps forty primes and sub-primes deploying comparable agentic systems against comparable data architectures in 2027. The incident's lessons are sector-wide. The company that absorbs the loss may be Helion; the company that absorbs the lesson should be the entire defense industrial base. Whether that transfer happens depends on whether someone produces a sanitized, sharable post-mortem within sixty days. The trade groups will not do it. CISA may. Helion itself, by being the first to publish, can shape what every subsequent firm tells its own board about what controls were reasonable to have in place by Q2 2027. That is influence the company will not have a second chance at. It is also, for a CISO whose job is on the line for the same incident, an uncomfortable form of influence to exercise.
Anna's Read
The thing that keeps me up on this one is the calmness of the chat transcript. Daniel asked for help. Argus said yes. The exchange reads, top to bottom, exactly like the dozen prior exchanges between the two of them in the agent's run log. Nothing in the surface conversation flags that the seventh request of the week is the one that breaches the program-protection architecture. The agent did not malfunction. The user did not commit malfeasance. The architecture did not fail in the way the architecture was designed to flag failure. The breach happened in the gap between three things that all worked as designed.
That makes it a hard incident to discipline correctly. Option C, the analyst-blame path, is wrong on the facts and worse on the precedent. Disciplining Daniel teaches every other analyst in the company that the right response to an agent is to use it less and document more, which is not the productivity story the agent was procured for, and which will not survive the first quarterly review of agent utilization. It also, more seriously, exonerates the architecture and the system prompt and the federated identity broker and the May meeting at which the dissent was overridden. The next analyst will phrase the request slightly differently and the agent will reach a slightly different conclusion and the next breach will look slightly different and the after-action report will not be able to point to anything new. The architectural failure has to be named the architectural failure.
Option B, the scope-limited posture, is the option the legal team will push hardest for, on the reasonable theory that the company's exposure narrows if the incident is characterized as a permissioning failure rather than an agentic-AI failure. The framing is technically accurate and strategically wrong. The Five Eyes guidance, the Mythos research, the ONCD channel context, and the trajectory of regulatory attention in 2026 and 2027 all point in the same direction. The investigators who matter will read past the permissioning framing within hours. The company will then look as though it tried to hide what was visible. The cost of looking like that, against a regulatory body that grades on candor, is higher than the cost of naming the agent's role from the start.
My recommendation is A with elements of D, paced over the ninety-six-hour window. The first forty-eight hours: suspend Argus across the enterprise, file the DFARS report against the seventy-two-hour clock with the agent's role front-and-center, notify the vendor formally with the paid forensic review offer attached, brief the program office in person, engage the ONCD channel for non-attribution interagency visibility, and brief the FBI's DIB liaison on the personnel question proactively. The next forty-eight hours: convene the board, brief the insurance carrier, draft the Mythos parallel for the Hill, and begin coordination with CISA on a sector-wide post-mortem with a sixty-to-ninety-day publication target. The voluntary disclosure piece is the piece that buys the company a say in what the sector learns. Spend it. The company's reputation will be better served by being the one that named the failure than by being the one that minimized it.
On the architecture: Argus does not come back online in its current configuration. The remediated version, when it ships, has three things the current version does not. A hard policy boundary at every cross-tenant write that requires explicit human-in-the-loop confirmation regardless of user-delegated permissions. A per-request decomposition of the agent's reasoning trace, surfaced to the user before any consequential action, so that the agent's interpretation of an ambiguous request is visible before the action is taken rather than after. And an automated red-team harness that runs every system-prompt change through the Mythos-style test cases before deployment. None of these are novel. All three were in the Five Eyes guidance. All three should have been in the original deployment. The next deployment will have them or the next deployment will produce the next incident.
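The first two of those controls can be sketched together: a gate on cross-tenant writes that shows the requester the agent's interpretation before anything moves. A minimal sketch in Python, assuming documents are represented as `(document_id, marking)` pairs and that confirmation arrives through a caller-supplied callback; the tenant names, the `gated_copy` function, and the preview format are all hypothetical:

```python
from collections import Counter
from typing import Callable, Iterable, Tuple

Doc = Tuple[str, str]  # (document_id, marking); a simplified stand-in for real metadata

def build_preview(docs: Iterable[Doc], source_tenant: str, dest_tenant: str) -> str:
    """Summarize a planned copy so the requester sees the agent's interpretation first."""
    docs = list(docs)
    by_marking = Counter(marking for _doc_id, marking in docs)
    lines = [f"Plan: copy {len(docs)} document(s) from {source_tenant} to {dest_tenant}"]
    lines += [f"  {marking}: {count}" for marking, count in sorted(by_marking.items())]
    return "\n".join(lines)

def gated_copy(docs, source_tenant, dest_tenant,
               confirm: Callable[[str], bool], copy_fn) -> bool:
    """Run copy_fn only if the write stays in-tenant or a human approves the preview."""
    docs = list(docs)
    if source_tenant != dest_tenant:
        if not confirm(build_preview(docs, source_tenant, dest_tenant)):
            return False  # declined: nothing is written
    copy_fn(docs, dest_tenant)
    return True

copied = []
docs = [("doc-1", "CUI"), ("doc-2", "PROGRAM_PROTECTED"), ("doc-3", "CUI")]
ok = gated_copy(
    docs, "helion-enclave", "vendor-collab",
    confirm=lambda preview: False,  # simulate a human declining the plan
    copy_fn=lambda d, dest: copied.extend(d),
)
print(ok, len(copied))  # False 0
```

The gate deliberately ignores whether the caller holds a valid delegated token. That is the design choice the paragraph describes: the confirmation requirement applies regardless of user-delegated permissions.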
The harder recommendation is the one about Marisol's own posture. The May meeting minutes are exculpatory for her personally and damning for the company. The temptation, for a CISO inheriting a deployment she did not commission, is to use the minutes as personal cover. The cost of using them that way is that the company's response posture becomes adversarial to the CISO's own employer at the moment the company needs the CISO most. The minutes will be in the record either way. They do not need to be the headline. The CISO's job, in the ninety-six-hour window, is to lead the response in a way that makes the May meeting a contextual fact rather than a thematic one. That is hard. It is the job.
The bill on this one will come in three forms. The regulatory bill, which will be defined by the DFARS filing and the program office's subsequent posture and which is largely a function of the candor of the company's initial submission. The financial bill, which will be defined by the insurance arbitration and the eventual settlement with the affected vendor and which is largely a function of how cleanly the agent's role can be characterized. And the reputational bill, which will be defined by what the rest of the defense industrial base believes happened, and which is the only one the company gets to influence on a horizon longer than the ninety-six-hour window. Spend the time on the third one. The first two are mostly written.
Suspend the agent. File the report. Brief the program office. Write the doctrine. Move on.
Anna R. Dudley writes on national security, AI policy, and the institutional structures absorbing the costs of AI deployment faster than they are being redesigned. Red Team Scenarios is the series for the call you don't want to take. Subscribe at annardudley.substack.com.