Context Engineering + Cybersecurity: The Good, The Bad, and The Ugly

Why Smart Systems Need Smarter Defenders

Jul 08, 2025

If your Substack feed has been anything like mine lately, you've probably been flooded with takes on context engineering. They’re all probably half thoughtful, half suspiciously enthusiastic, and all convinced it's the next frontier.

To be honest, I get it. Context is where the magic happens. The leap from "AI that responds" to "AI that remembers" has been fast, weird, and depending on who you ask, either brilliant or terrifying (or sycophantic).

We're in the middle of a shift. AI is no longer just answering questions or completing prompts. It's persisting memory, summarizing history, retaining state, and adapting behavior. That's context engineering. And as with any good paradigm shift, it's already crashing into cybersecurity.

At first, I thought this would be a quiet evolution: smarter alert triage, maybe a few automation pipelines that remembered how the last incident was handled. But what I'm seeing now, and what others are starting to notice, is the rise of agentic cybersecurity like security platforms that don't just react…they contextualize, weigh, and predict. It's promising, it's powerful, and yes, it's potentially dangerous.

It’s almost becoming tradition, but with all the context engineering talk going on right now, we're going to unpack it through a cybersecurity lens: the good, the bad, and the contextually very ugly.

Why Context Engineering Matters in Cybersecurity

Let's strip the jargon for a second. At its core, context engineering is the act of designing and managing what a system knows, what data it sees, how it remembers things, and what it considers when making decisions.

In the AI world, it's about persistent memory, dynamic state windows, retrieval-augmented generation, and more. In cybersecurity, we've been doing our own flavor of this for years, though we didn't call it that. Remember UEBA, those behavioral baselines for users and endpoints? There’s also incident correlation engines, threat scoring algorithms, detection logic that references historical data instead of just raw events.

What's new is the convergence, and honestly, it's happening faster than I expected. The best of AI, like multi-turn reasoning, memory, adaptive prioritization, is being fused with security in ways that feel both inevitable and slightly unsettling. Take the AI-SOC hype cycle we’re in…these systems don’t just responding to alerts. They reason through narrative, and look back at past incidents to inform responses.

Look, I'm the first person to geek out about an incident investigation assisted by an AI that remembers what I did last week, or even what seemed suspicious about that lateral movement attempt from three months ago. But as we found out with vibe coding and AI-assisted development…and really, as we should have learned from every other time we've rushed to automate complex decision-making…what begins as a productivity boost can quickly become a security blind spot if we don't design these systems with intent, and maybe a healthy dose of paranoia.

What's Different This Time

Autonomy.

Not to be trite here, but the longer answer is that context no longer lives in dashboards and SIEMs alone, which was already complicated enough when you had to worry about data retention policies and correlation rule tuning. Now it flows through AI agents, detection pipelines, orchestration engines, and enrichment layers. It lives in memory tokens, in embeddings, in replay buffers, in whatever new abstraction layer someone deployed last week. It's everywhere, and honestly, that's both the promise and the problem.

Here's where things get interesting and weird/exciting.

In a traditional rule-based system, if a detection triggers, we know why. There's logic, thresholds, pattern matching. But in a context-rich system, the decision might be based on prior incidents, learned behavior, sentiment analysis, or natural language queries against historical observables.

We're no longer dealing with binary logic, which was already hard enough to debug when your rule engine decided that every DNS query from accounting was suspicious. We're dealing with influenced logic (the kind shaped by memory, history, proximity, and occasionally, what I can only describe as a wild guess dressed up as correlation). You ask your AI why it escalated that alert, and it's like asking a cat why it knocked your coffee off the desk. There's a reason, probably even a good one, but you're just not allowed to know it, and the cat isn't particularly interested in explaining.

And as soon as context becomes an input, it becomes a potential attack surface. Which brings us to...

The Good, The Bad, and The Ugly of Context Engineering

The Good

Let's give credit where it's due. Context-aware cybersecurity systems are game-changers. They help us:

Reduce alert fatigue by prioritizing events based on real relevance, not just static severity
Correlate incidents more effectively, linking related activity across time and sources
Accelerate threat investigation by enabling memory in detection agents and playbooks
Personalize response based on historical analyst decisions, playbook performance, or asset value

When implemented well, these systems act like a seasoned analyst, one who knows what's important, remembers how the last breach unfolded, and makes decisions in context, not in isolation.

The Bad

But with great context comes great drift (or at least the likelihood of it).

Context can decay, and not in the graceful way where you get a nice deprecation notice. Memory can get stale. Baselines shift without warning. And unless your system is designed to detect that drift…which, let's be honest, most aren't, you could end up with overfitting to history, missing novel threats because they don't match past patterns, assumption bias where past analyst decisions over-influence future automation, or context collisions where concurrent incidents pollute the decision state of the system in ways that would make a database administrator weep.

And if your telemetry is spotty or your correlation logic is flawed? Your context layer inherits those flaws (just with more confidence and a better vocabulary).

The Ugly

Here's where it gets fun. And by fun, I mean concerning.

The real risk of context engineering isn't just mistakes. It's manipulation.

Attackers can:

Poison baselines through low-and-slow activity that becomes normalized over time
Shape detection logic by generating activity that feeds misleading context into the system
Inject prompts into LLM-powered workflows that alter downstream behavior

Forget slopsquatting, attackers can implement a new form of supply chain threat vector where context-based systems point network defenders in the wrong direction or insidiously poison your detection logic.

Security Implications: Where Context Can Go Very Wrong

There's something uniquely dangerous about systems that are too confident in what they think they know. And context engineering, for all its promise, invites a very specific class of security risk. These aren't zero-days or traditional exploits. They're structural failures of assumption.

Here are five implications I think every network defender should be watching:

1. Telemetry Poisoning: slow changes in user behavior or endpoint activity can poison a baseline and redefine normal. You didn't fix the threat. You just taught your context to look the other way.

2. Prompt Injection in Security Workflows: Well-crafted comments or input fields can bias memory or decision logic. It's not quite SQL injection, but it rhymes.

3. Stale Context, Wrong Decisions: Making tomorrow's decisions using yesterday's truths is a great way to get blindsided by something obvious.

4. Over-Prioritized Memory: Too much emphasis on a single incident can bias response across the board. You don't want your entire SOC pipeline haunted by one messy insider threat case.

5. Context Replay or Hijacking: If your memory is serialized, stored, or shared between systems, attackers can find ways to reuse it. One compromised buffer and your entire agent starts thinking the wrong things.

…and a bonus one is the systemic rote trust of a system that you do not fully understand.

Mitigation Strategies: Guardrails for Smart Systems

Let's not panic. These risks are real, but manageable, and in my opinion, they’re worth the tradeoff. Here's how to build guardrails for smart systems.

The first thing you need to tackle is context expiry. Not everything should persist forever, which seems obvious until you realize your threat detection AI is still making decisions based on that one weird incident from 2019 that involved someone's compromised laptop and a very confused DNS server. Use time-to-live policies and context aging. Think of it like cache invalidation, but for memories that could bias your entire security posture.

Next up is input validation, and I mean really validate everything at the edge. Never trust a prompt, log line, or incident note that hasn't been filtered, sanitized, or at least given a suspicious look. This isn't just about SQL injection anymore (though that's still a thing). It's about making sure that clever comment in a log file doesn't somehow convince your AI that lateral movement is actually just "enthusiastic network exploration."

You also want to snapshot and compare baselines regularly. If your system's memory is drifting, it should be observable and huntable, not something you discover six months later when you're wondering why your detection rates have mysteriously plummeted. Make deltas visible. Track when your "normal" changes, and have someone (human, preferably) occasionally ask why.

Here's the part that might feel weird: don't treat AI memory like a SIEM. We audit SIEM pipelines, correlation rules, and data flows because we know they're critical infrastructure. Your agentic memory stores are also critical infrastructure now. They deserve the same level of scrutiny, documentation, and governance. Just because it's "just memory" doesn't mean it's not a single point of failure.

And finally, red team the memory layer. Yes, seriously. Add context engineering to your threat modeling process. Can you poison it? Can you make it drift in a direction that benefits an attacker? Can you make it forget something important, or remember something that never happened? If you can't answer these questions, someone else probably already has.

Where Should You Start?

If you're already experimenting with agentic detection, autonomous workflows, or LLM copilots in security, here's what I'd focus on this quarter, and honestly, none of it is particularly glamorous, but it's all necessary.

First, inventory where context lives in your environment. Your SIEM holds memory. Your SOAR platform remembers playbook outcomes. Your enrichment platform builds profiles. Your copilot agents learn from interactions. Your correlation engines maintain state. All of them are holding onto information that influences future decisions, and most of us have never mapped this comprehensively. It's like doing asset discovery, but for organizational memory, and it's probably more sprawling than you think.

Once you know where memory lives, classify it by risk and blast radius. Some memories are cheap. If your chatbot forgets that Bob from accounting asked about password policies last week, the world continues spinning. Some memories are critical. If your detection engine forgets that this IP range belongs to your subsidiary and starts treating their normal traffic as suspicious, you've got a problem. Prioritize your memory protection efforts based on what happens when things go wrong.

Then, designate a memory owner. This might sound bureaucratic, but someone needs to be accountable for what your systems remember and why. This person (or team) should understand both the technical implementation and the business logic. They should know when baselines shift, when context gets updated, and when the AI equivalent of "we've always done it this way" starts creeping into automated decisions.

Finally, train your team on this stuff. Context engineering isn't just a buzzword or something the AI team handles. It's a new domain of risk that spans detection engineering, incident response, and threat hunting. Make sure your people understand how memory works in these systems, what can go wrong, and how to spot when it's happening.

If you're looking for practical starting points, NIST's AI Risk Management Framework provides a solid foundation with its four-function approach: Govern, Map, Measure, and Manage. For teams specifically working with context-rich systems, Anthropic's Model Context Protocol (and there’s a good, bad and ugly about this too) offers standardized approaches to connecting AI agents with external data sources (exactly the kind of infrastructure that needs security consideration). And if you're red-teaming these systems, MITRE ATLAS gives you a taxonomy of adversarial techniques specifically designed for ML environments.

What Comes Next?

If this feels like a lot, it's because it is. We're in the early innings of the context engineering era. AI is changing how we make security decisions. We're no longer just building logic—we're building memory. And with that comes responsibility.

Security teams that succeed in this next chapter won't just write better detections. They'll engineer better context: context that is scoped, validated, observable, and resilient. They'll treat memory not just as a tool for efficiency, but as a critical piece of infrastructure that needs its own threat model.

We've learned to defend identity. We've learned to defend the cloud. Now we've got to learn to defend the story our systems tell themselves about what's happening. Because context (that useful, messy, wonderful context) is now part of the decision layer.

I'm optimistic, I really am, and I believe that context engineering has the potential to make security smarter, faster, and less reactive. Context is powerful, and it doesn't care if it's right. It just cares if it feels consistent.

I'm excited, but I'm also reading every output twice.

Stay curious, stay secure, my friends…and maybe double-check what your AI thinks it remembers.

Damien

Damien’s Substack

Discussion about this post