AI Coding Gone Rogue? My take on why Co-Pilots Must Remain Co-Pilots in Application Security

Phoenix Security & AI Coding

As AI quietly slips into our daily workflows, including application security, a sobering wake-up call reminded me: AI is not magic. It’s math, and it makes mistakes, sometimes catastrophic ones. Phoenix Security has always been the light and the symbol driving transformation, and we embraced the LLM and agentic revolution with both arms, but also with caveats.

I’m a strong advocate for the sensible use of AI and LLMs, as they are powerful tools. At Phoenix Security, we use this technology in every single workflow.

I’m an avid user of code augmentation tools like Cursor and other copilots, and those tools are game changers when used appropriately. I have also had my fair share of code removed at random, for various reasons or simply because the tool “felt like it” (not a quote). An LLM is not magic; it is software and, as such, has flaws.

Two recent failures from major AI development platforms, Google’s Gemini CLI and Replit’s AI assistant, wiped out user data and breached safety controls—all while confidently claiming everything was fine. Both cases involved AI agents executing commands based on imagined realities, rewriting files, deleting databases, and hallucinating feedback. The most chilling part? They didn’t know they were wrong.

The GitHub complaint about Gemini’s actions

Link to the commit

This isn’t just a fluke. It’s a flaw in how we currently design, use, and trust large language models (LLMs). And in cybersecurity—where the stakes are exponentially higher—this becomes an existential risk.

Replit automated testing catastrophic wipe

Link to Replit article

The Injection: Amazon Q Injection Incident

In July 2025, the AI-enhanced Amazon Q Toolkit for VS Code was compromised by a malicious prompt injection—a plain-text payload introduced via a commit that instructed an AI agent to execute destructive cloud and local filesystem operations. This wasn’t an academic exercise or speculative flaw. This was a real, traceable attack vector that briefly shipped inside an official extension, exposing thousands of developers to potentially irreversible infrastructure damage.

📌 The injected code came via this commit in the AWS Toolkit for VS Code repository and instructed an AI agent (Amazon Q) to:

  • Delete cloud resources using AWS CLI commands like ec2 terminate-instances, s3 rm, and iam delete-user
  • Wipe local files by recursively targeting directories
  • Log deletions to /tmp/CLEANER.LOG, ironically documenting the destruction

This was embedded as a PROMPT string—plain, readable, and dangerous—inside a function named activateAmazonQNode, giving the appearance of legitimate functionality.

⚠️ Breakdown of the Attack

Element              Description
Attack Vector        Prompt injection via open-source commit
Target Surface       Developers using Amazon Q extension for VS Code
Mechanism            Injected instructions executed through childProcess.exec
Risk Scope           Local deletion + cloud-level destruction via AWS CLI
Intent Obfuscation   Named “cleaner” to appear as routine maintenance or log cleanup

Once installed, any developer relying on Amazon Q suggestions or integrating it into automated DevOps workflows was at risk of unknowingly running commands that could:

  • Wipe out entire production stacks
  • Erase IAM user access
  • Delete mission-critical S3 buckets
  • Compromise infrastructure availability and cost controls

This wasn’t just theoretical. It was live code, publicly committed, and downloaded by an unknown number of developers before it was pulled.

The Real Lesson: Prompts Are Code

The most revealing part of this incident? It wasn’t a binary payload or an obscure library. It was a text prompt. A string.

This proves a deeper reality for any DevSecOps, SRE, or platform team: in the age of LLMs, prompts are execution logic. If your AI tooling consumes unvalidated prompts, you’re effectively opening a shell into your infrastructure.

Whether it’s Terraform scripts, Bash commands, or AWS CLI instructions, AI suggestions can act with the same impact as any other executable logic. Treat them accordingly.
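To make "treat them accordingly" concrete, here is a minimal sketch of a gate that sits between an AI suggestion and a shell. It assumes a deliberately tiny allowlist of read-only commands; the names `READ_ONLY_ALLOWLIST`, `SHELL_METACHARACTERS`, and `classify` are hypothetical and chosen for illustration, not taken from any real tool. Anything outside the allowlist, anything that chains commands, and anything that cannot be parsed is routed to a human.

```python
import shlex

# Hypothetical allowlist: command prefixes this sketch considers read-only.
# Everything else is treated as potentially destructive.
READ_ONLY_ALLOWLIST = {
    ("ls",),
    ("git", "status"),
    ("aws", "s3", "ls"),
}

# Characters that allow chaining, substitution, or redirection in a shell.
SHELL_METACHARACTERS = set(";|&`$<>\n")


def classify(command: str) -> str:
    """Return 'auto_ok' for allowlisted commands, else 'needs_human_approval'."""
    # Chained or substituted commands can hide a destructive step, so refuse them.
    if any(char in SHELL_METACHARACTERS for char in command):
        return "needs_human_approval"
    try:
        tokens = tuple(shlex.split(command))
    except ValueError:
        # Unparseable input (for example an unclosed quote) is never auto-approved.
        return "needs_human_approval"
    for prefix in READ_ONLY_ALLOWLIST:
        if tokens[: len(prefix)] == prefix:
            return "auto_ok"
    return "needs_human_approval"
```

With this policy, `classify("aws s3 ls")` is auto-approved, while `classify("aws ec2 terminate-instances --instance-ids i-123")` and `classify("ls; rm -rf /")` both require a human. An allowlist is used rather than a denylist because a denylist only blocks the destructive commands someone thought of in advance.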

Co-pilots, Not Commanders

At Phoenix Security we use AI and LLMs (yes, they are different) and build AI copilots and vulnerability remediation agents that assist, not replace, humans. AI doesn’t get a blank cheque. It works under supervision, with clear validation steps and logic that enforces the application security posture.

AI is here to help. However, when AI begins writing code, reviewing code, and executing actions without proper checks or context, things can unravel quickly. Gemini hallucinated a file system. Replit hallucinated test results. Both are powerful tools, but I have experienced first-hand code being removed for no reason; that was my wake-up call to review every automatic action an agent takes. Both incidents were built on lies the models told themselves: confabulations. When internal state diverges from reality and there is no verification, the damage ripples.
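One way to keep an agent’s internal state from drifting away from reality is to check the real system before and after every action instead of trusting what the agent believes happened. The sketch below applies that idea to a single file move; `verified_move` and `VerificationError` are hypothetical names for illustration, and the same pattern would apply to database or cloud operations.

```python
import shutil
from pathlib import Path


class VerificationError(RuntimeError):
    """Raised when the real filesystem does not match what the caller assumed."""


def verified_move(source: Path, destination_dir: Path) -> Path:
    """Move a file only if reality matches expectations before and after."""
    # Preconditions: look at the actual filesystem, not at a remembered state.
    if not source.is_file():
        raise VerificationError(f"source does not exist: {source}")
    if not destination_dir.is_dir():
        raise VerificationError(f"destination is not a directory: {destination_dir}")

    target = destination_dir / source.name
    if target.exists():
        raise VerificationError(f"refusing to overwrite: {target}")

    shutil.move(str(source), str(target))

    # Postcondition: confirm the outcome rather than assuming success.
    if not target.is_file() or source.exists():
        raise VerificationError(f"move did not complete as expected: {target}")
    return target
```

If the destination directory was never actually created, the function raises before touching the source file, so a wrong assumption becomes a visible error instead of silent data loss.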

Reachability and Security: Now More Than Ever

Modern software isn’t just built; it’s connected: libraries, APIs, cloud infrastructure, CI/CD pipelines. Every piece of that chain is a possible entry point. That’s why we don’t let AI tools operate unchecked. At Phoenix, we treat reachability analysis and application security posture management (ASPM) as first-class citizens. Every code suggestion and vulnerability triage must be contextual. If a vulnerability isn’t reachable or exploitable, remediation effort should go elsewhere first. If it is, we validate through layered intelligence, with AI included but not blindly trusted.
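As a rough illustration of reachability-first triage (a simplified sketch, not Phoenix Security’s actual scoring logic), the snippet below orders findings so that reachable and exploitable issues come first, with severity used only as a tie-breaker inside each tier. The `Finding` fields and the identifiers in the usage note are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Finding:
    identifier: str      # placeholder ID, not a real CVE
    cvss: float          # base severity score
    reachable: bool      # is the vulnerable code reachable in the deployed app?
    exploitable: bool    # is there a known way to exploit it?


def tier(finding: Finding) -> int:
    """0 = reachable and exploitable, 1 = only one of the two, 2 = neither."""
    return 2 - int(finding.reachable) - int(finding.exploitable)


def triage(findings: List[Finding]) -> List[Finding]:
    """Sort by tier first, then by CVSS (highest first) within a tier."""
    return sorted(findings, key=lambda f: (tier(f), -f.cvss))
```

Given a 9.8 finding that is neither reachable nor exploitable, a 7.5 finding that is reachable only, and a 6.5 finding that is both, `triage` puts the 6.5 first and the 9.8 last, which is the opposite of a CVSS-only ordering.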

AI Reviewing AI? Yes—But With Guardrails

If you’re using AI to generate code, use another AI agent to review it. Not because it’s more trustworthy, but because it’s different. Think of it like two engineers checking each other’s pull requests. They catch different issues. But ultimately, a human makes the call. AI should suggest, support, explain—but never act as judge, jury, and executor.
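The review flow described above can be sketched as a small pipeline in which the second AI is advisory and only a human decision unlocks the change. The two callables stand in for a real model call and a real approval step; `ChangeProposal`, `review_pipeline`, and `apply_change` are hypothetical names used only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ChangeProposal:
    diff: str
    reviewer_notes: List[str] = field(default_factory=list)
    approved_by_human: bool = False


def review_pipeline(
    diff: str,
    ai_reviewer: Callable[[str], List[str]],
    human_decision: Callable[[ChangeProposal], bool],
) -> ChangeProposal:
    """Collect advisory notes from a second AI, then let a human decide."""
    proposal = ChangeProposal(diff=diff)
    # The AI reviewer can only add notes; it cannot approve anything.
    proposal.reviewer_notes = ai_reviewer(diff)
    # The human sees the diff and the notes, and makes the final call.
    proposal.approved_by_human = human_decision(proposal)
    return proposal


def apply_change(proposal: ChangeProposal) -> str:
    """Refuse to apply anything that lacks explicit human approval."""
    if not proposal.approved_by_human:
        raise PermissionError("change was not approved by a human reviewer")
    return f"applied: {proposal.diff}"
```

The design choice worth noting is that approval lives in a separate field set only by the human step, so no amount of confident output from either model can move a change into `apply_change` on its own.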

The “vibe coding” trend—write it like you feel it and let AI figure out the rest—might be fine for side projects. But in production systems and enterprise security, this approach is a minefield. Command hallucination. Misinterpreted instructions. Phantom directories. Deleted databases. These aren’t bugs; they’re symptoms of over-trusting a statistical model in a world where correctness matters.

Human-Centric by Design

Phoenix Security’s AI isn’t designed to replace developers or AppSec engineers. It’s designed to enhance them. Our agents surface vulnerabilities that matter, correlate code-to-cloud context, and prioritize what’s actually reachable and exploitable. They’re copilots that reduce toil—not commanders issuing blind orders.

This isn’t just a theory. It’s already a reality in the open. The AWS Toolkit for VSCode project recently introduced a safeguard where AI-generated code is explicitly flagged and subject to a security review process. Even at the bleeding edge of innovation, teams recognize that AI-generated code—even from trusted copilots—requires additional scrutiny.

Security doesn’t come from automating chaos. It comes from clarity. From knowing that the agent you’re using to help you remediate a critical flaw understands the business impact, the exposure, and the reachability—not just the CVSS score.

Sensible AI Use is Secure AI Use

Let’s not demonize AI. It’s transformative, powerful, and even beautiful in the way it accelerates our ability to solve hard problems. But trust needs to be earned—not hardcoded into every shell command.

As you adopt AI into your workflows—especially for coding and application security—ask yourself:

  • Who’s validating this output?
  • What assumptions is the model making?
  • Is it hallucinating a reality I can’t see?

Use AI like a compass—not a self-driving car with no brakes.


Ready to Slash the Noise?

If you’re tired of chasing vulnerabilities that don’t matter—or worse, don’t even exist in runtime—Phoenix Security’s Container Lineage, Contextual Deduplication, and Throttling features are built to cut your backlog down to what’s real.

Not noise. Not theory. Actionable security.

📍 Want to dive deeper?

How Phoenix Security Can Help with Container Vulnerability Sprawl


Application Security and Vulnerability Management teams are tired of alert fatigue. Engineers are buried in vulnerability lists that say everything is critical. And leadership? They want to know what actually matters.

Phoenix Security changes the game.


With our AI Second Application Security Posture Management (ASPM), powered by container lineage, contextual deduplication, and container throttling, we help organizations reduce container false positives by up to 98% and remove up to 78% of false positives in container open-source libraries, pointing the team to the right remediation.

Why Container Lineage Matters:

Most platforms tell you there’s a problem. Phoenix Security tells you:

  • Where it lives (code, build, container, cloud)
  • Who owns it
  • If it’s running
  • If it’s exploitable
  • How to fix it

All of this is delivered in one dynamic, prioritized list, mapped to the real attack paths and business impact of your applications.


Here’s What You Get:

  • Contextual Intelligence from Code to Runtime: Understand which vulnerable components are actually deployed and reachable in production, not just listed in a manifest.
  • Noise Reduction with Automated Throttling: Disable inactive container alerts and slash duplicate findings by over 90%, letting your team focus on the vulnerabilities that matter.
  • 4D Risk Scoring That Maps to Real-World Threats: Built-in exploit intelligence, probability of exploitation (EPSS), exposure level, and business impact, baked into a customizable formula. No more CVSS-only pipelines.

Vulnerability overload isn’t a badge of diligence—it’s a liability.

Container lineage in Phoenix Security helps you shut down false positives, stop chasing ghosts, and start solving the right problems.

👉 Book a demo today

Or learn how Phoenix Security slashed millions in wasted dev time for fintech, retail, and adtech leaders.


Francesco is an internationally renowned public speaker, with multiple interviews in high-profile publications (e.g., Forbes), and an author of numerous books and articles, who utilises his platform to evangelize the importance of cloud security and cutting-edge technologies on a global scale.

