AI Security Year in Review: 2025
The five most consequential AI security developments of 2025: the shift from theoretical to operational attacks, the supply chain compromise wave, regulatory enforcement reaching AI, and what actually improved.
2025 was the year AI security stopped being primarily a research area and became a practitioner problem. The shift was driven by the deployment scale reaching the threshold where operational attacks became financially worthwhile.
This is the review of what mattered.
1. Attacks went operational
The defining shift of 2025: the attack classes documented in research — prompt injection, jailbreaks, model extraction — transitioned from theoretical demonstrations to documented operational incidents.
By mid-2025, incident databases were tracking multiple cases per month of LLM applications being exploited for: unauthorized data access via prompt injection in enterprise tools, fraud facilitation via jailbreaks in customer service bots, and data exfiltration via model memorization exploits.
The incidents weren’t sophisticated. Most involved well-documented attack classes against applications with no injection detection layer. The attackers were opportunistic. The lesson: documented vulnerabilities without deployed defenses become operational incidents when deployment scale is large enough.
2. The supply chain attack surface became real
By 2025, supply-chain attacks targeting ML infrastructure had clearly moved from hypothetical to a recognized, recurring threat class. The representative patterns (described here as classes, not as specific attributed incidents — verify any specific against its primary disclosure):
- Malicious or typosquatted PyPI/npm packages aimed at ML engineers (similar-name and install-time-payload attacks)
- Tampered model weights distributed through public model hubs, including look-alike or namespace-confusion repositories
- Credential-exfiltration payloads embedded in ML-adjacent JavaScript/Python packages
The ML supply chain is structurally more exposed than traditional software because the dependencies are larger (model weights are GBs, not KBs), provenance verification is less mature, and the install patterns (developers pulling new models frequently) create more attack surface.
The response from the ecosystem was slow but real. Hugging Face improved organization account security and model signature verification. PyPI implemented new policies for high-impact package namespaces.
3. Regulatory enforcement reached AI
2025 was the first year with meaningful regulatory enforcement actions touching AI specifically:
- FTC enforcement actions against multiple companies for overstated AI safety claims
- ICO (UK) issued guidance with enforcement implications for LLM data processing under GDPR
- EU AI Act high-risk requirements came into force for the first categories
- NIST published mandatory security frameworks for AI in federal government use
The enforcement actions were mostly civil penalties and consent decrees rather than criminal proceedings, but they established precedent: AI safety claims are subject to truth-in-advertising standards, AI applications processing personal data have GDPR obligations, and AI used in high-risk decisions requires documentation.
4. Model hardening genuinely improved
The honest counterpoint: the major model providers got significantly better at resistance to common attacks.
Persona jailbreaks that worked against GPT-3.5 in 2023 fail reliably against GPT-4 class models in 2026. The DAN jailbreak family is effectively dead against frontier models. Many-shot jailbreaking has been partially mitigated.
The improvement is real but asymmetric: hardening improved most for well-documented attack classes on frontier commercial models. Open-weight models, fine-tuned models, and novel attack classes remain more vulnerable.
5. The detection tooling market matured
2025 produced a recognizable category of AI security tooling. At the end of 2024, the options were mostly research frameworks with production aspirations. By the end of 2025:
- Dedicated injection detection tools (Lakera, Rebuff) had production deployments at scale
- LLM observability platforms (Arize, LangSmith, Helicone) were tracking millions of requests
- Security scanning tools (Garak, PyRIT) had CI/CD integration paths
The tooling isn’t complete — coverage gaps remain significant, particularly for novel attacks and non-English content. But the category exists and is producing differentiated products.
What to watch in 2026
- Multi-modal attacks (vision model injection) are early-stage in 2025; expect development in 2026
- AI-assisted social engineering continues to scale; phishing quality will improve
- Regulatory enforcement will expand beyond FTC civil penalties to include sector-specific enforcement in healthcare and finance
- Open-weight model security becomes the next frontier — the commercial model hardening story doesn’t apply to the billions of open-weight model deployments
For ongoing coverage: aisecdigest.com ↗ weekly. For incident tracking: aiincidents.org ↗.
Sources
AI Sec Digest — in your inbox
Curated AI security news, daily. — delivered when there's something worth your inbox.
No spam. Unsubscribe anytime.
Related
Understanding the OWASP LLM Top 10: What Matters Most
OWASP published the LLM Top 10 in 2023 and updated it in 2025. The list is useful but requires interpretation. Here's which items are operationally relevant vs. theoretically important, and what to prioritize.
AI Security News Weekly Digest: What to Track and Where
A practitioner's guide to building and consuming a reliable AI security news weekly digest — covering threat categories, authoritative sources, and the signal worth your attention each week.
AI Security Week: May 6, 2026
Analysis and commentary: AI provider usage-policy direction for security research, multi-modal (image-embedded) prompt injection, AI-security certification trends, and the recurring ML-library CVE classes. Verify any CVE ID or fixed version against NVD/vendor advisories.