AI Security Year in Review: 2025

2025 was the year AI security stopped being primarily a research area and became a practitioner problem. The shift was driven by the deployment scale reaching the threshold where operational attacks became financially worthwhile.

This is the review of what mattered.

1. Attacks went operational

The defining shift of 2025: the attack classes documented in research — prompt injection, jailbreaks, model extraction — transitioned from theoretical demonstrations to documented operational incidents.

By mid-2025, incident databases were tracking multiple cases per month of LLM applications being exploited for: unauthorized data access via prompt injection in enterprise tools, fraud facilitation via jailbreaks in customer service bots, and data exfiltration via model memorization exploits.

The incidents weren’t sophisticated. Most involved well-documented attack classes against applications with no injection detection layer. The attackers were opportunistic. The lesson: documented vulnerabilities without deployed defenses become operational incidents when deployment scale is large enough.

2. The supply chain attack surface became real

By 2025, supply-chain attacks targeting ML infrastructure had clearly moved from hypothetical to a recognized, recurring threat class. The representative patterns (described here as classes, not as specific attributed incidents — verify any specific against its primary disclosure):

Malicious or typosquatted PyPI/npm packages aimed at ML engineers (similar-name and install-time-payload attacks)
Tampered model weights distributed through public model hubs, including look-alike or namespace-confusion repositories
Credential-exfiltration payloads embedded in ML-adjacent JavaScript/Python packages

The ML supply chain is structurally more exposed than traditional software because the dependencies are larger (model weights are GBs, not KBs), provenance verification is less mature, and the install patterns (developers pulling new models frequently) create more attack surface.

The response from the ecosystem was slow but real. Hugging Face improved organization account security and model signature verification. PyPI implemented new policies for high-impact package namespaces.

3. Regulatory enforcement reached AI

2025 was the first year with meaningful regulatory enforcement actions touching AI specifically:

FTC enforcement actions against multiple companies for overstated AI safety claims
ICO (UK) issued guidance with enforcement implications for LLM data processing under GDPR
EU AI Act high-risk requirements came into force for the first categories
NIST published mandatory security frameworks for AI in federal government use

The enforcement actions were mostly civil penalties and consent decrees rather than criminal proceedings, but they established precedent: AI safety claims are subject to truth-in-advertising standards, AI applications processing personal data have GDPR obligations, and AI used in high-risk decisions requires documentation.

4. Model hardening genuinely improved

The honest counterpoint: the major model providers got significantly better at resistance to common attacks.

Persona jailbreaks that worked against GPT-3.5 in 2023 fail reliably against GPT-4 class models in 2026. The DAN jailbreak family is effectively dead against frontier models. Many-shot jailbreaking has been partially mitigated.

The improvement is real but asymmetric: hardening improved most for well-documented attack classes on frontier commercial models. Open-weight models, fine-tuned models, and novel attack classes remain more vulnerable.

5. The detection tooling market matured

2025 produced a recognizable category of AI security tooling. At the end of 2024, the options were mostly research frameworks with production aspirations. By the end of 2025:

Dedicated injection detection tools (Lakera, Rebuff) had production deployments at scale
LLM observability platforms (Arize, LangSmith, Helicone) were tracking millions of requests
Security scanning tools (Garak, PyRIT) had CI/CD integration paths

The tooling isn’t complete — coverage gaps remain significant, particularly for novel attacks and non-English content. But the category exists and is producing differentiated products.

What to watch in 2026

Multi-modal attacks (vision model injection) are early-stage in 2025; expect development in 2026
AI-assisted social engineering continues to scale; phishing quality will improve
Regulatory enforcement will expand beyond FTC civil penalties to include sector-specific enforcement in healthcare and finance
Open-weight model security becomes the next frontier — the commercial model hardening story doesn’t apply to the billions of open-weight model deployments

For ongoing coverage: aisecdigest.com ↗ weekly. For incident tracking: aiincidents.org ↗. To build a 2026 reading feed around the threats above, try the personalized AI-Sec digest builder or browse the topic and resource library.

AI Security Year in Review: 2025

1. Attacks went operational

2. The supply chain attack surface became real

3. Regulatory enforcement reached AI

4. Model hardening genuinely improved

5. The detection tooling market matured

What to watch in 2026

Sources

AI Sec Digest — in your inbox

Related

Understanding the OWASP LLM Top 10: What Matters Most

OWASP Top 10 for LLM Applications: The 2025 List Explained

What Is a Prompt Injection Attack? Definition, Types, and Defenses

Comments