The increasing use of artificial intelligence (AI) by cyber threat actors has fundamentally reshaped the cybersecurity landscape. AI is no longer a theoretical threat; it is a powerful tool advancing attackers’ speed, scale, and stealth, making sophisticated cybercrime accessible for both skilled and unskilled actors. This blog explores how AI is being weaponized, highlighting key examples of its use in real-world attacks.
AI’s Strategic Advantage: Speed, Scale, and Stealth
AI provides threat actors with a significant tactical advantage, transforming previously manual operations into highly automated and rapid attacks.
- Social Engineering: AI has ushered in a new era of highly convincing social engineering. Generative AI, especially Large Language Models (LLMs) like WormGPT and FraudGPT, can create personalized, grammatically perfect phishing emails nearly indistinguishable from legitimate communications. The era of poorly written phishing scams is effectively over. Beyond text, AI-generated deepfakes—synthetic videos and audio—are used to convincingly impersonate individuals for financial fraud and corporate scams. As little as a few seconds of audio can be enough to create a highly accurate voice clone.
- Malware and Evasion: AI is being used to create a new malware uniquely designed to evade traditional defenses. This includes polymorphic malware that continuously mutates its code to avoid detection by signature-based antivirus software. Researchers from HYAS Labs demonstrated the “BlackMamba” proof-of-concept (PoC) malware highlighting this threat, with its AI-generated payload changing its hash with every execution, rendering traditional detection methods ineffective. Earlier this year, in July 2025, Ukraine’s Computer Emergency Response Team (CERT-UA) publicly reported on LAMEHUG. This malware is being documented as the first known to integrate large language model (LLM) capabilities directly into its attack methodology. This malware has been reported to have ties to APT28 (Fancy Bear).
Case Studies: Real-World AI Attacks
These threats have been realized in high-stakes, real-world attacks:
- SalesLoft data breach via the Drift AI chatbot (2025): A recent data theft campaign targeted customer instances of Salesforce using compromised OAuth and refresh tokens stolen from the Drift AI chat agent, a third-party application integrated with Salesloft. The attackers, tracked as UNC6395, exploited a security issue within the Drift application to obtain these tokens. Using the stolen credentials, they were able to exfiltrate large volumes of data from various Salesforce objects, including Cases, Accounts, and Users. This incident highlights a new vulnerability where an AI agent with privileged access can serve as an entry point for a large-scale network compromise.
- Hong Kong Video Deepfake Scam (2024): A financial worker at a multinational firm was tricked into transferring over $25 million USD after participating in a video conference with multiple deepfake impersonations of the company’s CFO and other colleagues. The employee’s initial suspicion of a fraudulent email was erased after seeing and hearing the convincing deepfakes on the video call.
- UK Energy Firm Voice Clone (2019): One of the earliest examples dates back to when attackers used AI-generated audio to mimic the voice of a company’s German CEO. The voice clone, complete with the CEO’s accent, instructed an employee to make an urgent wire transfer of over $40,000 to a fraudulent supplier.
The Future of AI Attacks
Currently security researchers are working diligently to discover novel attacks with proof on concepts, attacking real infrastructure and well known LLMs in order to see where security tooling has gaps. These discoveries are critical to highlight where defense can be improved, as well as new use cases to consider for threat actors to leverage in the future.
- LegalPwn, Misclassification of Malware by GenAI Tools (2025): A novel proof of concept prompt injection technique which manipulates popular AI models by hiding malware within fake legal disclaimers, as AI models are trained to respect legal-sounding text. By exploiting legitimate overlooked textual components like legal disclaimers, terms of service, and privacy policies to manipulate LLMs in order to bypass AI-driven security analysis or being able to execute malicious code.
- AI Powered Ransomware, PromptLock (2025): ESET researchers discovered the first known AI-powered ransomware through malware samples in VirusTotal. Known as PromptLock, it uses OpenAI’s gpt-oss:20b model and is currently thought to be a proof of concept. Found to be cross platform compatible, Lua scripts are generated from hard-coded prompts, to execute malicious code for searching through an infected computer, steal files, and perform encryption.
- Man in the Prompt Attacks (2025):This proof of concept attack leverages the ubiquity and broad privileges of browser extensions to silently compromise both commercial and internal GenAI tools, enabling the extraction of highly sensitive corporate data. The exploit affects top commercial LLMs like ChatGPT, Gemini, Copilot, and Claude, as well as enterprise LLM deployments (custom copilots, RAG-based search assistants) and AI-enabled SaaS applications. The inability of current security solutions to detect these DOM-level interactions creates a critical blind spot, exposing organizations to significant risks of IP theft, regulatory penalties, and sensitive data exposure.
Frameworks for Defense
The MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) framework is a comprehensive, living knowledge base which documents adversary tactics and techniques used against AI-enabled systems. Modeled after the well-known MITRE ATT&CK® framework, ATLAS focuses specifically on vulnerabilities unique to machine learning and AI, such as data poisoning, model evasion, and model theft, and includes real-world case studies.
Standardization of compliance for AI security in regulated spaces (finance, government, healthcare) will likely also have to adhere to NIST’s AI Risk Management Framework.
Organizations can leverage these frameworks, and others like it, to enhance their defenses by using it for threat modeling, which helps identify and prioritize AI-specific risks and also provide a shared language for cybersecurity and data science teams to discuss and formulate security strategies. By using threat modeling frameworks, organizations can conduct adversarial testing to find vulnerabilities before an attacker exploits them and develop new detection and mitigation strategies based on a standardized approach to AI security.
The AI Double-Edged Sword: Fighting Back with AI
The fight against AI-powered threats is a new kind of “AI arms race.” The same technologies used for offense are now being used to create more resilient defenses.
- AI-Powered Defense: Security teams are adopting AI-powered platforms to move beyond outdated signature-based security. These systems use User and Entity Behavior Analytics (UEBA) to establish a baseline of normal user and network behavior. By detecting subtle deviations from this baseline, they can identify and respond to new, evasive threats lacking a pre-existing signature.
- Proactive Defense: AI can also be used to create realistic simulations of cyberattacks to help security teams test their defenses and identify vulnerabilities before they are exploited by a real attacker.
Mitigations for prompt injection attacks (LegalPwn, Man in the Browser):
- Guardrails Against Prompt Injection Attacks: Implement AI-powered DLP guardrails specifically designed to detect and neutralize prompt injection attempts, even when they are embedded within seemingly legitimate text.
- Human-in-the-Loop Review: For applications involving high stakes, maintain a human oversight layer to review LLM outputs, particularly when processing new or external data sources.
- Adversarial Training: Incorporate LegalPwn, Man in the Browser attacks scenarios into the training data of LLMs to enhance their robustness against subtle injections.
- Enhanced Input Validation: Implement more sophisticated input validation mechanisms going beyond simple keyword filtering to analyze the semantic intent of text.
- Strong System Prompts: Providing a system prompt which explicitly alerts the LLM to potential prompt injections and prioritizes security over user intent can significantly improve model resistance to overt manipulation. Such prompts were shown to be highly effective against basic injections, though not foolproof against more obfuscated payloads.
In conclusion, the integration of AI into the malicious toolkit of cyber threat actors has rendered traditional, signature-based defenses increasingly obsolete. To build true cyber resilience, organizations must adopt a multi-layered approach including continuous employee education to recognize new AI-driven social engineering tactics, implementation of multi-channel verification protocols for high-value transactions, and the adoption of AI-aware defensive technologies.
↑
Share