Storm-2372 has been burning through M365 tenants since August 2024. The technique is well understood. Microsoft published the advisory. Splunk shipped a detection rule. Sigma has a signature. We wrote our own. The community responded fast.

So why are organizations with all of this deployed still getting compromised?

Because device code detection coverage breaks at a layer that the content authors don’t control. The detection content works exactly as designed. The rules are fine. Most ingestion pipelines do not deliver the schema that the rules expect. The detection exists. The data exists, but they can’t see each other.

This is a structural gap in how Entra ID sign-in logs reach Splunk, and it affects every team that relies on out-of-the-box detection content without verifying the underlying plumbing.

Who is using this technique and why it matters

Device code phishing is not one campaign. It is a technique that has crossed the state-to-criminal threshold and has been adopted by actors with different motives, targets, and levels of sophistication.

State-aligned espionage. The first in-the-wild device code phishing campaign was attributed to Storm-2372, a Russia-aligned cluster that Microsoft has been tracking since August 2024. Targeted, high-touch operations. Operators build rapport with victims via messaging platforms, posing as prominent individuals before sending a fake meeting invitation that includes a device code. Post-compromise, they use the compromised mailbox to send internal emails containing fresh device codes, enabling lateral movement from the initial access point. Their tradecraft evolved in February 2025, when they shifted to the Microsoft Authentication Broker client ID, chaining device code phishing into device registration and Primary Refresh Token (PRT) acquisition to enable persistent access across all Entra ID-connected applications. Several other Russia-aligned clusters have since adopted the same technique.

Phishing-as-a-Service. The clearest signal that device code phishing has gone mainstream is EvilTokens, the first PhaaS kit purpose-built for this technique. Documented by Sekoia in March 2026, EvilTokens provides turnkey infrastructure for device code generation, victim redirection to Microsoft’s legitimate login page, and token capture. EvilTokens-powered campaigns have hit over 340 organizations across five countries since mid-February 2026.

AI-enabled automation. By April 2026, Microsoft documented a campaign that moved beyond static scripts to fully automated infrastructure: thousands of unique, short-lived polling nodes spun up on Railway[.]com, and generative AI producing hyper-personalized lures aligned to each victim’s role. Each lure tailored, each polling node ephemeral.

State actors proved the technique works. PhaaS commoditized it. AI scaled it. The constant across all of them is the same field in the same log: authenticationProtocol=deviceCode. If your pipeline does not deliver that field to your detection rule, you are blind to the entire spectrum.

Why this works now and didn’t five years ago

Strip away the campaign details, and a more fundamental shift comes into view. The old game was credentials. Attackers stole passwords. Then we added MFA. Then attackers built adversary-in-the-middle tooling to intercept MFA codes. Then we pushed harder on MFA. Twenty years of move and counter-move on the same battlefield.

Device code phishing is on a different battlefield. Nothing is stolen. The user is asked, through a real Microsoft page, to grant access. The target is enticed and complies by saying yes. They complete their MFA because that is what we trained them to do. The attacker walks away with a real, valid, legitimately issued token. The locks were never picked; the user effectively invited them in.

Device code phishing isn’t new. Researchers demonstrated the abuse path in 2019, the same year the RFC was published. Secureworks documented it in the wild by 2021. Targeted operators, such as Storm-2372, were running it by early 2025. The technique worked, but the ecosystem around it didn’t, not at scale. The reason it exploded in 2026 is that the cloud identity providers spent almost a decade building consent infrastructure. Device code flows that allow users to sign in on a TV. OAuth apps that let calendars talk to CRMs. Pre-approved tenant-wide scopes so users do not have to click “allow” for every interaction. All of that was built for legitimate reasons. However, all of it created an attack surface where access can be granted without anything being broken. What EvilTokens added was the last missing piece, just-in-time code generation, which eliminated the 15-minute expiry window that had kept earlier campaigns small.

That is the watershed. The rules of the auth system have shifted. A new class of attack became viable because of how the system was built, not because of AI.

AI helps by creating lures using the victim’s voice and triaging stolen mailboxes in seconds. But it is just a tactical accelerant atop a strategic shift. The same campaign would have worked in 2024 without AI. It would not have worked in 2018 because the consent surface wasn’t in place yet.

This matters for where defending organizations allocate their spending. If you believe the problem is AI, the answer is fancier email filtering and more user training. If you believe the problem is consent abuse, the answer is a conditional access policy. Restrict who can use device code flow. Require phishing-resistant MFA methods that bind to the session and cannot be used to authorize a different one. Enforce device compliance and trusted network conditions on token issuance. Then hunt for what tokens do after issuance. The first set of investments does not address the actual mechanism. The second one does.

The detection content for this technique assumes you have already chosen the second path. The rest of this post is about whether your ingestion pipeline can actually feed it.

The field that makes or breaks detection

Device code authentication detection hinges on a small set of fields within a single log: the Entra ID sign-in log. When someone authenticates via the device code OAuth flow, the relevant fields populate. Every published detection for this technique converges on the same telemetry. Splunk’s ESCU rule filters on properties.authenticationProtocol="deviceCode". Community Sigma signatures use related fields, such as properties.message="Device Code". Sentinel and Elastic queries use their own variants. The implementation varies. The dependency does not. All of it requires the Entra sign-in schema to reach the SIEM intact, under a label that the rule can find.

The Unified Audit Log cannot detect this activity, so save yourself the trouble. The UAL’s RecordType 15 (STS Logon) captures authentication events, but it does not include the authenticationProtocol field. This is not a gap you can work around with clever SPL. The field structurally does not exist in that data source. The detection requires the Entra ID sign-in logs specifically.

Our threat research team confirmed that Event Hub and Azure AD sign-in logs contain all the information needed to detect this activity. So we asked the next question. Across our client fleet, how many environments actually deliver that data in a form our detections can reach?

The answer is what prompted this post.

Three things your security team cannot verify without your help

1. Block device code flow where it is not needed. In Entra ID, open Conditional Access and create a policy targeting All Users, with the Authentication Flows condition set to block the device code flow. Exclude only service accounts or user populations with a documented business need (conference room devices, CLI tooling for engineering teams). If nobody in your organization has a legitimate reason to authenticate this way, block it for everyone.

2. Confirm your sign-in logs are actually being exported. In the Azure portal, navigate to Entra ID > Diagnostic Settings. Verify that both SignInLogs and NonInteractiveUserSignInLogs are enabled and forwarding to wherever your security team consumes them (Log Analytics, Event Hub, or storage account). The NonInteractive table is where device code authentications land after the initial consent. If that table is not being exported, your security provider is blind to the post-auth token activity, regardless of what detections they have deployed.

3. Ask your security provider one question. “Can you show me a device code authentication event in our data from the last 30 days?” Not whether the detection rule exists. Whether the data it depends on is present, parsed, and producing results. If they cannot answer that with evidence, the detection is not operational.

Two layers, both have to be right

Detection fidelity breaks at two distinct layers. Understanding which layer is broken determines whether the fix takes five minutes or becomes an infrastructure project.

Layer 1. Does the field exist in the raw event?

The field’s presence depends on how Microsoft emits the data. Azure Event Hub, fed by Entra Diagnostic Settings, preserves the full sign-in log schema, including authenticationProtocol. It sits nested in properties.authenticationProtocol inside a records[] JSON envelope. The Microsoft Graph API beta endpoint also preserves it, but flat at the top level, with no “properties.” prefix. The Graph v1.0 endpoint does not reliably include the field. Microsoft’s v1.0 contract for /auditLogs/signIns excludes it, and even when tenants see it returned from the live API, the contract guarantees nothing.

If the source does not emit the field, no detection tuning will find it.
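As a rough illustration of the two schema shapes described above, the sketch below pulls authenticationProtocol out of either an Event Hub envelope or a flat Graph beta object. The sample events are hypothetical and trimmed to the fields of interest; real events carry many more fields, and the function name is ours, not part of any add-on.

```python
import json
from typing import Any, Optional


def extract_auth_protocol(event: dict[str, Any]) -> list[Optional[str]]:
    """Return authenticationProtocol for each sign-in record in an event.

    Handles the two shapes described in the text:
      - Event Hub: {"records": [{"properties": {"authenticationProtocol": ...}}]}
      - Graph beta: {"authenticationProtocol": ...} (flat, no "properties" prefix)
    None means the field is absent (for example, a Graph v1.0 source).
    """
    # Event Hub envelopes wrap one or more records; Graph returns one object.
    records = event.get("records", [event])
    values = []
    for record in records:
        nested = record.get("properties", {}).get("authenticationProtocol")
        flat = record.get("authenticationProtocol")
        values.append(nested if nested is not None else flat)
    return values


# Hypothetical sample events (assumptions for illustration only).
event_hub = json.loads(
    '{"records": [{"category": "SignInLogs",'
    ' "properties": {"authenticationProtocol": "deviceCode"}}]}'
)
graph_beta = {"authenticationProtocol": "deviceCode"}
graph_v1 = {"userPrincipalName": "user@example.com"}

print(extract_auth_protocol(event_hub))   # ['deviceCode']
print(extract_auth_protocol(graph_beta))  # ['deviceCode']
print(extract_auth_protocol(graph_v1))    # [None]
```

A None result for every sampled event is the Layer 1 failure: the source never emitted the field, so nothing downstream can recover it.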

Layer 2. Can detection rules reach the event?

Even when the field is present in the raw JSON, the detection rule has to know where to look. Splunk detection rules target a specific sourcetype. If the data lands in the index under a sourcetype name that the rule does not match, the event is invisible. Indexed, searchable, fully intact. The detection passes over it, and that is a miss you do not want to have to explain.

Source determines field presence. Sourcetype determines detection reach. Both layers have to align for a detection to fire.

Six pipes, five gap categories

Six common sourcetypes carry Entra ID sign-in data into Splunk. Only two of them reliably deliver the field on which every published detection depends.

| Sourcetype | Ingestion Method | Add-on / Path | authenticationProtocol Present | Notes |
|---|---|---|---|---|
| azure:monitor:aad | Event Hub | MSCS (app 3110) with sourcetype override | Yes | Standard path. CIM-mapped, props.conf parsing intact. |
| azure:aad:signin | Graph API pull | Splunk Add-on for Microsoft Azure (app 3757) | Yes | Standard path. Inputs migrated to MSCS; legacy installs persist. |
| mscs:azure:eventhub | Event Hub | MSCS (app 3110) without sourcetype override | Yes, in raw. No field extraction. | False coverage. Field exists in the event, but no props.conf stanza parses it. Detections targeting azure:monitor:aad never see this data. |
| ms:aad:signin | Graph API pull | Older Azure AD TA (pre-migration) | Yes | Legacy. Still present in environments that were onboarded before the MSCS migration. |
| ms:o365:management | O365 Management Activity API | MSCS (backward compatibility) | No | Sign-in events routed through the O365 management API lack the Entra sign-in schema entirely. |
| o365:management:activity | O365 Management Activity API | Splunk Add-on for Microsoft Office 365 | No | Newer equivalent of ms:o365:management. Same schema limitation. |

Our fleet scan classified client environments into five gap categories. The frequencies below reflect what we observed in our audit, not what may be true across the industry.

False coverage. The most common, and the most dangerous. The client configured an Azure Event Hub, installed the Splunk Add-on for Microsoft Cloud Services (app 3110), and pointed it at the hub. Data flows. Dashboards populate. The schema is complete, including authenticationProtocol. But the sourcetype was left at the add-on’s default value instead of being overridden to azure:monitor:aad.

That override is not cosmetic. The azure:monitor:aad sourcetype ships with purpose-built props.conf entries for timestamp extraction, CIM field-alias mappings from Entra fields to the Authentication datamodel, and eval statements for Enterprise Security compliance. The default sourcetype has none of this. Every detection rule, every correlation search, every datamodel acceleration targeting azure:monitor:aad is blind to data under the default label.

We learned to check this first on every onboarding. The environment looks healthy. Coverage is an illusion. The fix is one line in inputs.conf.
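A quick way to spot this condition is to scan inputs.conf for Event Hub inputs that lack the override. The sketch below is a minimal example under stated assumptions: the stanza prefix mscs_azure_event_hub://, the event_hub_name key, and the sample input names are illustrative and should be confirmed against the add-on version you run. It also assumes the file parses as simple key = value stanzas.

```python
import configparser

# Assumed values for illustration; confirm against your add-on version.
EVENT_HUB_STANZA_PREFIX = "mscs_azure_event_hub://"
EXPECTED_SOURCETYPE = "azure:monitor:aad"


def find_unoverridden_inputs(inputs_conf_text: str) -> list[str]:
    """Return Event Hub input stanzas whose sourcetype is not the expected one."""
    parser = configparser.ConfigParser(
        delimiters=("=",), interpolation=None, strict=False
    )
    parser.read_string(inputs_conf_text)
    flagged = []
    for stanza in parser.sections():
        if not stanza.startswith(EVENT_HUB_STANZA_PREFIX):
            continue
        # A missing sourcetype key means the add-on default is in effect.
        sourcetype = parser.get(stanza, "sourcetype", fallback=None)
        if sourcetype != EXPECTED_SOURCETYPE:
            flagged.append(stanza)
    return flagged


# Hypothetical inputs.conf content: one input left at default, one overridden.
SAMPLE = """
[mscs_azure_event_hub://entra_signin_default]
event_hub_name = entra-signin
index = azure

[mscs_azure_event_hub://entra_signin_fixed]
event_hub_name = entra-signin
index = azure
sourcetype = azure:monitor:aad
"""

print(find_unoverridden_inputs(SAMPLE))
```

Any stanza this returns is a candidate for the one-line fix: adding the sourcetype override to that input.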

Structural field absence. Splunk Add-on for Microsoft Office 365 (TA-4055). Splunk-maintained. Actively updated. Branded for Office 365. A reasonable choice. For sign-in logs, it is a trap. The add-on calls Graph v1.0. Microsoft’s v1.0 contract does not guarantee authenticationProtocol. No configuration change fixes this. Event Hub standup is required.

Deprecated path. Microsoft Azure Add-on for Splunk (TA-3757) calls Graph beta and preserves the field. Splunk has deprecated this add-on in favor of MSCS. The field also resides at a different JSON path, so detection rules written for the Event Hub schema do not match it. We learned this the expensive way. A documented upgrade issue in versions 3.0.1 to 3.1.1 silently dropped authenticationDetails, userAgent, userType, and other fields until admins manually flipped the endpoint setting from v1.0 to beta. Tenants running affected versions had blind spots they did not know existed.

Legacy debt. A custom Azure Function built from Splunk’s open-source azure-functions-splunk repository. Circa 2017-2018. The data is intact. The sourcetype is non-standard. No one has revisited this infrastructure in years.

Macro misalignment. This one caught us off guard. Splunk’s own ESCU content uses two different macros for Entra sign-in data. azure_monitor_aad for device code detection. azuread for password spray detections. These resolve to different sourcetypes. A client with correctly configured Event Hub ingestion under azure:monitor:aad can still have partial ESCU coverage if the azuread macro does not resolve to their data. Most teams deploy ESCU content without inspecting the underlying macro definitions. We did not either, until a hunt revealed a password spray gap in an environment where device code detection was working perfectly.
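The macro check can be approximated offline once you have copied the macro definitions and the list of sourcetypes present in your index. The sketch below is illustrative only: the macro definition strings are placeholders, not the definitions ESCU ships, and the parser handles only plain sourcetype=value terms (no wildcards or IN lists).

```python
import re


def sourcetypes_in_macro(definition: str) -> set[str]:
    """Extract literal sourcetype values from a macro definition string."""
    return set(re.findall(r'sourcetype\s*=\s*"?([\w:\-]+)"?', definition))


def unresolved_macros(macros: dict[str, str], present: set[str]) -> list[str]:
    """Return macro names whose sourcetypes match nothing in the environment."""
    return sorted(
        name
        for name, definition in macros.items()
        if not sourcetypes_in_macro(definition) & present
    )


# Placeholder definitions (assumptions); paste in your own macro definitions.
macros = {
    "azure_monitor_aad": "sourcetype=azure:monitor:aad",
    "azuread": 'sourcetype="azure:aad:signin"',
}
# Sourcetypes actually observed in the environment (hypothetical).
present = {"azure:monitor:aad"}

print(unresolved_macros(macros, present))  # ['azuread']
```

In this hypothetical environment, detections built on the first macro would see data while those built on the second would not, which is the partial-coverage condition described above.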

The investigation trail you are also missing

Detecting the deviceCode protocol value is the first layer. Without it, you cannot see this attack class at all. With it, you can see device code authentications, but you cannot yet distinguish them from those of someone logging into a conference room display. Device code phishing generates two distinct sign-in events that defenders need to pair.

The victim’s browser-based code redemption is an interactive sign-in. The attacker’s polling token issuance is a non-interactive sign-in that lands in NonInteractiveUserSignInLogs and is what most atomic detections miss. The two events share a sessionId and an originalRequestId but originate from completely different network locations. The victim’s event will show their corporate IP address and browser user agent. The attacker’s event will show a cloud hosting provider IP, often Railway or Cloudflare Workers, and a user agent string that matches nothing else the victim has produced. Combining these two events into a single session and comparing IP and user agent across the pair provides the highest-fidelity behavioral signal for this attack class.

Filtering on the hosting provider the attacker uses today is a detection that expires the moment they switch providers. The signal that does not expire is the session itself. The victim enters the code from their corporate network or their home IP address. The attacker’s polling server is on Railway, Datacamp, Interserver, or wherever they rented infrastructure to support their campaign, and it receives the token there. Both events share the same sessionId, but the IP address and user agent differ. The attacker can change which hosting provider they use, but they cannot eliminate the fact that their location differs from the victim’s. The protocol requires it.
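The session pairing described above can be sketched in a few lines. This is a minimal illustration, not a production hunt: it assumes events have already been flattened into dictionaries with sessionId, category, ipAddress, userAgent, and authenticationProtocol keys, and the sample events use documentation IP ranges. Real pipelines will need their own field mapping.

```python
from collections import defaultdict
from typing import Any

Event = dict[str, Any]


def find_mismatched_sessions(events: list[Event]) -> list[dict[str, Any]]:
    """Pair interactive and non-interactive sign-ins that share a sessionId
    and report pairs whose source IP differs."""
    sessions: dict[str, list[Event]] = defaultdict(list)
    for event in events:
        session_id = event.get("sessionId")
        if session_id:
            sessions[session_id].append(event)

    findings = []
    for session_id, group in sessions.items():
        # Only consider sessions that include a device code authentication.
        if not any(e.get("authenticationProtocol") == "deviceCode" for e in group):
            continue
        interactive = [e for e in group if e.get("category") == "SignInLogs"]
        polling = [
            e for e in group if e.get("category") == "NonInteractiveUserSignInLogs"
        ]
        for victim in interactive:
            for other in polling:
                if victim.get("ipAddress") != other.get("ipAddress"):
                    findings.append({
                        "sessionId": session_id,
                        "interactive_ip": victim.get("ipAddress"),
                        "noninteractive_ip": other.get("ipAddress"),
                        "user_agent_differs":
                            victim.get("userAgent") != other.get("userAgent"),
                    })
    return findings


# Hypothetical events: session s-1 is split across networks, s-2 is not.
events = [
    {"sessionId": "s-1", "category": "SignInLogs",
     "authenticationProtocol": "deviceCode",
     "ipAddress": "203.0.113.10", "userAgent": "Mozilla/5.0"},
    {"sessionId": "s-1", "category": "NonInteractiveUserSignInLogs",
     "ipAddress": "198.51.100.7", "userAgent": "python-requests/2.31"},
    {"sessionId": "s-2", "category": "SignInLogs",
     "authenticationProtocol": "deviceCode",
     "ipAddress": "203.0.113.20", "userAgent": "Mozilla/5.0"},
    {"sessionId": "s-2", "category": "NonInteractiveUserSignInLogs",
     "ipAddress": "203.0.113.20", "userAgent": "Mozilla/5.0"},
]

for finding in find_mismatched_sessions(events):
    print(finding)
```

The comparison keys on the mismatch within a session rather than on any provider list, so it does not need updating when the attacker changes hosting. Legitimate split sessions exist (a conference room device on a different network segment, for example), so results are hunt leads rather than verdicts.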

NonInteractiveUserSignInLogs is a separate Diagnostic Settings category that many tenants do not export. It generates 5 to 10 times the volume of interactive sign-in logs, and cost-conscious deployments often skip it. Without it, the attacker’s leg of the phish is invisible. The atomic detection still fires on deviceCode, but the session-correlation hunt loses the necessary data.

Checking whether NonInteractive sign-in logs are flowing is the verification step that separates detection deployment from detection assurance.

How to check your own environment

The audit is straightforward. No specialized tooling required.

Step 1. Identify your sourcetype. Run a metadata search in Splunk for any sourcetype containing azure, aad, signin, entra, o365, or microsoft. This tells you what Entra sign-in data you have and what label it carries.

Step 2. Determine your source. For each sourcetype found, check whether the raw events contain the nested properties.authenticationProtocol path (Event Hub origin) or the flat top-level authenticationProtocol path (Graph origin). If neither returns values, you are on a v1.0 Graph path, and the field is not there.

Step 3. Verify detection alignment. Confirm your detection rules target the sourcetype your data actually lives under. If you are using ESCU content, inspect the azure_monitor_aad and azuread macro definitions. If those macros do not resolve to your sourcetype, the detections do not fire.

Step 4. Check NonInteractive coverage. Search for category=NonInteractiveUserSignInLogs in your Entra sign-in data. If it returns nothing, you are missing the attacker’s half of the device code phishing exchange.

Remediation depends on the gap type. A default sourcetype override is a five-minute fix. The absence of the Diagnostic Settings category is a configuration change in Azure. A v1.0 Graph source with no Event Hub is an infrastructure project.
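To make the triage concrete, the four audit steps can be reduced to a small decision function. This is a sketch of the logic in this section, not a tool we ship: the AuditResult structure and its field names are hypothetical, and the inputs are whatever you recorded while running the steps by hand.

```python
from dataclasses import dataclass


@dataclass
class AuditResult:
    sourcetype: str                    # Step 1: label the data carries
    field_in_raw: bool                 # Step 2: authenticationProtocol found
    rule_sourcetypes: frozenset[str]   # Step 3: sourcetypes your rules target
    noninteractive_seen: bool          # Step 4: NonInteractive category present


def triage(result: AuditResult) -> list[str]:
    """Map audit findings to the remediation classes described in the text."""
    gaps = []
    if not result.field_in_raw:
        gaps.append(
            "Field absent at source: stand up Event Hub ingestion "
            "(infrastructure project)."
        )
    elif result.sourcetype not in result.rule_sourcetypes:
        gaps.append(
            "Sourcetype not matched by rules: override the sourcetype "
            "or align the macros (five-minute fix)."
        )
    if not result.noninteractive_seen:
        gaps.append(
            "NonInteractiveUserSignInLogs missing: enable the category "
            "in Entra Diagnostic Settings."
        )
    return gaps


# Hypothetical environment: field present, wrong label, no NonInteractive logs.
example = AuditResult(
    sourcetype="mscs:azure:eventhub",
    field_in_raw=True,
    rule_sourcetypes=frozenset({"azure:monitor:aad"}),
    noninteractive_seen=False,
)

for gap in triage(example):
    print(gap)
```

An empty list means both layers align and the attacker's polling leg is visible; it does not by itself prove the rule fires, which still needs a test event.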

Detection alone does not stop this technique

Everything above addresses whether you can see device code authentication in your environment. Seeing it is not the same as stopping it.

Device code phishing bypasses every form of MFA, including passkeys. Authentication occurs on a legitimate Microsoft login page, in the victim’s own browser, using their own credentials and second factor. The device code authorization occurs post-authentication. There is nothing for phishing-resistant authentication to resist because the user is authenticating legitimately. The abuse happens after.

The preventive control is a Conditional Access policy with the Authentication Flows condition. Available in Entra ID, this condition allows you to block device code flow tenant-wide. Any authentication attempt that uses deviceCode as the protocol is denied at the identity plane before a token is issued. No token issued, no token to steal.

Implementation is straightforward. Create a CA policy that targets all users and all cloud apps, with the Authentication Flows condition set to block the device code flow. Then add an exclusion group for legitimate use cases. Conference room displays, shared kiosks, IoT devices, and CLI tooling that genuinely require this flow. The exclusion group should be small, documented, and reviewed quarterly.
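For teams that manage Conditional Access as code, the policy described above might look roughly like the payload below. This is a sketch based on our understanding of the Microsoft Graph conditionalAccessPolicy resource; the display name and group ID are placeholders, and the property names should be verified against the current Graph documentation before use. The code only builds and prints the JSON body; it does not call any API.

```python
import json
from typing import Any


def build_device_code_block_policy(
    exclude_group_ids: list[str], report_only: bool = True
) -> dict[str, Any]:
    """Build a Conditional Access policy body that blocks device code flow.

    Property names follow our reading of the Graph conditionalAccessPolicy
    schema and are assumptions to verify, not a guaranteed contract.
    """
    return {
        "displayName": "Block device code flow",  # placeholder name
        # Report-only logs the impact without blocking anything.
        "state": "enabledForReportingButNotEnabled" if report_only else "enabled",
        "conditions": {
            "users": {
                "includeUsers": ["All"],
                "excludeGroups": list(exclude_group_ids),
            },
            "applications": {"includeApplications": ["All"]},
            "authenticationFlows": {"transferMethods": "deviceCodeFlow"},
        },
        "grantControls": {"operator": "OR", "builtInControls": ["block"]},
    }


# Placeholder group ID for the documented exclusion group.
policy = build_device_code_block_policy(
    ["00000000-0000-0000-0000-000000000000"], report_only=False
)
print(json.dumps(policy, indent=2))
```

Starting with report_only=True and reviewing sign-in results before enforcing is the cautious route, as long as someone actually flips the policy to enabled afterward.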

Microsoft has been auto-deploying this as a managed Conditional Access policy since February 2025, initially in report-only mode. Most tenants already have it. Report-only does not block anything. Until an administrator reviews the impact and switches the policy to On, the flow remains open. There is already at least one documented case of a tenant with the managed policy in place for nearly a year that still logged a successful device code authentication that the admin could not explain. Policy deployment is not policy enforcement. Detection is how you verify the difference.

This policy is not yet widely deployed. Push Security’s research suggests fewer than 0.3% of in-the-wild device code attempts are blocked at the policy layer. The vast majority reach the detection layer without being blocked. As the gap categories above demonstrate, most of those detections are not firing either.

The defense-in-depth stack for device code phishing is four layers.

Prevent. CA policy blocking device code flow except for an approved allowlist. This is the strongest control and should be implemented first. It stops the technique before there is anything to detect.

Detect. The authenticationProtocol=deviceCode field in Entra sign-in logs, delivered through a pipeline that preserves the field and under a sourcetype your rules can reach. Fix the pipeline, verify the field, confirm the rule fires.

Investigate. Non-interactive sign-in logs for the attacker’s polling leg. Session correlation by sessionId and originalRequestId to pair victim and attacker events. IP and user agent mismatch analysis within the same session. The behavioral layer that catches what the atomic detection misses.

Respond. Token revocation for the compromised account, session kill across all active sessions, and review of any inbox rules, app consents, or permission changes made during the compromised window. One step many runbooks omit: check whether the attacker registered a device. Storm-2372’s February 2025 shift to the Microsoft Authentication Broker client ID enabled them to chain device code into device registration and PRT acquisition. A compromised account with an attacker-registered device cannot be remediated solely through token revocation. The device must be deregistered; the attacker holds a Primary Refresh Token that renews access across all Entra-connected applications.

Each layer assumes the one before it. Without Prevent, every attempt reaches Detect. Without a working Detect pipeline, every attempt reaches Investigate. Without NonInteractive logs, every attempt reaches Respond, and only after damage is done.

Fix the pipeline. Block the flow. Both.

The gap between “the rule exists” and “the rule works”

Device code phishing detection content is mature. Microsoft, Splunk, and the community have published solid rules. None of that matters if the ingestion pipeline does not deliver the field the rule matches on, under a sourcetype the rule knows to search.

The audit takes minutes. The alternative is discovering the gap during an incident, when a compromised account is exfiltrating email using a stolen token, and your device code detection returns no results. Not because the attack did not happen, but because the data took a path your rule never learned to follow.

Deploying a detection is step one. Confirming it fires against the data your pipeline actually delivers is the step most teams skip. That gap is where adversaries operate. We proactively close it across every environment we protect before a campaign can expose it. That is what separates detection deployment from detection assurance.

References

Microsoft Threat Intelligence Center. (2025, February 13). Storm-2372 conducts device code phishing campaign. Microsoft Security Blog. https://www.microsoft.com/en-us/security/blog/2025/02/13/storm-2372-conducts-device-code-phishing-campaign/
Microsoft Threat Intelligence. (2025, May 29). Defending against evolving identity attack techniques. Microsoft Security Blog. https://www.microsoft.com/en-us/security/blog/2025/05/29/defending-against-evolving-identity-attack-techniques/
Microsoft Threat Intelligence. (2026, April 6). Inside an AI-enabled device code phishing campaign. Microsoft Security Blog. https://www.microsoft.com/en-us/security/blog/2026/04/06/ai-enabled-device-code-phishing-campaign-april-2026/
Lakshmanan, R. (2025, December 30). Russia-linked hackers use Microsoft 365 device code phishing for account takeovers. The Hacker News. https://thehackernews.com/2025/12/russia-linked-hackers-use-microsoft-365.html
Lakshmanan, R. (2026, March). Device code phishing hits 340+ Microsoft 365 orgs across five countries via OAuth abuse. The Hacker News. https://thehackernews.com/2026/03/device-code-phishing-hits-340-microsoft.html
Sheridan, K. (2025, December 18). OAuth device code phishing campaigns surge targets Microsoft 365. Infosecurity Magazine. https://www.infosecurity-magazine.com/news/oauth-phishing-campaigns/
Push Security. (2026, April). Analyzing the rise in device code phishing attacks in 2026. https://pushsecurity.com/blog/device-code-phishing
Microsoft. (n.d.). Authentication flows as a condition in Conditional Access policy. Microsoft Learn. https://learn.microsoft.com/en-us/entra/identity/conditional-access/concept-authentication-flows

Certis Foster, Sr. Threat Hunter

Certis Foster has spent over a decade in cybersecurity, starting in United States Air Force Cyber Operations before moving into enterprise infrastructure, MSSP, and MDR roles. Today, as Senior Threat Hunter Lead at Deepwatch, he turns hunt methodologies into enterprise-scale global detections in partnership with technical teams worldwide.

Your Device Code Phishing Detection Probably Doesn’t Work.