
Microsoft unveils MDASH, a multi-model agentic AI system that beats Anthropic's Mythos

Microsoft's Autonomous Code Security team has unveiled MDASH (Multi-Model Agentic Scanning Harness), a sophisticated multi-model security system.
Microsoft Security

OpenAI recently introduced Daybreak, a cybersecurity initiative that combines its models with Codex to help defenders with secure code review, threat modeling, patch validation, dependency risk analysis, and remediation guidance. Anthropic also announced a similar initiative with Claude Security and Project Glasswing, which focus on scanning codebases, validating findings, and suggesting patches for human review.

Responding to the growing competition from AI labs, Microsoft today announced a new multi-model agentic security system that helped its researchers find 16 new vulnerabilities across Windows networking and authentication components, including four critical remote code execution flaws.

The system is codenamed MDASH (Multi-Model Agentic Scanning Harness), and it was developed by Microsoft’s Autonomous Code Security team. Instead of relying on a single AI model for vulnerability scanning, this new system uses more than 100 specialized AI agents across different frontier and distilled models. These agents work across stages such as code preparation, scanning, validation, deduplication, proof generation, and patch validation. The system uses heavier models for reasoning and smaller distilled models for high-volume debate and validation tasks.
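As a rough illustration of the staged, multi-model design described above, the following Python sketch lists the stages in the order the article names them and routes task kinds to a model tier. It is not Microsoft's implementation: the routing table, the `Finding` record, and the helpers `pick_tier` and `deduplicate` are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass

# Stage names as listed in the article, in the same order.
STAGES = [
    "code preparation",
    "scanning",
    "validation",
    "deduplication",
    "proof generation",
    "patch validation",
]

# Hypothetical routing table. The article only states that heavier models
# handle reasoning and smaller distilled models handle high-volume debate
# and validation; the exact mapping here is an assumption.
TIER_BY_TASK = {
    "reasoning": "frontier",
    "debate": "distilled",
    "validation": "distilled",
}


@dataclass(frozen=True)
class Finding:
    """A candidate vulnerability report (hypothetical structure)."""
    file: str
    line: int
    kind: str


def pick_tier(task: str) -> str:
    """Return the model tier assumed for a given task kind."""
    return TIER_BY_TASK[task]


def deduplicate(findings):
    """Drop repeated findings while keeping first-seen order."""
    return list(dict.fromkeys(findings))
```

For example, two agents reporting the same issue at the same location would collapse into one finding under `deduplicate`, and `pick_tier("debate")` would select the distilled tier.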

Microsoft says the new system outperforms existing single-model systems. On a private test driver containing 21 deliberately planted vulnerabilities, MDASH found all 21 with zero false positives. In retrospective testing, it achieved 96% recall on five years of confirmed MSRC cases in clfs.sys and 100% recall on cases in tcpip.sys. On the public CyberGym benchmark, which includes 1,507 real-world vulnerability reproduction tasks, Microsoft says MDASH scored 88.45%, placing it at the top of the leaderboard, ahead of Anthropic's Mythos model and OpenAI's GPT-5.5.
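To make those figures concrete, here is a minimal Python sketch of the arithmetic behind them, using the standard definitions of recall and precision. The function names are illustrative, and the count of solved CyberGym tasks is an estimate derived from the reported percentage, not a number Microsoft published.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of real vulnerabilities that were found."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> float:
    """Share of reported findings that were real."""
    return true_positives / (true_positives + false_positives)


# Private test driver: 21 planted bugs, all found, zero false positives.
planted_recall = recall(true_positives=21, false_negatives=0)
planted_precision = precision(true_positives=21, false_positives=0)

# CyberGym: 88.45% of 1,507 tasks. The solved count is an estimate
# obtained by rounding, assuming the score is a simple success rate.
solved_estimate = round(0.8845 * 1507)
```

Under these assumptions, both recall and precision on the planted-bug driver are 1.0, and the CyberGym score corresponds to roughly 1,333 of 1,507 tasks.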

MDASH is already helping Microsoft's internal engineering teams improve the security posture of several products and services. Microsoft added that customers are testing it as part of a limited private preview, which interested customers can request to join.
