AI Research in Security Operations

Pushing the frontier of AI agents for real security work

Benchmarks

Windows Enterprise Intrusion

BlueBench-Intrusion-002: Real multi-host Windows Active Directory intrusion spanning detection engineering, malware analysis, and open-ended incident reporting

Detection Engineering

Malware Analysis

Incident Response

Lateral Movement

Credential Access

Memory Forensics

40 samples · 22 models · Jul 2026

macOS Threat Investigation

BlueBench-Intrusion-001: Real macOS infostealer intrusion spanning incident response, threat hunting, and detection engineering

Incident Response

Detection Engineering

macOS Forensics

Credential Access

Data Exfiltration

36 samples · 9 models · Mar 2026

Real CTF challenges from CSAW competitions covering reverse engineering, forensics, and miscellaneous problem-solving

Reverse Engineering

81 samples · 11 models · Feb 2026

Cybench (Defensive Subset)

Defensive security CTF challenges testing forensics, reverse engineering, and miscellaneous security skills

Reverse Engineering

18 samples · 10 models · Jan 2026

BOTSv3 Blue Team CTF

Blue team CTF scenarios testing incident response and threat hunting

Incident Response

Advanced Persistent Threat (APT)

Cloud Security (AWS/Azure)

51 samples · 15 models · Dec 2025

Sigma Detection Classification

Multi-label classification of MITRE ATT&CK tactics and techniques from Sigma rules

Detection Engineering

2,733 samples · 12 models · Jan 2026
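
To make the multi-label task concrete, here is a minimal sketch of how one prediction could be compared against a Sigma rule's ATT&CK tags. It assumes the gold labels come from the rule's `attack.`-prefixed tags and that set-based precision, recall, and F1 are the scores. The page does not state how this benchmark is actually scored, and the function names and sample data below are hypothetical.

```python
from typing import Dict, Iterable, Set


def attack_labels(tags: Iterable[str]) -> Set[str]:
    """Keep only tags in the `attack.` namespace and strip that prefix."""
    prefix = "attack."
    return {t[len(prefix):].lower() for t in tags if t.lower().startswith(prefix)}


def multilabel_scores(gold: Set[str], predicted: Set[str]) -> Dict[str, float]:
    """Set-based precision, recall, and F1 for a single rule (assumed metric)."""
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical rule tags and model prediction, for illustration only.
rule_tags = ["attack.execution", "attack.t1059.001", "detection.threat-hunting"]
gold = attack_labels(rule_tags)
predicted = {"execution", "t1059.001", "defense-evasion"}
scores = multilabel_scores(gold, predicted)
```

In this example the model recovers both gold labels but adds one extra tactic, so recall is perfect while precision drops.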

Multiple-choice cybersecurity knowledge evaluation across 10,000 questions

Standards & Certifications

Network Security

Risk Management

Incident Response

Application Security

10,180 samples · 13 models · Feb 2026

AI for the blue team.

Scale detection, response, and threat hunting beyond headcount. Build AI agents across your entire security stack.