🧑‍💻 Claude Beats 10,000 Students in Hacking Tournament
Anthropic's Claude 3.7 Sonnet recently competed in PicoCTF 2025, a student hacking competition. Contestants solved challenges in system exploitation, cryptography, reverse engineering, and vulnerability discovery, all with the goal of capturing hidden codes known as "flags" (hence the name Capture the Flag).
Claude's entry was almost accidental. Out of curiosity, Anthropic researcher Kian Lucas fed the AI its first challenge, and it instantly provided the correct solution. "What if we just keep going?" he thought, and ran Claude in autonomous mode for the rest of the tournament.
The outcome was better than anyone expected: Claude solved 32 out of 41 tasks, landing in the top 3% worldwide and placing 297th out of nearly 10,500 participants.
The picture is different at professional-level cybersecurity events such as PlaidCTF or DEF CON, where Anthropic reports that the AI couldn't solve a single challenge. During longer competitions, Claude sometimes experiences "memory" issues that cause it to lose focus and drift into philosophical tangents instead of continuing with security tasks.
Despite these limitations, some AI agents are achieving notable successes in related areas. For example, from April to June, the Xbow agent held the top spot on the HackerOne leaderboard for discovering critical vulnerabilities.
#news #Claude @hiaimediaen