Daily notes on AI, testing, and building software.
CVE-2026-33824 is a Critical (CVSS 9.8) unauthenticated remote code execution vulnerability in Windows Internet Key Exchange (IKE) Service Extensions, disclosed and patched by Microsoft on April 14, 2026 as part of…
CVE-2025-29635 is a critical command injection vulnerability in end-of-life D-Link DIR-823X routers that allows unauthenticated attackers to execute arbitrary OS commands with root privileges via a crafted POST request…
Anthropic's Claude Mythos Preview has autonomously discovered thousands of previously unknown (zero-day) vulnerabilities across every major operating system and web browser — proving that AI has crossed a threshold…
Anthropic's new Claude Managed Agents service handles the full execution harness for running AI agents autonomously — including file access, code execution, and web browsing — making it dramatically easier to build…
Anthropic's Claude Managed Agents, launched in public beta on April 8, 2026, give teams production-grade infrastructure for running Claude as an autonomous agent with secure sandboxing, built-in tools, and full…
A financial company eliminated its entire QA team in early 2026 and replaced it with an AI testing pipeline, then lost $6M when the system hallucinated a discount code that priced everything in its store at $0. The…
Three chained vulnerabilities in SimpleHelp Remote Monitoring and Management (RMM) software — CVE-2024-57726, CVE-2024-57727, and CVE-2024-57728 — are being actively exploited by the Medusa and DragonForce…
CVE-2026-21571 is a critical OS Command Injection vulnerability (CVSS 9.4) in Atlassian Bamboo Data Center and Server that allows an authenticated attacker with low-level privileges to execute arbitrary operating system…
A new paradigm called Test-Oriented Programming (TOP), published on arXiv this month, inverts the traditional dev/QA relationship: AI writes the code, but humans own the tests. For QA professionals, this is less a…
As AI-powered features ship into production apps, QA teams urgently need better tools for testing LLM behavior. STELLAR, a new framework published on arXiv, exposes up to 4.3x more LLM failures than baseline approaches by…