Daily notes on AI, testing, and building software.
A new April 2026 arXiv paper demonstrates that pairing LLMs with Retrieval-Augmented Generation (RAG) pipelines significantly reduces hallucination in AI-generated test cases, meaning the test suites LLMs write are more…
Claude Opus 4.8's new Dynamic Workflows feature lets a single orchestrator agent spin up and coordinate up to 1,000 parallel subagents in one session — a capability that transforms how large-scale test execution,…
CVE-2026-20127 is a maximum-severity (CVSS 10.0) authentication bypass in Cisco Catalyst SD-WAN Controller and Manager that allows an unauthenticated remote attacker to gain high-privileged administrative access —…
On May 11, 2026, attackers published 84 malicious versions across 42 @tanstack/ npm packages in a six-minute window by chaining three separate GitHub Actions vulnerabilities: a pull_request_target Pwn Request,…
CVE-2026-0257 is an authentication bypass vulnerability in the GlobalProtect portal and gateway of Palo Alto Networks PAN-OS software that allows unauthenticated remote attackers to forge authentication override cookies…
AI agents are being deployed into production pipelines at scale, but most teams have no systematic way to test their safety and security properties — Microsoft's RAMPART fills that gap with a framework QA engineers…
Microsoft's newly open-sourced RAMPART framework brings red team-style safety and security testing directly into the pytest workflow, meaning QA engineers can now write standard test files that evaluate agentic AI…
CVE-2025-59528 is a maximum-severity (CVSS 10.0) unauthenticated remote code execution vulnerability in FlowiseAI Flowise, a popular open-source AI agent and chatflow builder. By sending a crafted POST request to the…
Claude Opus 4.8 — released May 28, 2026 — ships a new "dynamic workflows" feature that can orchestrate up to 1,000 parallel subagents on a single task, and Anthropic has already demonstrated it completing a 750,000-line…
Claude Opus 4.8's dynamic workflows can spawn hundreds of parallel sub-agents that execute your existing test suite as the quality bar for massive codebase migrations — turning test automation from a gating step into an…