Daily notes on AI, testing, and building software.
CVE-2025-60710 is a high-severity local privilege escalation vulnerability in the Windows Host Process for Windows Tasks (taskhostw.exe), caused by improper symbolic link resolution before file deletion. An authenticated…
OpenAI positions GPT-5.2-Codex as its most capable agentic coding model to date, reporting state-of-the-art scores on SWE-Bench Pro and Terminal-Bench 2.0, benchmarks that directly simulate real-world software engineering and…
As LLM-powered features ship at record speed across every product category, most engineering teams lack any formal test strategy for their AI components — leaving quality, safety, and reliability entirely to chance.…
A new category of open-source, LLM-native test automation libraries — led by tools like Alumnium — is replacing the era of fragile CSS selectors and page object boilerplate with human-readable assertions that AI…
OpenAI's GPT-5.5 — released April 23, 2026 — combined with an accelerating wave of agentic testing frameworks means QA teams are no longer just writing automated tests: they're orchestrating AI agents that reason about,…
OpenAI's GPT-5.5 arrived on April 23, 2026, with dramatically improved code writing, debugging, and autonomous task completion capabilities, plus a 1M token context window, making it the most capable model yet for…
OpenAI's GPT-5.5 dramatically raises the bar for AI-assisted code generation — including test code — but its tendency to follow instructions "too literally" and produce high volumes of low-maintainability output means…
OpenAI's GPT-5.5 just scored 82.7% on Terminal-Bench 2.0 and 58.6% on SWE-Bench Pro — benchmarks that measure real-world software engineering tasks — signalling that AI agents are now capable of owning multi-step test…
OpenAI's GPT-5.5 brings native agentic computer use — the ability to see a screen, click, type, and navigate interfaces — directly into the hands of developers and QA engineers, blurring the line between AI assistant…
CVE-2026-3844 is a critical unauthenticated arbitrary file upload vulnerability (CVSS 9.8) in the Cloudways Breeze Cache plugin for WordPress, affecting all versions up to and including 2.4.4. The flaw allows any…