About This Episode:
Is the Testing Pyramid holding your team back?
AI agents are writing, planning, and even fixing your Playwright tests automatically, but do they actually save time, or just add complexity?
And why are your AI agent tests passing today but mysteriously failing tomorrow?
Find out in this episode of the Test Guild News Show for the week of Oct 12th. So, grab your favorite cup of coffee or tea, and let's do this.
Exclusive Sponsor
Discover ZAPTEST.AI, the AI-powered platform revolutionizing testing and automation. With Plan Studio, streamline test case management by importing test cases directly from your ALM tool and leveraging AI to optimize them into reusable, automation-ready modules. Generate actionable insights instantly with built-in snapshots and reports. Powered by Copilot, ZAPTEST.AI automates script generation, manages object repositories, and eliminates repetitive tasks, enabling teams to focus on strategic goals. Experience risk-free innovation with a 6-month No-Risk Proof of Concept, ensuring measurable ROI before commitment. Simplify, optimize, and automate your testing process with ZAPTEST.AI.
Start your test automation journey today—schedule your demo now! https://testguild.me/ZAPTESTNEWS
Links to News Mentioned in this Episode
| Time | News | Link |
| --- | --- | --- |
| 0:22 | ZAPTEST.AI | https://testguild.me/ZAPTESTNEWS |
| 1:01 | Test Skyscraper | https://testguild.me/ectr3a |
| 2:54 | Playwright Agent mode | https://testguild.me/vmhtm4 |
| 4:12 | Fizz Bee | https://fizzbee.io/testing/ |
| 5:11 | TestWheel Plugin | https://testguild.me/clq6oy |
| 6:02 | Observability 3 Pillars | https://testguild.me/fzq9de |
| 7:01 | Webinar of the Week | https://testguild.me/gs2u5n |
| 7:59 | Testing Agentic Drift | https://testguild.me/uagg5y |
News
[00:00:00] Joe Colantonio Is the testing pyramid holding your team back? AI agents are writing, planning, and even fixing your Playwright tests automatically. But do they actually save time or just add complexity? And why are your AI agent tests passing today, but mysteriously failing tomorrow? Find out in this episode of the Test Guild News Show for the week of October 12th. Grab your favorite cup of coffee or tea and let's do this.
[00:00:22] Joe Colantonio Hey, before we get into the news, I want to thank this week's sponsor, ZAPTEST.AI, an AI-driven platform that can help you supercharge your automation efforts. It's really cool because their intelligent co-pilot generates optimized code snippets, while their Plan Studio can help you effortlessly streamline your test case management. And what's even better is you can experience the power of AI in action with their risk-free six-month proof of concept, featuring a dedicated ZAP expert at no upfront cost. Unlock unparalleled efficiency and ROI in your testing process. Don't wait. Schedule your demo now and see how it can help you improve your test automation efforts using the link down below.
[00:01:01] Joe Colantonio All right. So let's start with something that might make you rethink your entire testing strategy. What is it? Let's check it out. This is by Andy Knight, writing on Automation Panda, who argues that the testing pyramid is an outdated model that no longer reflects modern testing capabilities. Andy, who previously wrote in support of the pyramid, now calls it an antiquated scheme that deceives testers. The testing pyramid originated when UI testing was difficult due to non-standardized browsers, requiring testers to build their own frameworks around Selenium WebDriver. Test execution was slow and flaky, leading teams to label UI tests as bad and favor unit tests for their speed, reliability, and measurable code coverage. Andy, though, identifies three major changes since the pyramid's inception. UI testing tools, including Playwright and Cypress, now provide greater stability through automatic waiting and faster execution, while Selenium continues improving with the BiDi protocol. He also goes over how traditional API testing can be replaced by internal unit tests for domain logic, contract tests for service handshakes, and UI tests for end-to-end validation. And test orchestration now enables continuous test execution for every code change, pull request, and even local end-to-end tests before commits, making fast feedback more valuable than test type quotas. And he proposes the testing skyscraper as a replacement model. Unlike pyramids, which narrow toward the top, skyscrapers have multiple levels of varying sizes, where each floor serves different tenant needs. This allows testers to architect strategies to meet specific requirements, build tests at any level deemed necessary for business needs, skip testing levels as a calculated risk (leaving empty floors until needed), and use modern testing tools to scale strategies upwards and onwards. Another great read by Andy.
You can check it out using the link down below.
[00:02:55] Joe Colantonio All right. Speaking of modern approaches, if you're using Playwright, you need to know about these 3 new AI agents that promise to automate your entire testing workflow. All right, this topic has been covered by a few people. First one is Karthik, who goes over the 3 new key agents, and also Kailash, who blogged about this a few days ago as well. And they all talk about how Playwright has released documentation for three AI agents to automate test creation and maintenance: Planner, Generator, and Healer. And these agents can work independently, sequentially, or in a chained agentic loop to produce test coverage. It goes over all the different agents. For example, the Planner agent explores applications and produces test plans for scenarios and user flows. And the output is a markdown test plan saved to the spec directory that is human readable, but precise enough for test generation. And that's where the Generator agent comes in, which transforms this markdown into executable Playwright tests. And once the Generator agent is done, the Healer agent handles failing tests by replaying the failing steps, inspecting the current UI to locate equivalent elements or flows, and suggesting patches such as locator updates, wait adjustments, or data fixes. And I think these three agents kind of address what Andy was talking about with the testing skyscraper, using modern tools to help you with modern testing. And this can help you as well.
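To make the Healer idea a bit more concrete, here's a minimal Python sketch of the kind of fallback logic such an agent applies: when a locator no longer matches, look for an equivalent element in the current UI and suggest a patch. The page model, locator strings, and `heal_locator` helper are all invented for illustration; the real agent works against a live browser session, not a set of strings.

```python
# Illustrative sketch only: the page model and locators below are made up.
def heal_locator(page_elements, failing_locator, candidates):
    """Return a suggested replacement locator, or None if nothing matches.

    page_elements: locators present in the current UI (here, a simple set)
    failing_locator: the locator that stopped matching
    candidates: equivalent locators to try, in order of preference
    """
    if failing_locator in page_elements:
        return failing_locator  # still valid, nothing to heal
    for candidate in candidates:
        if candidate in page_elements:
            return candidate    # suggest this as the locator patch
    return None                 # no equivalent element found

# Simulated current UI: the old "#place-order" id was renamed in a redesign.
current_page = {"#submit-order", "button[name='checkout']"}
patch = heal_locator(current_page, "#place-order",
                     ["#submit-order", "button[name='checkout']"])
```

Here `patch` comes back as `"#submit-order"`, which is the kind of locator-update suggestion the Healer's report would surface for review.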
[00:04:12] Joe Colantonio All right. Staying with this AI theme, here's a new tool tackling one of the hottest problems in testing: concurrency bugs. I found this on my LinkedIn feed by Jay, who talks about this new FizzBee autonomous testing tool. And this is a model-based testing solution. And one of the reasons why he created this, he talks about how even with AI-generated code, using tools like Copilot to write tests remains insufficient, because testers must still think through scenarios and review generated code. FizzBee autonomous testing instead requires testers to describe the design in a Python-like language and show how their code implements that design. The tool handles the rest, including testing for concurrency issues, which the tool claims no other autonomous testing solution addresses. This tool targets distributed databases and storage systems, APIs and microservices with evolving state, concurrent services where operation order matters, and stateful systems too complex for handwritten test suites.
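The concurrency problem this class of tools targets can be illustrated with a toy model check in plain Python: enumerate every interleaving of two non-atomic counter increments and see which final states are reachable. This is my own illustration of the idea, not FizzBee's actual spec language, which has its own Python-like syntax.

```python
from itertools import permutations

# Toy model check: two "threads" each do a non-atomic increment
# (read the counter, then write read-value + 1).
def run_interleaving(order):
    counter = 0
    local = {}
    for thread, step in order:
        if step == "read":
            local[thread] = counter        # read shared counter
        else:  # "write"
            counter = local[thread] + 1    # write back stale value + 1
    return counter

steps = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
# Keep only interleavings where each thread reads before it writes.
valid = [p for p in permutations(steps)
         if all(p.index((t, "read")) < p.index((t, "write")) for t in "AB")]
reachable = {run_interleaving(p) for p in valid}
# Exhaustive exploration exposes the lost update: a final count of 1
# is reachable, not just the expected 2.
```

Handwritten tests usually exercise one lucky interleaving; exploring all of them is what surfaces the bug, and that exhaustive state-space exploration is the core of the model-based approach.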
[00:05:12] Joe Colantonio All right, I'm not sure how many people use this tool, but if you do, just so you know, TestWheel has just released a new plugin. They released a Jenkins plugin designed to simplify test execution within your CI/CD pipelines. This plugin allows developers and QA teams to orchestrate and execute test suites across multiple environments and device labs directly from Jenkins, reducing typical setup time from hours to minutes. Michael Morello, marketing manager at TestWheel, states that the plugin addresses frustrations with manual configurations, fragile tests, and incompatible tools. The plugin includes four main features, including one-click integration with Jenkins jobs and intelligent test filtering that uses AI to automatically select and run only tests affected by recent code changes. The plugin also handles environment variables, device configuration, and secret credentials between Jenkins and TestWheel.
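As a rough illustration of the change-based selection idea (not TestWheel's actual implementation, and without the AI piece), a filter like this maps each test to the source files it depends on and runs only tests whose dependencies changed. All file names here are made up.

```python
# Hypothetical dependency map: which source files each test exercises.
TEST_DEPENDENCIES = {
    "test_login.py": {"auth/session.py", "auth/password.py"},
    "test_cart.py": {"store/cart.py"},
    "test_search.py": {"store/search.py"},
}

def affected_tests(changed_files):
    """Return the tests whose dependencies overlap the changed files."""
    changed = set(changed_files)
    return sorted(t for t, deps in TEST_DEPENDENCIES.items()
                  if deps & changed)

selected = affected_tests(["auth/session.py"])  # only the login test runs
```

A static map like this is the simplest version; the AI-assisted variant described in the article would presumably infer the mapping instead of requiring it to be maintained by hand.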
[00:06:01] Joe Colantonio All right, so let's shift gears to strategy. This next post is all about observability: the 3 pillars for quality engineers, by Aya, a guide explaining how observability helps quality engineers answer critical questions about system performance and production incidents. This article breaks down observability into 3 components: logs, metrics, and traces. It goes into detail on what logs are and how quality engineers can use them in test automation to document execution details, in performance testing to track resource usage, and in integration testing to monitor API calls between services. They also go over metrics and what you should measure to help with performance benchmarking, SLA monitoring, capacity planning, and setting automated quality thresholds. And finally, they go over traces and how traces can help with end-to-end testing in microservices systems, identifying performance bottlenecks, mapping service dependencies, and pinpointing failure locations. And the article goes on to demonstrate how the three pillars work together using an e-commerce checkout scenario.
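Here's a small sketch of how a test step can touch all three pillars at once using only the Python standard library: a structured log line (logs) that carries a duration (a latency metric) and a trace ID correlating the steps of one flow (traces). The field names and the `traced_step` helper are my own invention, not from the article.

```python
import json
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("checkout-test")

def traced_step(trace_id, name, fn):
    """Run one test step, emitting a structured log line that carries
    a trace ID (traces), a duration (metrics), and a status (logs)."""
    start = time.perf_counter()
    result = fn()
    duration_ms = (time.perf_counter() - start) * 1000
    log.info(json.dumps({
        "trace_id": trace_id,                  # correlates steps in one flow
        "span": name,                          # which step this span covers
        "duration_ms": round(duration_ms, 2),  # latency metric per step
        "status": "ok",                        # execution detail
    }))
    return result

# Simulated e-commerce checkout flow, in the spirit of the article's scenario.
trace_id = uuid.uuid4().hex
cart = traced_step(trace_id, "add_to_cart", lambda: {"items": 1})
order = traced_step(trace_id, "checkout", lambda: {"order_id": "A1"})
```

Because every line shares the same `trace_id`, the two steps can later be stitched back together, which is exactly the correlation job traces do in a real microservices system.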
[00:07:02] Joe Colantonio All right. Next up is our webinar of the week. If you have anything to do with performance, you definitely need to join us on this webinar with Perforce, with our experts from BlazeMeter, Shane Evans and also the one and only Scott Moore. We're going to cover how to shift performance testing left and integrate it into your CI/CD pipelines. This session is going to center on helping teams ensure that applications can handle production-level traffic and demands, which you should always pay attention to, but especially as we go into the holidays. In the webinar, we're going to cover how BlazeMeter can be used to simulate realistic user traffic from multiple global locations, validate performance across different testing protocols, and integrate continuous testing into your CI/CD pipelines. Scott and Shane will demonstrate strategies for shifting performance testing earlier in development cycles, improving application stability under load, and identifying bottlenecks before they affect your users. Highly recommend you register even if you can't make the session, because I'll send the link to the recording after the fact. So don't miss it. Register now using the link down below. Hope to see you there.
[00:08:00] Joe Colantonio Next up is about agentic drift. So here's something that I think should be on every tester's radar. IBM is warning about a phenomenon called agentic drift that can silently degrade your AI agents over time. And this one's crucial if you're testing AI-powered systems. This article talks about the testing challenge with AI agents called agentic drift, which occurs when AI agents change behavior as underlying models update, training data shifts, or business context evolves. An agent that performs correctly today might produce degraded or incorrect responses tomorrow, creating problems for traditional software testing methods built on deterministic logic. So IBM's solution is the IBM agentic testing framework. This framework uses large language models to evaluate agent responses against natural language expectations rather than exact strings.
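The shift from exact-string assertions to expectation-based evaluation can be sketched like this. IBM's framework uses an LLM as the judge; here a simple keyword-overlap check stands in for the model so the example stays self-contained. The expectation, responses, and threshold are all hypothetical.

```python
def keyword_judge(response, expected_concepts, threshold=0.5):
    """Stand-in for an LLM judge: passes if enough of the expected
    concepts appear in the response. A real framework would ask a model
    to grade the response against a natural-language expectation."""
    hits = sum(1 for c in expected_concepts if c.lower() in response.lower())
    return hits / len(expected_concepts) >= threshold

# Expectation expressed as concepts, not an exact string, so rephrased
# but still-correct answers pass while drifted answers fail.
expected = ["refund", "5-7 business days"]
today = "Your refund was issued and should arrive in 5-7 business days."
drifted = "Please contact support for help with your order."
```

With this judge, `today` passes and `drifted` fails even though neither matches any exact expected string, which is the point: the test tolerates wording changes while still catching behavioral drift.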
[00:08:48] All right, for links to everything of value we covered in this news episode, head on over to the links in the first comment down below. So that's it for this episode of the Test Guild News Show. I'm Joe. My mission is to help you succeed in creating end-to-end full stack pipeline automation awesomeness. As always, test everything and keep the good. Cheers.