The Illusion of Testing, Jarvis Appium, OpenAI CUA, and more TGNS161

By Test Guild

About This Episode:

The Illusion of Testing, Jarvis Appium, OpenAI CUA, and more

Is AI-powered testing just an illusion?

What is Jarvis Appium?

How can you use OpenAI's Computer-Using Agent to create a front-end testing agent?

Find out in this episode of the Test Guild News Show for the week of June 22. So, grab your favorite cup of coffee or tea, and let's do this.

Support the show and see AI Testing in action Now: https://testguild.me/ZAPTESTNEWS

Exclusive Sponsor

Discover ZAPTEST.AI, the AI-powered platform revolutionizing testing and automation. With Plan Studio, streamline test case management by directly importing from your common ALM component into Plan Studio and leveraging AI to optimize cases into reusable, automation-ready modules. Generate actionable insights instantly with built-in snapshots and reports. Powered by Copilot, ZAPTEST.AI automates script generation, manages object repositories, and eliminates repetitive tasks, enabling teams to focus on strategic goals. Experience risk-free innovation with a 6-month No-Risk Proof of Concept, ensuring measurable ROI before commitment. Simplify, optimize, and automate your testing process with ZAPTEST.AI.

Start your test automation journey today—schedule your demo now! https://testguild.me/ZAPTESTNEWS

Links to News Mentioned in this Episode

0:15 ZAPTESTAI https://testguild.me/ZAPTESTNEWS
0:54 Testing Trap https://testguild.me/wymhla
2:14 Illusion of AI https://testguild.me/lz162y
4:11 OpenAI CUA Test Demo https://testguild.me/eph7i8
5:08 SerenityJS https://testguild.me/lz162y
6:02 AI opensource framework https://testguild.me/6wm4zs
7:08 Jarvis Appium https://testguild.me/4oi99c
8:25 Typescript Doc QA https://testguild.me/9jxtlf
9:07 Playwright AXE Reporter https://testguild.me/f8pskh

News

[00:00:00] Are you falling into this testing trap? Is AI-powered testing just an illusion? What is Jarvis Appium? Find out in this episode of the Test Guild News Show for the week of June 22nd. Grab your favorite cup of coffee or tea, and let's do this.

[00:00:49] Hey, before we get into the news, I want to thank this week's sponsor, ZAPTEST.AI, an AI-driven platform that can help you supercharge your automation efforts. It's really cool because their intelligent Copilot generates optimized code snippets, while their Plan Studio can help you effortlessly streamline your test case management. And what's even better is you can experience the power of AI in action with their risk-free six-month proof of concept, featuring a dedicated ZAP expert at no upfront cost. Unlock unparalleled efficiency and ROI in your testing process. Don't wait. Schedule your demo now and see how it can help you improve your test automation efforts using the link down below.

[00:00:54] Joe Colantonio All right, first up. Brijesh, a testing expert, examines a common trend in software testing: what happens when compelling marketing narratives, like "automation is fast and modern, manual testing is slow and outdated," shift from influencing choices to driving the overall testing strategy. He breaks down what happens when businesses adopt tools and platforms largely on buzz rather than fit. While automation brought speed to regression testing, it also introduced maintenance burdens, bloated suites, flaky tests, false positives, and unchecked risk areas. The issue lies in making automation a goal, not a tool. He includes a real example from a fintech startup, and he also flags the rise of low-code/no-code tools that promise to democratize testing. He argues these tools further the narrative that anyone can test with minimal skill, while masking risky testing decisions behind simplified interfaces. So are you falling into these traps? I think software testers should scrutinize automation narratives like these and select tools that align with their mission-critical risks, not just those with the flashiest marketing. That's something you need to be even more aware of now in the age of AI. You can read more about this in the links down below.

[00:02:14] Joe Colantonio Speaking of traps, another one is thinking that AI is really thinking when it's doing automation or testing. Here's a deep dive into some important research from Apple. The research team at Apple has published a study critically evaluating the capabilities of frontier large reasoning models. The study introduced a controlled, puzzle-based framework to test reasoning abilities under increasing levels of complexity, offering new insights into how these models process logic beyond just final answers. Rather than relying on math benchmarks, which are often contaminated with training data and difficult to control, the researchers used four puzzle environments, Tower of Hanoi, checker jumping, river crossing, and Blocks World, to analyze reasoning performance and trace accuracy. The framework allowed them to measure not just the outcome, but the quality and evolution of the model's intermediate reasoning steps. What did they find? For simple problems, AI is good; regular AI without a thinking mode often does better on easy problems. For medium problems, thinking AI helps; showing a thought process can be more useful here. On hard problems, though, both fail. Once the puzzles get harder, both types of AI break down; they stop reasoning properly and give up, even though they still had space to write more answers. The study also highlights some strange behaviors: sometimes the AI finds the right answer early but keeps guessing and messing up, and even when the correct steps are handed to the AI, it still fails to follow them. So I think this highlights that if you want to create tests that use AI to make decisions and solve problems, you can't trust that the AI is really reasoning. It might look smart, but it could be faking it, just mimicking patterns. Something you definitely need to be aware of.
And you can find out more about the study and read the whole thing down below.
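To get a concrete sense of why the researchers picked these puzzles: Tower of Hanoi has a provably optimal solution of 2^n - 1 moves, so difficulty can be dialed up precisely by adding disks, and every intermediate move can be checked. Here's a minimal solver in TypeScript to illustrate the scaling; this is just a sketch of the puzzle itself, not code from the study:

```typescript
// Tower of Hanoi: move n disks from peg "A" to peg "C" via peg "B".
// The optimal solution has exactly 2^n - 1 moves, which is why the puzzle
// lets researchers scale difficulty precisely and verify every step.
type Move = { from: string; to: string };

function hanoi(n: number, from = "A", to = "C", via = "B"): Move[] {
  if (n === 0) return [];
  return [
    ...hanoi(n - 1, from, via, to), // clear the top n-1 disks out of the way
    { from, to },                   // move the largest disk directly
    ...hanoi(n - 1, via, to, from), // stack the n-1 disks back on top
  ];
}

console.log(hanoi(3).length);  // 7 moves
console.log(hanoi(10).length); // 1023 moves
```

The exponential growth is the point: a model that genuinely reasons should degrade gracefully as n grows, while one that pattern-matches tends to collapse abruptly, which is what the study observed.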

[00:04:11] Joe Colantonio Now, with all that said, let's check out some AI. OpenAI has just published the OpenAI testing agent demo, a public repo showcasing a front-end testing agent powered by the OpenAI Computer-Using Agent (CUA) model and the Responses API. The demo demonstrates how the agent, orchestrated by a Node server, uses Playwright to launch a browser, navigate a simple e-commerce site, and execute automated test cases defined via a Next.js front end. I think this demo gives testers a live example of how to use AI-driven UI testing, shows how to integrate model-controlled actions with Playwright in a configurable front-end environment, and showcases a potential shift towards agents that follow written test scripts and manipulate interfaces autonomously. But it remains experimental, and as we've seen in the previous story, it's not really reasoning, so you can't fully trust it. If you're going to go down this road, make sure you're always testing the AI tests.
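The core loop of an agent like this can be sketched in plain TypeScript. To be clear, this is a hypothetical simplification, not the demo's actual API: the model proposes actions, a browser driver (Playwright, in the real demo) executes them, and explicit assertions verify the outcome rather than trusting the agent's own judgment:

```typescript
// Hypothetical sketch of an agent-driven test loop (not the demo's real API).
// In the real demo a model emits the actions; here we just execute a list.
type Action =
  | { kind: "goto"; url: string }
  | { kind: "click"; selector: string }
  | { kind: "assertText"; selector: string; expected: string };

// Minimal browser-driver interface; in the demo this role is played by Playwright.
interface Driver {
  goto(url: string): void;
  click(selector: string): void;
  text(selector: string): string;
}

// Execute each action and collect assertion failures instead of throwing,
// so the agent (or a human) can inspect everything that went wrong.
function runTestCase(actions: Action[], driver: Driver): string[] {
  const failures: string[] = [];
  for (const action of actions) {
    if (action.kind === "goto") driver.goto(action.url);
    else if (action.kind === "click") driver.click(action.selector);
    else {
      const actual = driver.text(action.selector);
      if (actual !== action.expected) {
        failures.push(`${action.selector}: expected "${action.expected}", got "${actual}"`);
      }
    }
  }
  return failures;
}
```

The design choice worth noting: the model's proposed actions are treated as untrusted input, and every step is verified independently. That's what "always test the AI tests" means in practice.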

[00:05:08] Joe Colantonio All right. One of my favorite frameworks back in the day, Serenity, has just made a new release. Jan announced on LinkedIn the release of Serenity/JS version 3.32, and this release introduces two key changes aimed at improving Playwright Test integration. One of the key updates is worker-scoped actors in beforeAll and afterAll hooks: you can now use actorCalled within those hooks to initiate actors once per test worker. If you don't know what actors are, that's something you should definitely take a deep dive on as well. This supports setup and teardown tasks, like starting services or validating API statuses, shared across all tests in a worker, along with cleaner and richer reporting and a bunch of other improvements. So if you're using Serenity/JS, it's definitely a worthwhile upgrade. And if you haven't checked out actors, they're something you should research more; it's all covered in the Serenity documentation.

[00:06:03] Joe Colantonio All right, this next one also caught my attention on LinkedIn. I'm terrible with names, but software development engineer in test Venkatesh recently published a LinkedIn post showcasing a custom-built full-stack automation framework that integrates GitHub Copilot, Playwright, an MCP server, and Trace Viewer to streamline end-to-end testing. He uploaded it to GitHub, where you can try it for yourself. The framework automates scripting, test execution, results visualization, and trace logging across both the UI and API layers. GitHub Copilot supports AI-assisted script generation, while Playwright handles automation across web user interfaces and APIs. The MCP server is used for centralized orchestration and execution, and the Trace Viewer enables detailed debugging via visual test traces. So if you're a tester who wants to start exploring AI-accelerated automation, and to see how tools like GitHub Copilot and Playwright can be integrated and orchestrated using an MCP server, you'll definitely want to check out this GitHub resource using the link down below.

[00:07:08] Joe Colantonio All right. I found another post by Sai on his LinkedIn profile; this is probably the fourth time in four weeks I've featured something they've released on LinkedIn. This one is a new tool designed to simplify mobile test automation using natural language, and it's been open sourced on GitHub. The project, Jarvis Appium, was jointly developed by Sudharsan, Srini, and Sai, and it offers an MCP server that connects Appium test automation with AI-powered capabilities. They also linked to a video demo, but if you go to the GitHub repo, you can see how it allows testers to describe mobile test scenarios in plain English, which are then converted into executable Java or TestNG test code. The tool also includes AI-native element detection to optimize locator strategies, and it's compatible with both Android (UiAutomator2) and iOS (XCUITest). A key feature is the integration with Goose, which I've been hearing a whole bunch about; I definitely have to get Angie Jones on the podcast to talk about it. If you don't know, Goose is a local AI agent used to coordinate workflows and support autonomous testing. Jarvis handles test generation and execution while Goose manages orchestration, forming a combined system intended to reduce manual effort in mobile QA processes.

[00:08:25] Joe Colantonio Alright, this next article is a cool approach to documentation. It's by QA specialist Ivan, who details how to eliminate the need for traditional API documentation by implementing self-documenting test payloads within a TypeScript-based quality assurance framework. The article offers practical guidance for improving reliability, consistency, and maintainability in API testing workflows, particularly in highly complex environments like fintech and crypto. It outlines a strategy for using custom TypeScript types to address four persistent issues, and it makes the case for embedding documentation directly into code through strongly typed definitions, effectively removing the need to reference external API docs during test development.

If you're using Playwright and you want to incorporate accessibility testing with a dashboard, I have a resource for you. If you have any tools like this, hit me up on LinkedIn to let me know. This one is a new tool called Axe Playwright Report, which is available on npm for testers using Playwright. This library allows users to embed accessibility testing into Playwright workflows by leveraging Deque's axe-core engine, and it enables detection of accessibility issues in real time during UI testing. Its key features can help you find accessibility issues during Playwright tests, and it also has a really cool dashboard report.
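The self-documenting payload idea from Ivan's article can be sketched in plain TypeScript. Note that the endpoint, field names, and the builder helper below are illustrative examples of the technique, not taken from the article: the doc comments travel with the type, so test authors see the contract in their editor instead of hunting through external API docs.

```typescript
// Sketch of a "self-documenting" test payload: the type itself carries the
// contract. All names here are illustrative, not from the article.

/** Request body for a hypothetical POST /transfers endpoint. */
interface TransferRequest {
  /** ISO 4217 currency code, e.g. "USD". */
  currency: string;
  /** Amount in minor units (cents) to avoid floating-point rounding. */
  amountMinor: number;
  /** Idempotency key: retries with the same key must not double-spend. */
  idempotencyKey: string;
}

// A builder with safe defaults keeps each test focused on just the field
// under test, while the compiler rejects malformed payloads outright.
function buildTransfer(overrides: Partial<TransferRequest> = {}): TransferRequest {
  return {
    currency: "USD",
    amountMinor: 100,
    idempotencyKey: "test-key-1",
    ...overrides,
  };
}

// Each test overrides only what it cares about:
const eurTransfer = buildTransfer({ currency: "EUR" });
console.log(eurTransfer.currency); // "EUR", other fields keep their defaults
```

The payoff in a fintech-style suite is that renaming or retyping a field breaks every affected test at compile time rather than at runtime, and the comments stay next to the code they describe.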

[00:09:40] All right, and for links to everything we covered in this episode, head on over to the links in the comments down below. That's it for this episode of the Test Guild News Show. I'm Joe, and my mission is to help you succeed in creating end-to-end, full-stack pipeline automation awesomeness. As always, test everything and keep the good. Cheers.
