About This Episode:
Are your tests falling behind your requirements?
Check out our new AI Requirement course here: https://testguild.me/aireqcourse
In this special highlight episode, Joe Colantonio shares lessons from a new course created with Keysight: Turning Requirements into Reliable Tests with AI-Augmented Design.
You’ll hear from AI strategist Noelle Russell, Keysight’s Chief AI Officer Jonathon Wright, and product expert Ian Sharp as they explore:
Why responsible AI is like raising a baby tiger — exciting but dangerous if ignored.
The hidden costs of manual test design and why happy-path testing isn’t enough.
How AI-powered tools like Keysight Generator transform natural language requirements into reliable, traceable test cases in minutes.
If you’re a tester, automation engineer, or QA leader looking to cut weeks of manual scripting, improve coverage, and finally align tests with real requirements — this episode is for you.
Check out the full Keysight course here: https://testguild.me/aireqcourse
Become one of our Automation Guild 2026 speakers: https://testguild.me/agspeak
Exclusive Sponsor
This episode is a special highlight reel from our brand-new Keysight course, Turning Requirements into Reliable Tests with AI-Augmented Design (https://testguild.me/aireqcourse). You’ll hear expert insights on how to move beyond brittle manual scripts, uncover hidden risks in requirements, and use GenAI to generate reliable, traceable test cases in minutes.
Want the full experience? Register now to access the complete course with step-by-step guidance, real demos, and frameworks you can bring back to your own team.
Sponsored by Keysight – helping QA and DevOps teams build smarter, faster, and more reliable testing practices.
About Joe Colantonio
Joe Colantonio has over 25 years of experience in software automation testing and is the founder of Test Guild, a dedicated independent resource for actionable, real-world technical advice from industry experts (blog, video tutorials, podcasts, and online conferences) to help improve your DevOps automation, performance, and security testing efforts. He is the host of the longest-running podcast on automation testing, has interviewed over 400 industry leaders, and is the creator of Automation Guild, the premium annual online event for test automation engineers. Joe has also hosted over 500 hours of live online events for some of the biggest companies in the software testing space. His company, Test Guild, also provides services to software companies to help them reach a highly engaged, growth-oriented, and ambitious audience of 40k+ software testing professionals.
Connect with Joe Colantonio
- Company: www.testguild.com
- Blog: www.testguild.com/blog
- LinkedIn: joecolantonio
- Twitter: @joecolantonio
- YouTube: joecolantonio
Rate and Review TestGuild
Thanks again for listening to the show. If it has helped you in any way, shape, or form, please share it using the social media buttons you see on the page. Additionally, reviews for the podcast on iTunes are extremely helpful and greatly appreciated! They do matter in the rankings of the show and I read each and every one of them.
[00:00:00] Hey, let me ask you this. When was the last time your test suite kept up perfectly with changing requirements? Probably never, right? And if you're like most teams, you've probably felt the pain of brittle tests, fuzzy requirements, and spending weeks on manual scripting only to end up with gaps in coverage. And that's exactly why I'm so excited about today's episode. We're going to be pulling back the curtain on a brand-new course Test Guild has just released in partnership with Keysight called Turning Requirements Into Reliable Tests Using AI-Augmented Design. And don't worry, this episode is just a highlight reel. You're going to hear from industry experts like Noelle Russell, Jonathon Wright, and Ian Sharp. They'll share why manual testing is overdue for a rethink, how context makes or breaks test design, and how GenAI can help teams slash weeks of work down to minutes, all with real-world examples and real-world insight. And if you like what you hear today, you'll definitely want to check out the full course by registering down below.
[00:01:05] But before we get into it, I just want to let you know that Test Guild has officially opened our call for speakers for our 10th annual Automation Guild conference, scheduled to run February 9th to the 13th, 2026. This online event focuses on end-to-end automation testing and seeks presentations that address real-world challenges facing the testing community. I survey our community every year and ask them what's the number one thing they're struggling with, and based on their input, I compile a list of the topics they asked for. We'd love to hear your idea for a session at the next Automation Guild. Once again, you can register as a speaker using the link down below. Let me know your thoughts, and I'd love to see you there.
[00:01:47] All right, so let's kick things off as we hear from Noelle Russell. You've probably heard of her. She's one of the leading voices in AI ethics and strategy. She's worked at companies like Amazon and Microsoft, and now runs the AI Leadership Institute. In this course, Noelle explains why AI models aren't just tools; they're like living systems that need careful handling from day one. And with all the vibe coding going on, this is definitely something you need to keep an eye on. She uses a metaphor that really stuck with me: that AI is like a baby tiger.
[00:02:19] Noelle Russell Now I've come to think about this in a very specific way. As a matter of fact, when I went to Microsoft, my job was to really look at a bunch of research models and help those research models get to production. I used to use a term like herding cats, right? Like I'm trying to get everyone doing the right thing and managing engineering work. But when I think about AI models, cats wasn't a really good analogy, because it wasn't really like herding cats. It was more like herding tigers, baby tigers. As a matter of fact, it's why I have them all over the place, baby tigers. And the reason I think about AI models like baby tigers is because they're very cute at the beginning, fluffy, adorable. People want to work with it, it's very novel and exciting, people are enamored with these models in the beginning. I mean, just remember the first time you used ChatGPT or a large language model, right? You entered a question, it gave you an answer, and you're kind of like, oh my gosh, that's so cute. That's pretty good. That's not terrible. We get excited about that because it's doing something we never thought it could do. However, in this early stage of development, when a model is in baby tiger mode, we really need to ask some very important questions that are hard to ask when you're excited. They need someone in the room, hopefully you, who will raise their hand and go, well, have you thought about... Hey, baby tiger, look at those paws. How big are you going to be? Those are razor-sharp teeth at birth. What do you eat? How much do you eat? Where are you going to live? What happens if I don't want you anymore? These are critical questions to ask as you are building your AI systems, and the earlier you ask them, the better.
[00:04:12] Joe Colantonio All right. So I love that image. It's so easy to get swept up in the excitement of GenAI, especially when it gives us something clever in seconds. At least it looks clever until you start diving into what it actually generated for you. But as Noelle said, if you don't ask the right questions early about things like size, scale, security, governance, and performance, you're going to have a tiger on your hands. I've been obsessed with vibe coding for the past three months, and I can tell you, this is truly a real danger. It generates something that looks amazing until you start messing around with it and go, oh my gosh, there are security leaks, there are performance problems, it's not scalable. There are all these bugs baked in because it doesn't really have a tester's mindset as it's creating it. Noelle also drops another stat in this course: for every $1 invested in responsible AI, companies are seeing $3.70 in return. And that's proof this isn't just about ethics; it's about business value. Now, in this course, Noelle also shares an awesome framework you can use called POET, which stands for Precision, Optimize, Ethics, and Trust. I won't spoil the details here, but it's a simple way to make sure your AI strategy scales safely while still delivering ROI for your company.
[00:05:28] Joe Colantonio All right. So what about the hidden costs of manual testing? Well, next we're going to shift from AI strategy to a challenge testers know all too well, and that is manual test design. Jonathon Wright, Keysight's Chief AI Officer and automation cyborg, has been in the trenches of enterprise testing for decades. He has a ton of knowledge. If you ever see him at a conference, definitely pick his brain, because he knows everything about AI and automation. And in this course, he shares why relying on happy-path test cases leaves teams exposed to a bunch of risk. Here's a clip where Jonathon uses the humble login requirement to show just how many edge cases can slip through the cracks.
[00:06:12] Jonathon Wright I think if we're all quite honest with ourselves, we talk about happy paths with happy data. So let's take the login example. Valid username, valid password, and then you press login, log on, maybe sign in, and you successfully sign into the application. Now, if I was writing that as a requirement or even a model, I would have an unsigned-up user, forgot my password. I'd have five or six different happy paths and maybe negative paths through the system. Now, using model-based testing as a technique, how many permutations are there through there? You could say, oh, 20, 30, 40 different values, different valid passwords, expired passwords. The first time they log in, there might be some requirement for a EULA agreement, a privacy policy, or something else, or terms and conditions. I don't know how that system works. I don't know how the system works for you. You might have some business rules that say, actually, if you're in a different location, like I'm in Seattle at the moment, it's going to send me a two-factor authentication message to my mobile device, and then I'm going to have to pass that into my application so that I can securely sign in. Fantastic. How do I know that I've got to go through those steps or that I have those business rules? Well, I kind of teased a little bit before around the sources of truth. And when we talk about sources of truth or sources of information for a second, we're talking about context, and context is absolute king, because that's how your system works differently to any other system in the world.
[00:07:54] Joe Colantonio All right, so isn't that eye-opening? Something as simple as log into the system can balloon into dozens of scenarios: password resets, geolocation rules, two-factor codes, business-specific workflows, and a bunch more. But here's the reality, especially from what I've experienced on multiple sprint teams: most teams don't have the time or bandwidth to design all those tests manually. So they just test the happy path, maybe a couple of edge cases, and hope it's enough. As we know, it never is. But Jonathon also makes a powerful point here that context is king. Without embedding your domain knowledge, with you as the tester in the loop, the regulations, the policies, the business rules, the test suite doesn't reflect real-world behavior. So that's why you're needed now more than ever as a tester with domain knowledge, with the assistance of AI to make you even more awesome than you already are. And that's why manual test design is overdue for a rethink. In this course, Jonathon shows how modern teams he's interacted with all over the world are combining small pieces of messy requirements with AI and context documents to produce highly accurate test suites that evolve with the system. Look, as I mentioned, Jonathon knows software testing, he knows automation, and he's not just drinking the AI Kool-Aid. He's a real-world practitioner who sees this working across multiple enterprises with multiple teams. So when he talks, I listen, and you should too.
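To make Jonathon's login example concrete, here's a minimal Gherkin sketch of how a single "user can log in" requirement fans out once business rules like expired passwords, first-time agreements, and location-based two-factor checks are layered in. These scenarios aren't taken from the course or from Keysight Generator; the rules, wording, and data are illustrative assumptions only.

```gherkin
Feature: User login
  # Illustrative sketch only -- the business rules and data below are
  # assumptions for this write-up, not output from Keysight Generator
  # or material from the course.

  Scenario: Successful login with valid credentials
    Given a registered user with a valid password
    When they sign in
    Then they land on their account dashboard

  Scenario: Login rejected for an expired password
    Given a registered user whose password has expired
    When they sign in
    Then they are prompted to reset their password

  Scenario: First login requires accepting terms and conditions
    Given a registered user who has never signed in before
    When they sign in with valid credentials
    Then they must accept the terms and conditions before continuing

  Scenario: Login from a new location triggers two-factor authentication
    Given a registered user signing in from an unrecognized location
    When they enter valid credentials
    Then a one-time code is sent to their mobile device
    And they must enter that code before they are signed in
```

That's four scenarios already, before you multiply in locked accounts, password-reset flows, or per-region privacy rules, which is exactly the permutation explosion Jonathon is describing.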
[00:09:28] Joe Colantonio So how do we use automated test design with GenAI? Okay, so we've seen the risk of manual testing, but what's the alternative? Here's where Ian Sharp comes in. Ian is a product manager for Keysight Generator, Keysight's GenAI-powered test automation engine. And in the course, he walks through how Generator works and why it's changing the way testers across the organizations, teams, and companies they work with are doing testing. So here's Ian giving a high-level introduction.
[00:09:58] Ian Sharp Keysight Generator is an AI-powered test generation engine, deployed securely and on-prem, that embeds the power of generative AI into your testing process. It's designed to streamline test asset creation directly from your natural language requirements. It combines modern AI technologies with a simple, intuitive UI to put this generative capability in everyone's hands. And it ingests your real inputs. So you upload your real system requirements as well as the domain-specific information that Jonathon was just discussing, things like policy documents or glossaries, which might provide that necessary additional context. Generator will then create test assets from your requirements using all this information on the fly. And it outputs ready-to-run artifacts. It generates Gherkin scenarios, as Jonathon was talking about, in feature files, as well as conventional manual test cases with execution steps and expected outcomes that slot straight into your QA workflow. Now, because Generator is fully deployed on your own hardware and runs entirely behind your firewall, no IP or customer data ever leaves your network. So it's ideal for teams in tightly regulated or compliance-heavy industries like healthcare, aerospace, or defense. But at the other end of the scale, an agile team operating in retail, for example, that might struggle to keep up with test authoring when faced with fast-changing priorities or scope changes, will also get huge value from it, as it can generate tests really quickly from those fast-changing requirements, allowing the team to be more adaptable to change.
[00:11:29] Joe Colantonio This is where it gets really exciting for me. Generator isn't just cranking out random test cases. It's building traceable, ready-to-run assets based on your actual requirements and context files. If you work at an enterprise in a really complex domain like healthcare, insurance, or finance, this is awesome. And as you can see, the speed is jaw-dropping, especially if you've had to do this manually over as many years as I have. According to them, one customer cut manual scripting time from three weeks down to just 30 minutes using this exact approach. Let's hear another short clip where Ian explains how Generator removes ambiguity. It even catches scenarios human testers might miss.
[00:12:17] Ian Sharp To begin with, it removes ambiguity. Natural language processing within the embedded model interprets fuzzy requirements and produces clear, unambiguous scenarios. So there's no more guessing, and you get consistently written tests and scenarios every time. As we mentioned, it protects your IP and ensures data compliance. Its secure on-prem deployment keeps all source requirements, context documents, and any other data inside your network, satisfying GDPR, HIPAA, and audit mandates. And it can take your test authoring timescales from weeks to minutes. It generates traceable test suites really fast, eliminating the slow, error-prone scripting grind. And we've seen this in real-world examples. One of our customers, who usually expected manual scripting of test cases to take three weeks, managed to get the same number of test assets out of Generator in just 30 minutes. That's such a valuable time saving. And then testers can spend that time executing tests, analyzing results, or doing exploratory tests to find those intriguing edge cases instead. Generator's context store adds domain depth. So you import your documents that hold the industry-specific terminology and edge cases, and these are used by Generator to build rich, context-relevant tests. So much so that we've heard from customers that it has generated scenarios their human testers would otherwise have overlooked. And then we know systems don't stay still, so we ensure it's easy to keep tests evergreen. A simple one-click generation updates assets whenever a requirement or your context changes, so you can be sure nothing goes stale.
[00:13:49] Joe Colantonio So think about that. Instead of weeks of back-and-forth clarifying requirements, Generator turns fuzzy statements into clear, consistent tests that are context-aware, and it even surfaces edge cases testers might not think of. And that frees you and your team up to spend more time executing tests, analyzing results, and doing exploratory work, the stuff humans are usually best at.
[00:14:15] Joe Colantonio All right, let's wrap this all up. From Noelle Russell, we learned that responsible AI is like raising a baby tiger. It's exciting, but it requires foresight and discipline, and it can deliver serious ROI when done right, which is definitely something you'll learn in this course. From Jonathon Wright, we saw the hidden cost of manual testing and why happy-path testing just doesn't cut it anymore. And from Ian, we saw a real-world example of how Keysight Generator is transforming test design, cutting weeks of work down to minutes while improving coverage and, more importantly, compliance. And remember, this is just a highlight reel. The full Keysight course goes way deeper with real demos, frameworks, and step-by-step guidance you can take back to your teams right now. Don't miss out. Make sure to register now for Turning Requirements into Reliable Tests with AI-Augmented Design using that special link down below.
[00:15:08] All right, so thank you, Noelle, Jonathon, and Ian for your automation awesomeness. For links to everything of value we covered in this episode, head on over to testguild.com/a561. And if you haven't already, we'd love to have you speak at the next Automation Guild. Make sure to submit your idea now before the call for speakers ends in a few weeks. All right, so that's it for this episode of the Test Guild Automation Podcast. I'm Joe. My mission is to help you succeed in creating end-to-end automation awesomeness. And as always, test everything and keep the good. Cheers.