AI-Assisted Testing with Mark Winteringham

By Test Guild

About This Episode:

Today, we're delving into the exciting potential of generative AI in software testing with Mark Winteringham. Mark, an esteemed author and expert in the field, has penned his latest work, "Software Testing with Generative AI," which holds the promise of transforming our approach to testing and automation.

In this episode, Mark takes us behind the scenes of writing his new book and discusses the challenges of researching a rapidly evolving field. He shares the importance of context, mindset, and technique when leveraging generative AI and provides practical, actionable insights on how to integrate AI into your testing processes effectively.

We'll explore how AI agents and prompt engineering can enhance exploratory testing, discuss AI's role in UI automation, and delve into customizing large language models to fit specific business needs. Mark also emphasizes avoiding common pitfalls, like overreliance on AI and misunderstanding its capabilities.

Whether you're a seasoned tester or new to the world of automation, this episode is packed with valuable knowledge to help you navigate the evolving landscape of AI in software testing.

BrowserStack Exclusive Sponsor

Are you tired of maintaining in-house automation grids? Are you struggling to scale your tests due to limited device coverage? Ready to modernize your automation testing?

Look no further; BrowserStack has a plug-and-play solution that helps you achieve your testing goals faster.

Introducing Automate! Automate is a fully managed cross-browser automation testing solution that provides instant access to over 20,000 devices to run your automation tests across browsers like Chrome, Firefox, Safari, and Edge.

+ With extensive support for leading test frameworks like Selenium, Cypress, Playwright & Puppeteer, Automate ensures full coverage for your tests.
+ Choose from 3,000+ device-OS-browser combinations to run your tests with minimal latency, powered by 19 global data centers.
+ You get a neatly summarized dashboard to see your test status, alongside an arsenal of solid debugging capabilities, including a range of logs: video logs, text logs, Selenium logs, and many more.
+ There are several other advanced debugging capabilities, like interactive debugging, which lets you interact with the website under test while the test is still running. Another is Web Performance reports, which help generate Lighthouse reports and add assertions to test your web vitals.
+ You can even test advanced use cases like payment workflows, 2FA, and network simulations as part of your testing suites, which will give unparalleled accuracy to your websites and web apps.

Sounds intriguing? Give it a try for yourself and enhance your testing processes at scale. Support the show and try it now at testguild.me/browserstack

About Mark Winteringham


Mark Winteringham is a quality engineer, course director, and author of “Software Testing with Generative AI” and “Testing Web APIs”, with over 10 years of experience providing testing expertise on award-winning projects across a wide range of technology sectors. He is an advocate for modern risk-based testing practices, holistic automation strategies, Behaviour Driven Development, and exploratory testing techniques. You can find him on Twitter @2bittester or at mwtestconsultancy.co.uk.

Connect with Mark Winteringham

Rate and Review TestGuild

Thanks again for listening to the show. If it has helped you in any way, shape, or form, please share it using the social media buttons you see on the page. Additionally, reviews for the podcast on iTunes are extremely helpful and greatly appreciated! They do matter in the rankings of the show and I read each and every one of them.

[00:00:00] In a land of testers, far and wide they journeyed. Seeking answers, seeking skills, seeking a better way. Through the hills they wandered, through treacherous terrain. But then they heard a tale, a podcast they had to obey. Oh, the Test Guild Automation Testing podcast. Guiding testers with automation awesomeness. From ancient realms to modern days, they lead the way. Oh, the Test Guild Automation Testing podcast. With lutes and lyres, the bards began their song. A tune of knowledge, a melody of code. Through the air it spread, like wildfire through the land. Guiding testers, showing them the secrets to behold. Oh, the Test Guild Automation Testing podcast. Guiding testers with automation awesomeness. From ancient realms to modern days, they lead the way. Oh, the Test Guild Automation Testing podcast. Oh, the Test Guild Automation Testing podcast. With lutes and lyres, the bards began their song. A tune of knowledge, a melody of code. Through the air it spread, like wildfire through the land. Guiding testers, showing them the secrets to behold.

[00:00:34] Hey, if you want to learn more about generative AI and software testing, you've come to the right place. Today we'll be talking with Mark Winteringham all about his new book, Software Testing with Generative AI. You probably know him, but just in case you don't, Mark is a quality engineer, course director, and author of the book Software Testing with Generative AI and the book Testing Web APIs. He has over 10 years of experience providing testing expertise on award-winning projects across a bunch of different technology sectors. He's also an advocate for modern risk-based testing practices, holistic automation strategies, behavior-driven development, and exploratory testing techniques. You can find him all over the socials; we'll have links in the show notes. But you definitely want to listen all the way to the end to get the most out of how you can leverage generative AI for real with testing. No bias. You don't want to miss it. Check it out.

[00:01:29] Hey test automation engineers, are you tired of maintaining in-house grids? Are you struggling with limited device coverage? Say hello to BrowserStack Automate, your ticket to effortless cross-browser testing. Get instant access to over 20,000 real devices and browsers, and run Selenium, Cypress, Playwright, and more with minimal latency. Get crystal-clear test summaries and powerful debugging tools at your fingertips. You'll also get interactive debugging, web performance reports, and support for advanced use cases like 2FA and payment workflows. Scale your testing without the headaches. BrowserStack Automate, because your apps deserve the best. That is supercharged testing. Try BrowserStack Automate for yourself today. Head on over to TestGuild.me/browserstack and check it out for yourself.

[00:02:26] Joe Colantonio Hey, Mark, welcome back to The Guild.

[00:02:28] Mark Winteringham Hey, Joe. Yeah, it's good to be back again. Third time now? Fourth time?

[00:02:35] Joe Colantonio Yeah. At least. It's been 3 or 4. So it's great to have you back.

[00:02:38] Mark Winteringham I'm actually wearing my Selenium conference T-shirt.

[00:02:41] Joe Colantonio Oh, nice.

[00:02:42] Mark Winteringham From the last time we did this, when we did a live stream.

[00:02:45] Joe Colantonio Perfect. Perfect. Yep.

[00:02:47] Mark Winteringham In honor.

[00:02:48] Joe Colantonio Sweet. Mark, you're always busy, and I'm always curious. I asked you this last time: why write another book? I know you took a long time to write your previous book, and I thought, oh, there's no way he's going to do it again. And yet here you are again, practically a full-time writer. What's going on?

[00:03:04] Mark Winteringham Yeah, I find myself asking the same question quite a lot at the moment as I near the end. The light at the end of the tunnel is almost there, it's that final push, and you do start asking yourself those sorts of questions. I think ultimately it's just down to the fact that I enjoy writing. I enjoy the process of putting all this material together. I enjoy teaching, I've always enjoyed teaching. But what's been different for this book compared to the last one, Testing Web APIs, is that I went into that one with an idea of what the book was going to be. I'd been teaching web API testing for quite a while, so I felt confident about what it would be. Whereas this one has been much more of a collaboration with the publisher, focusing on generative AI. Although the concepts have been around for a while, the day-to-day public usage has only been obvious in the last couple of years with the rise of things like ChatGPT. It's a greenfield space to be exploring, which makes it exciting. It's been fascinating researching the tools and how they work, but it can also be a little scary. Fortunately, a lot of the stuff I've done in the past around my attitudes towards automation infuses this book as well, so that's given me a bit of a head start.

[00:04:22] Joe Colantonio That's what I wanted to ask: what skills allow you to do this? Because, as you mentioned, it's not like API testing, where at least there was a known thing you'd been doing for years and years. Now you're dipping your toe into something that seems to be changing every month. How do you create a book, or how do you research it, so that when it's finally released it's on point and still relevant?

[00:04:46] Mark Winteringham I think first of all, you accept that that's not going to happen, which seems like a bit of a cop-out. But yeah, things keep changing and keep evolving. I would say this time last year it felt faster than it does this year. Things are starting to coalesce and slow down a little bit, enough, I think, in the broad strokes to take stock of how generative AI can be used in a testing context. In terms of research, it was a lot of conversations with people within the testing community, and a lot of tool providers opened up access and let me have conversations with them and pick their brains about AI. And it was setting up a lot of passive ways of learning about stuff: lots of newsletters, lots of blog post reading, using things like ChatGPT as well to dig into some of the meanings behind some of the tools. A combination of all those things. I've always been an outcome-driven person in terms of how I teach and how I write. I always have a clear idea of what I want the reader, or the person attending my workshop, to be able to do at a certain point. So although the information and maybe the links to things change, Google really are giving me a run for my money, they'll probably rename Gemini to something else just before the book goes out and I'll just have to go through and change it from Bard. Those things will happen. But what I hope is that the core themes around the relationship between us as individuals and these tools will stay roughly the same, because as much as the tools change, I don't think the way we engage with them changes that much. There's a lot to talk about in that space as well.

[00:06:33] Joe Colantonio Is there anything that did change? You said it's kind of slowed down from when you started until now. Besides name changes, anything big? Has there been greater adoption than you thought there would be at this point in time? Or is anything different from your assumptions a year ago compared to where AI and testing are now?

[00:06:50] Mark Winteringham Yeah, I thought I would see more stuff in the tool vendor space around it. There was that big gold rush last year for AI-driven products everywhere, and you've seen little bits and pieces appear in tools. I was lucky enough to see some demonstrations of things tool vendors were working on, but actually it hasn't pushed that far forward. I think that's partially because we're in, or heading down into, the trough of disillusionment. But what I hope as well, in a more positive way, is that the people creating a lot of testing tools are starting to appreciate that this is not the center of your product. It's something that can be added to elevate certain aspects of the product, which is similar to what I'm talking about in the book. It's not about replacing what you do; it's about picking those individual opportunities and then using these tools to help with that. I take that as a positive, really, that there hasn't been such a cynical land grab in terms of how these tools are used. But ultimately, I think a lot of the other stuff feels the same. People have engaged with these tools in a shallow way, come up with a few fun things, it hasn't really worked out for them, and then they've either dismissed it or moved on. Whereas I wanted to spend a little bit more time, over a longer term, to understand how these tools could really be integrated into our work.

[00:08:23] Joe Colantonio What could someone expect to get out of this book if they did purchase it?

[00:08:29] Mark Winteringham I break the book into three parts. I've got a quite clear mental model in my head these days of how we approach it. I always say that value with generative AI is rooted in three principles: mindset, context, and technique. Mindset is very important, and that's rooted in what the value of testing is and how we do testing. Testing isn't a monoculture of test cases; it's a holistic whole of lots of different testing activities, from automation to exploratory testing. Also in the mindset space is appreciating how these large language models work: they don't think like us, they are probabilistic machines, and they have their limitations as well. Then technique is things like understanding how to use prompt engineering. It's understanding things like AI agents too, and learning, if they're probabilistic machines, how we nudge that needle in our favor, how we game the system so the probability is on your side in terms of the value you get. There are lots of techniques like that. And then context. Large language models are very general; they're trained on such a massive corpus of data that it's a bit of a garbage in, garbage out situation. If you ask one to create test cases for an upload feature, it will just generically come up with a bunch of test cases, because it wants to answer your question for you. Whereas if we do something much more like: test an upload feature, this is how it works, here are some details about its technical specifications, here are some business rules that are in place, here's a bit of the code or the HTML, then you get a better response back. We need to bake context in. So that's looking at context as a principle, but also at tools like retrieval augmented generation and fine tuning, because those are the tools that can help you bake the context in. So yeah, it's exploring all of those, and demonstrating clear situations in which you could use these tools, to get the point across about how they can be useful in certain places.
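
To make the "bake context in" point concrete, here is a minimal sketch of a generic versus a context-rich prompt, assuming the OpenAI Python client; the upload feature details, business rules, and model name are invented for illustration.

```python
# A minimal sketch of "baking context in", assuming the OpenAI Python client
# (pip install openai). The upload feature, its rules, and the model name are
# invented for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Generic prompt: likely to get generic test cases back.
generic_prompt = "Suggest test cases for a file upload feature."

# Context-rich prompt: specs, business rules, and markup nudge the model
# towards answers that fit *this* upload feature.
context_rich_prompt = """Suggest test cases for the file upload feature below.

Technical details:
- Accepts PDF and PNG only, maximum size 5 MB
- Files are virus-scanned before storage

Business rules:
- Only logged-in users with the 'editor' role may upload
- Uploads are attached to an existing case record

HTML snippet:
<input type="file" id="case-attachment" accept=".pdf,.png">
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": context_rich_prompt}],
)
print(response.choices[0].message.content)
```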

[00:10:47] Joe Colantonio Perfect. I guess before we go any further, most people probably know this, but just in case: how do you define a large language model?

[00:10:57] Mark Winteringham In my mind, it is a neural network that has some sort of transformer model applied to it, and it's been trained on billions if not trillions of pieces of data: text or images or audio files. Inside that large language model, what you basically end up with is lots of individual nodes or parameters that have strong and weak connections to one another. That's where the probability comes in: based on the word it's generating, those connections help determine what the next word to generate is, what the most probable one is. Happy birthday. Is "birthday" more probable than "Happy, I've run over your dog"? Terrible analogy.

[00:11:48] Mark Winteringham It's very unlikely to say that, especially on a podcast. But ultimately it's a probabilistic model: lots of parameters connected to each other in that way. I would say my understanding of these is shallow, because I don't think we need to know the absolute intricacies of these tools; we are users of them, not necessarily the people building them. But we should appreciate at least that they are probabilistic, and that we should not be anthropomorphizing them. They are not us. They don't think like us. They are not creative like us. That is not a criticism; that's just what they are, and that is of value in certain places. I think once you get that appreciation, you get a lot more value out of it. And that's not changed with any tools, whether it's AI or not.
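
A toy illustration of the "probabilistic machine" idea: a real model scores an entire vocabulary with a neural network, but the word-by-word selection step looks conceptually like this. The probabilities below are made up.

```python
# Toy illustration of next-word selection; the numbers are invented. A real
# model scores its whole vocabulary, but the sampling step is conceptually
# similar.
import random

next_word_probabilities = {
    "birthday": 0.72,     # "Happy ..." is most often followed by "birthday"
    "holidays": 0.18,
    "anniversary": 0.09,
    "accident": 0.01,     # unlikely continuations still carry some probability
}

def pick_next_word(probs: dict) -> str:
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

print("Happy", pick_next_word(next_word_probabilities))
```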

[00:12:42] Joe Colantonio I love that. Pretty early on, in chapter two I think, with LLMs and prompt engineering, you set up what prompt engineering is and how to get the most out of it. And you also talk about hallucinations. You seem pretty even-handed: on one side, some people say it's all garbage anyway, it's going to hallucinate, you can't trust it; on the other side, people treat everything it says as 100% true. You seem to come down the middle, especially in chapter two. Can you talk a little bit more about that? What expectations should people set, maybe what skills do they need for prompt engineering, and what do they need to worry about? With hallucination, how common is it?

[00:13:17] Mark Winteringham Yeah.

[00:13:18] Joe Colantonio And is it all due to how well you do prompt engineering?

[00:13:20] Mark Winteringham Yes. Yeah, I think that's fair. I do sit in the center ground, because it's kind of rooted in prompt engineering, those are the techniques that you can use, but really it's about slicing. It's also about understanding what these models are very good at and what they're not good at. Some of the examples I've seen shared that demonstrate its failings are, I think, valid demonstrations of those failings, but don't use it that way; it doesn't negate the fact that it can be useful in other ways. But equally, it's that garbage in, garbage out situation, or more like generic in, generic out. If you are approaching your problem, regardless of your tools, in a very broad, high-level, abstract way, then it's going to be hard to find a tool that solves that problem. Whereas I think if we get the slicing right... For example, you see a lot of people sharing examples of how you can use them to generate automated tests. I still think that's too generalized. I want to use it to generate my page objects. I want to use it to help me build my utility code. I want to feed unit tests in as prompts with tools like Copilot to help me build my production code. I'm getting to very specific slices where I think it can be of use to me. I think that's why I sit in that middle ground: if you try to go too generalized, you're either going to end up in the land of hallucinations, because these models want to give you some sort of response, some sort of answer, or, frankly, it doesn't do the craft of testing any service, because it assumes testing is an inherently algorithmic activity when it is not; it's algorithmic and heuristic. That's why I sit in that middle ground. Prompt engineering is a technique, the specific craft. But if you don't have the right mindset, the right eye, to apply things like task analysis to the things that you're doing, to break them down into specific chunks, then you are going to run into problems like hallucinations. You are going to have issues where it becomes of less value to you, or you end up over-trusting these models and they lead you down the garden path, the wrong way, in your testing.

[00:15:47] Joe Colantonio Absolutely. I love how you said you have to understand what the large language model is doing. I know a lot of people were like, ChatGPT 3.5 is bad with math and figures, and so someone would do things with it and say, look, it can't figure this out, it's dumb, rather than realizing what it's good for and showing an example of what it's good for. It's almost like they set up a strawman for it. I don't know why I went on that rant, but understanding what it is will help you be better with it, I guess.

[00:16:16] Mark Winteringham Yeah, yeah. And again, to play devil's advocate, that isn't an invalid way to prove the point. There are a lot of papers out there that research these models and demonstrate how they're not very good at planning, and there's a real debate in that space about whether these models are actually creative, whether they can actually help you plan. But again, if you don't think of them as planning tools, but think of them as assistants while you do the planning... I think of it as like an area-of-effect model. I'm planning, I'm making the decisions, and then I use these tools to enhance the different aspects of the plan that I'm implementing. Then they can be really valuable.

[00:16:59] Joe Colantonio How do you know what it can and cannot do, then? What's realistic? Because the opening of my show was created by AI; all I did was say, write a song about Automation Guild, and it spit it out. Things like creativity, which you thought wouldn't be replaced, are being replaced. Is that creative? Is it garbage? How do you know what to use it for, I guess, if that makes sense?

[00:17:21] Mark Winteringham Yeah, I think it depends on what you're. I think something I've been thinking about a little bit is like, where do quality characteristics fit into this? For you, that thing that you've generated, I'm going to assume is ideal for you because it's something that serves the larger product, which is your podcasts and the conversations and things are going on there. Whereas, if maybe it was sort of this is a band that's trying to get their first album out and they've just literally just got AI to generate all of it for them. Then there's less sincerity there, like the quality characteristics change and shift. You're looking for a band, you're looking for musicians who have channeled some sort of artistic sort of impression through their instruments. But that's all completely gone. It's why I think, for example, like in genres. I've mentioned this before in the past, like Rick Beato, the music producer, he's got big music producer on YouTube, does all these sort of videos and stuff like that. He talks a lot about how AI is slowly crept into the music industry. Things like auto tune are algorithmic based, and what it's done is, is it's nudged his ever, ever, ever closer to just fully AI generated. But I think that if you tried to do that in other genres of music, you probably get more pushback or you'd probably get people noticing it more. I do think like quality characteristics, we're just talking about music here, but anything like this, you can start to see that quality characteristics, context matters in terms of how you react to what's being produced. I slowed for a second because I was reminded as well, like a great example is the inverse. Someone would create a piece of work, written piece of work. They were writing part of their novel, they fed it in and they said, do the next two chapters and then deliberately didn't write it like that. They use it like as anti creative because they were like, well, this is what it's coming out with is formulaic. I can use it as a guide to go somewhere else. Again, it's that sort of how you approach it. We still have a say, we still have an impression. We are interpreting whether something is art or not, whether it's useful or not. The challenge again is making sure that we have a clear picture of what that is. And slicing can help because if it's not a broad, broad thing that it could work, might not work. Whereas, if you have clear, distinct ideas of what's right and what's wrong, it can be easier to react that way, make decisions.

[00:19:52] Joe Colantonio Makes sense. I think you've covered a lot of this. Like you said, the first part is all about mindset-type things. When you get to context, then, can you talk a little bit about the things it can help you with, with examples from the chapters? I think a really good one is rapid data creation. I know a lot of people struggle with data creation, and AI seems to be really good at it. Maybe give some examples of what you think it's good for, because you mentioned you don't think it's really valuable for creating all the tests for you; you're mostly using it for more specific slices of activities.

[00:20:23] Mark Winteringham Yeah. The data one is, I think, one of my favorite chapters, because it clearly shows how data management is one of the hardest things in testing, but also that by focusing on very thin slices in that space, you can get a lot of value from these tools. What's interesting on the context side is that one of the challenges with data is its complexity in structure, for example, or the fact that you want to do things like data masking, or you just want to generate some data but it hits a lot of different tables. Things like retrieval augmented generation, or RAG, can be useful, because you can actually allow access to your database and say, generate me a bunch of records based on these rules, and it will go and extract some example data from your database, pull that into your prompt to give it that much more context, and then you get a response back that's much more useful for you. If you care about certain domain words and want relationships between keywords and domains, then things like fine tuning can work, because you're basically pushing those weights and balances towards your context. You don't teach it knowledge, but you push that needle towards your context. Ultimately, that's what context is. It's not just being aware that you need to add some information; it's also knowing what the right information to add is. There's a really interesting conversation as well with things like Gemini 1.5: it's got a context window of a million tokens. You can literally put a book in there and then say, I need to know a specific piece of information in that book. Is that good? I don't know; maybe it might not be. Would it be more effective if I just put the chapter or the paragraph in? I don't know. You've got to experiment with those sorts of things. So context is about being aware of adding it in, but also having the ability to detect what's right and what's wrong. And you can use tools for that, or you can use your own judgment.
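
Here is a rough sketch of the retrieval step described above, assuming a local SQLite database and the OpenAI Python client; the database file, table, columns, rules, and model name are invented for illustration.

```python
# Rough sketch of retrieval augmented generation for test data creation.
# The SQLite file, table, columns, rules, and model name are assumptions
# made for illustration.
import sqlite3
from openai import OpenAI

client = OpenAI()

# Retrieval step: pull a few real example rows to ground the prompt.
conn = sqlite3.connect("app.db")
rows = conn.execute(
    "SELECT customer_name, email, plan FROM customers LIMIT 5"
).fetchall()
examples = "\n".join(str(row) for row in rows)

# Augmentation step: bake the retrieved examples into the prompt.
prompt = f"""Here are example rows from our customers table:
{examples}

Generate 20 new, unique rows in the same style. Rules:
- emails must be unique and syntactically valid
- plan must be one of: free, pro, enterprise
Return the result as CSV with a header row."""

# Generation step: ask the model for data shaped by that context.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```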

[00:22:27] Joe Colantonio Great job. People definitely need to get the book to get their hands on that. Another chapter that stuck out to me was one called Assisted Exploratory Testing with AI, which I was kind of surprised by, maybe a little controversial if you just looked at the title. Can you give a little insight around that?

[00:22:45] Mark Winteringham Again, it's the same kind of thing with exploratory testing. I don't want an LLM to be an exploratory tester, but if I have some AI agents or some prompts available to me as I'm exploring... So we talked about the data example: you think, oh, I just need 20 records quickly set up, but they need to be unique, and you use a prompt to generate all of that for you. Or you're writing your test report notes and you need to convert them into something that's a bit more legible, or something that has some sort of sentiment analysis to say whether the quality has increased or decreased, and just add that into your report. Those sorts of things can be used as well. The whole of that chapter, again, is this slicing mindset, and I do a case study where I run an exploratory testing session and demonstrate where I'm using these prompts. I like that chapter because it builds on some of the stuff we've learned earlier on: we're pulling from the data side, we're pulling from the old "suggest me some ideas": here are some heuristics, here are some things I've done, suggest some new ideas, break me out of the mold of my testing. It's combining those things together. And using tools during testing sessions is a big passion of mine; I love a good session where you use lots of different tools. That's what I'm trying to convey in this chapter: there are lots of different ways in which you can call an LLM, and maybe, like I said, you just have a few little prompts saved somewhere, or you can use agents that can help you with these sorts of things as well.
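
One way to keep such prompts "saved somewhere" is as a simple reusable template. This sketch, whose wording is invented rather than taken from the book, turns rough session notes into a prompt for a more legible report.

```python
# A saved, reusable prompt for mid-session use: turn rough exploratory
# testing notes into a legible report. The wording is illustrative only.
SESSION_REPORT_PROMPT = """You are helping a tester write up an exploratory
testing session. Rewrite the raw notes below into a short report with three
sections: Charter, Observations, and Potential risks. Do not invent findings
that are not in the notes.

Raw notes:
{notes}
"""

def build_report_prompt(notes: str) -> str:
    return SESSION_REPORT_PROMPT.format(notes=notes)

print(build_report_prompt("upload w/ 6MB png -> spinner forever, no error msg"))
```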

[00:24:20] Joe Colantonio Great. And before that, you also have a chapter on UI automation, how to improve UI automation with AI. Can you give a few more examples? I know you said it can help you write page objects and small slices like that. Does it help with anything else people may not be familiar with? A lot of people just think, it'll help me find a flaky locator, but are there any other use cases?

[00:24:42] Mark Winteringham I think, yeah, that was one of the key examples that I put in that chapter. And then, again, kind of going back to data: integrating with things like open API platforms to call a model that's sitting on a platform somewhere and get retrievable data that way. There's some of this stuff I didn't really go into, because I just didn't feel there was too much to go into, but I think it's interesting: using it for analysis of results as well, and giving you interpretations there. And it's a combination of that, but also of leveraging things like GitHub Copilot, those sorts of tools as well. This is a thing too: we tend to talk about LLMs as singular things, whereas I think for some of the successful teams, whether you're using it to assist your own work or building products, you use different models for different purposes. I might use something like ChatGPT to, say, generate a page object model. But once I've got that in place, I can use GitHub Copilot to quickly generate the code within my IDE. I've got my page object open in my IDE, and then I can fly through. And then I can start baking in the business rules, because, again, it goes back to context: if Copilot starts going, oh hey, you want to enter a username, here's my suggested username. Oh, hang on, you've given me an email address and we actually use employee IDs, or we use usernames. It doesn't know those sorts of things. So I'm using different models to help me at certain points, but I'm still jumping in to bake the rules in there as well.
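
As an illustration of that workflow, here is the kind of page object skeleton a model might produce from a short prompt, with a domain rule baked in afterwards by the tester. The locators and the employee-ID rule are invented, and the example assumes Selenium's Python bindings.

```python
# Illustrative only: the kind of page object skeleton an LLM might produce
# from a prompt like "create a Selenium page object for a login form", with
# a business rule added afterwards by the tester. Locators and the rule are
# invented; assumes Selenium's Python bindings.
from selenium.webdriver.common.by import By


class LoginPage:
    USERNAME_INPUT = (By.ID, "username")
    PASSWORD_INPUT = (By.ID, "password")
    SUBMIT_BUTTON = (By.CSS_SELECTOR, "button[type='submit']")

    def __init__(self, driver):
        self.driver = driver

    def login(self, username: str, password: str) -> None:
        # Business rule added by the tester, not the model: this system uses
        # numeric employee IDs as usernames, never email addresses.
        assert username.isdigit(), "usernames are employee IDs in this system"
        self.driver.find_element(*self.USERNAME_INPUT).send_keys(username)
        self.driver.find_element(*self.PASSWORD_INPUT).send_keys(password)
        self.driver.find_element(*self.SUBMIT_BUTTON).click()
```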

[00:26:19] Joe Colantonio Nice, nice. You probably get asked this all the time: besides prompt engineering, what skills do testers need to work with AI? I would think they have to become better testers so that AI isn't going to replace them, but maybe I'm nuts there. And also, people hate this term, but do you see manual testing going away in 5 to 10 years? Where do you see it? How can people use AI better, what skills do they need, and will they eventually be replaced by the AI?

[00:26:47] Mark Winteringham I really like this question. I think I've mentioned this in nearly every interview I've done: I tend to talk about Nicholas G. Carr's book, The Glass Cage, where he talks about the principles of algorithmic problems and heuristic problems. When you talk about manual testing, I think about it more as heuristic-based testing with elements of algorithmic tasks within it. Sometimes the needle moves more towards algorithmic, sometimes less, depending on whether you're doing exploratory testing or test cases. Again, it's being able to identify when you're in a situation where you're doing something algorithmic, where clear steps can be explicitly described and you can tell when you're not following the path, versus the heuristic stuff, which is more creative, more freewheeling. I think we need to get better at being attuned to that. That's one of the things I've been talking about for a long time with any sort of test automation: it's great in the algorithmic parts, but it's not so good in the heuristic parts. That kind of mindset. There is the potential, and I've been thinking a little bit about this, for it to open up automation more to new testers, or to those who have shied away from the more technical aspects, because you can basically provide it with your instructions and it converts them into the syntax that you need. You don't necessarily know what the special incantations are to make the script work, but you can at least plan it in your head and describe it so that it can convert it. And it's the same for things like code reviews: being able to provide blobs of code, assuming you're allowed to do that, to get feedback that way. I think there are more benefits than there are risks to testing roles. But again, it goes back to mindset. We need to be clear about where these tools have a purpose, where they have a use, and where they don't, and try not to get too caught up in the hyperbole, regardless of perspective.
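
Here is a sketch of the "describe the steps, let the model handle the incantations" idea: plain-language steps wrapped into a prompt asking for a runnable script. The steps and the target framework are invented for the example.

```python
# Illustrative prompt for turning plain-language steps into script syntax;
# the steps and target framework are invented for the example.
STEPS_TO_SCRIPT_PROMPT = """Convert these plain-language test steps into a
Playwright test written in Python. Use descriptive locators and add a
comment for each step.

Steps:
1. Open the booking page
2. Search for a room for 2 adults next weekend
3. Check that at least one result is shown with a price
"""
```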

[00:29:02] Joe Colantonio Nice. All right, we covered part one and part two of the book, which you really need to grab to find out more. But let's wrap it up here with part three. At the time of this recording, it was called Customizing LLMs for Testing Contexts. I assume it still is?

[00:29:14] Mark Winteringham Who knows?

[00:29:16] Joe Colantonio But when people hear LLMs, they're like, well, it doesn't work for my company because it's not trained on my data, on my situations, and therefore it's useless. You have three chapters here: one is on customizing LLMs to help you better with testing, and another on how to fine-tune LLMs with business domain knowledge. Can you sum up the third part a little more as a teaser, so people can get their hands on the book to learn more? What are you talking about here?

[00:29:45] Mark Winteringham The first chapter puts forward the argument for why context is needed, all the things I've talked about, and then it gently introduces the two approaches we can apply. One is retrieval augmented generation: we can add more context to the prompt, and we can do that programmatically. We can connect it to log files, databases, things like the vector databases we hear a lot about when talking about RAG. So it gives an introduction to how that works, and it gives you the opportunity to build your own. It's not going to make you a RAG expert by the end of it, but it's enough for you to appreciate when that approach can be of use to open up context and things like that. And then the final chapter goes into fine tuning, and it's the same sort of thing: going through the basics of how fine tuning works and letting you fine-tune something, not necessarily to make you an AI engineer or machine learning engineer, but just enough that you appreciate the complexity of fine tuning, and also that you don't teach a large language model through fine tuning; you just move the weights and balances in favor of your context. So that's what part three is: taking it to that next level as well.
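
To give a feel for what fine tuning consumes in practice, here is a hedged sketch of training examples in the JSONL chat format that OpenAI-style fine-tuning jobs accept; the domain, rules, and values are invented for illustration.

```python
# Sketch of fine-tuning training data in the JSONL chat format accepted by
# OpenAI-style fine-tuning jobs; the domain, rules, and values are invented.
import json

training_examples = [
    {
        "messages": [
            {"role": "system", "content": "You help test an insurance claims system."},
            {"role": "user", "content": "Suggest boundary values for a claim amount."},
            {"role": "assistant", "content": "0.00, 0.01, 9999.99, 10000.00 (single-claim limit), 10000.01."},
        ]
    },
]

with open("fine_tune_data.jsonl", "w") as f:
    for example in training_examples:
        f.write(json.dumps(example) + "\n")
```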

[00:31:04] Joe Colantonio Love it. So this is a must-get-your-hands-on resource, and we'll have a link for it in the show notes. But Mark, before we go, is there one piece of actionable advice you can give to someone to help them with their AI testing efforts? And what's the best way to get our hands on your new book, Software Testing with Generative AI?

[00:31:20] Mark Winteringham You can get a copy of my book on Manning.com. It's still in early access, so that means if you purchase a copy now, you get access to the digital copy now, and then once the book is finished, you'll get a print copy sent to you if you choose that option. What's really useful as well is that you can actually read the book, drop comments in, and give me feedback; I do read it, I do react to it, and I factor it into the book. My single bit of advice is basically: don't be afraid to experiment with these tools. Don't be afraid to try out different techniques and different approaches and to tweak prompts. That's where my journey started with this: just trying to get it to generate some SQL for me, and it went horribly wrong. But it's from there, from that experimentation, that you learn more, and you can practice and get more engaged. And it's not something that's for the technically minded only. It is accessible to everyone, and that's why it's so popular.

[00:32:19] Thanks again for your automation awesomeness. Links to everything we covered in this episode are at testguild.com/504. And if the show has helped you in any way, shape, or form, why not rate it and review it in iTunes? Reviews really help in the rankings of the show, and I read each and every one of them. So that's it for this episode of the Test Guild Automation Podcast. I'm Joe, and my mission is to help you succeed with creating end-to-end, full-stack automation awesomeness. As always, test everything and keep the good. Cheers.

[00:32:55] Hey, thanks again for listening. If you're not already part of our awesome community of 27,000 of the smartest testers, DevOps, and automation professionals in the world, we'd love to have you join the fam at TestGuild.com. And if you're in the DevOps, automation, or software testing space, or you're a test tool provider, and you want to offer real-world value that can improve the skills of or solve a problem for the Guild community, I'd love to hear from you. Head on over to testguild.info and let's make it happen.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}