Mobile Mastery: Blending AI with App Testing with Dan Belcher

By Test Guild

About This Episode:

Today, Dan Belcher, co-founder of Mabl and a former product manager for monitoring and logging at Google, joins us as we explore the cutting-edge intersection of AI and mobile test automation.

We'll delve into the growing trend of AI testing in business and discuss the critical switch from deterministic to non-deterministic testing protocols when dealing with AI language models.

Get ready for an exciting announcement as Dan unveils the launch of Mabl for mobile. This groundbreaking tool will empower practitioners to validate their iOS and Android apps with unprecedented efficacy, revolutionizing how we approach mobile testing.

Plus, we'll tackle the pressing questions about the shift from custom frameworks to vendor-based solutions, the increasing need for testing from the user's perspective, and the potential roles AI could play in optimizing the end-user experience.

So, charge up your headphones and prepare for a deep dive into how AI's ability to understand context reshapes test automation.

To see AI with mobile testing in action, check it out:

About Dan Belcher


Dan Belcher is a co-founder at mabl. Prior to mabl, Dan was a product manager for monitoring and logging at Google, which he joined through its acquisition of Stackdriver, a company he also co-founded. He spent his early career working on infrastructure and ops tools at Microsoft and VMware. Dan combines his technical experience with a broad and continuously updated view of the market to guide his product development teams.

Connect with Dan Belcher

Rate and Review TestGuild

Thanks again for listening to the show. If it has helped you in any way, shape, or form, please share it using the social media buttons you see on the page. Additionally, reviews for the podcast on iTunes are extremely helpful and greatly appreciated! They do matter in the rankings of the show and I read each and every one of them.

[00:00:00] In a land of testers, far and wide they journeyed. Seeking answers, seeking skills, seeking a better way. Through the hills they wandered, through treacherous terrain. But then they heard a tale, a podcast they had to obey. Oh, the Test Guild Automation Testing podcast. Guiding testers with automation awesomeness. From ancient realms to modern days, they lead the way. Oh, the Test Guild Automation Testing podcast. With lutes and lyres, the bards began their song. A tune of knowledge, a melody of code. Through the air it spread, like wildfire through the land. Guiding testers, showing them the secrets to behold

[00:01:00] Oh, the Test Guild Automation Testing podcast. Guiding testers with automation awesomeness. From ancient realms to modern days, they lead the way. Oh, the Test Guild Automation Testing podcast. Oh, the Test Guild Automation Testing podcast. With lutes and lyres, the bards began their song. A tune of knowledge, a melody of code. Through the air it spread, like wildfire through the land. Guiding testers, showing them the secrets to behold.

[00:01:58] Joe Colantonio Hey, it's Joe, and welcome to another episode of The Test Guild Automation Podcast. And I'm really excited about today, because we have joining us once again Dan Belcher, to talk all about modern versus open-source test automation. We had him on the show two years ago, so I want to catch up on AI, where we are on mobile testing, a whole bunch of things. If you don't know, Dan is a co-founder at Mabl, which is a unified test automation platform for delivering modern software quality. Prior to Mabl, Dan was the lead product manager for monitoring and logging at Google. So he really knows this stuff, and I'm really excited to have him join us today to talk a little bit more about where we are, where we're heading, and to learn what he's learned over the years with AI since they originally came out with their AI-based tool. You don't want to miss this episode. Check it out.

[00:02:43] Joe Colantonio Hey, Dan, welcome back to The Guild.

[00:02:45] Dan Belcher Hey, it's great to see you, Joe. Appreciate you making the time for me. Also appreciate your work in this community.

[00:02:54] Joe Colantonio Well, I appreciate that. Thank you so much. And I'm really excited to have you back on the show. I originally had you on, I think, in 2017, right when Mabl was just coming out of the gate, and then we had you on the show in like 2022, and this was all before ChatGPT really took off. I think ChatGPT took off in November of 2022. Can you just talk about where we are from where we last talked, like how things have changed? Are you surprised by how things have changed, or have you seen this coming all along since you originally launched in 2017?

[00:03:26] Dan Belcher Oh yeah, I'd love to say that I saw the birth of large language models coming and changing the world. But we did see, even in 2017, you'll remember, that we founded this company to bring AI to software testing. We just didn't expect how many different types of AI would be involved. If we fast forward from 2022 to now: when we talked in 2022, it was kind of in the thick of Covid, figuring out the new world. Digital transformation was everywhere. Everybody was trying to figure out how to get their businesses online when people weren't shopping in stores and going to their local banks. And so in testing, there was a huge focus in that moment on supporting digital transformation, accessibility, performance. Those were the topics of the day. It's not that those aren't important anymore, but I think 2023 was all about cost savings, driving more productivity, consolidation, and so forth. And of course, 2024 so far is AI, AI.

[00:04:32] Joe Colantonio Yeah, absolutely. Even my new intro was created by AI. All I said was, write a song about the Automation Guild, and do it in a medieval type of style, and bam, it spit that out. So it's pretty crazy what it can do. It seems like it's handling things that we didn't think it was going to handle. We thought it would handle more of the repetitive stuff, but it's almost handling things that are more creative as well. How does that impact how you develop your tools, which you started before, I guess, ChatGPT? Do you have your own language model you've been working on that handles this all for you, and how has it evolved since the big buzz came out with ChatGPT?

[00:05:11] Dan Belcher Yeah, I think test automation is a very complicated thing. I don't think, outside of our space, people really appreciate how difficult the problems are that we're solving. I think it's akin to self-driving cars in some ways. You need the AI to figure out what to do, and then you also have to automate it. And those are both big, hairy problems to solve. So for us, we started with machine learning, using machine learning to define the right set of selectors to find UI elements and drive the tests. Then we moved to expert systems: big, massive, complex heuristics and rules-based systems that can inform the automation as it's out there on the roads, if you think of that analogy, to make decisions based on what it sees in real time. And then more recently, we integrated Gen AI to supercharge what we call the autonomous agent, the thing that's actually executing the tests.

[00:06:13] Joe Colantonio Awesome. Yeah. So let's dive a little more into the Gen AI piece. A lot of testers feel more familiar with keyword-driven or low-code/no-code solutions. I've been seeing Gen AI in a lot of different tooling. How is this different than what maybe older automation engineers are more familiar with?

[00:06:29] Dan Belcher Yeah, I think the core thing is, and this gets to that debate about scripts-based solutions versus intelligent solutions.

[00:06:38] Joe Colantonio Yes.

[00:06:39] Dan Belcher At the core, when we're out running a test and you want to fill out the sign-up form, right, in a script-based solution, typically you need to have a fixed way of identifying each of the fields that you want to interact with. Let's say that's a string, an ID, or something like that, right? Historically, when we use this expert system, we would look at many different attributes of the field, and what's above it and below it and around it, and its location and CSS and all that kind of stuff together. And as long as the strings match on some of those things, then we have algorithms that will decide whether it's still the same field or not. But until the large language models came along, we didn't know that the word save means kind of the same thing as submit, right? Or out of the office means the same thing as away from work. And so it's that type of language understanding that really has come to bear in making our ability to execute tests without dealing with the flakiness or failures, when you change the UI, that you might have seen historically with script-based solutions. Does that make sense?
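The multi-attribute matching Dan describes can be sketched in a few lines. This is a hypothetical illustration, not Mabl's internals: the attribute names, weights, and the hard-coded synonym table (which in practice would be an LLM's language understanding) are all assumptions for the sake of the example.

```python
# Hypothetical sketch of multi-attribute element matching in the spirit of the
# "expert system" described above. Attribute names, weights, and the synonym
# table are illustrative, not Mabl internals.

# Equivalences an LLM could recognize; hard-coded here for illustration.
SYNONYMS = {("save", "submit"), ("out of the office", "away from work")}

def labels_match(a: str, b: str) -> bool:
    """Exact match first, then fall back to semantic equivalence."""
    a, b = a.lower().strip(), b.lower().strip()
    if a == b:
        return True
    return (a, b) in SYNONYMS or (b, a) in SYNONYMS

def score_candidate(recorded: dict, candidate: dict) -> float:
    """Score a live element against the one recorded at authoring time,
    weighing several attributes instead of relying on a single fixed ID."""
    weights = {"id": 0.4, "css_class": 0.2, "position": 0.1}
    score = sum(w for attr, w in weights.items()
                if attr in recorded and recorded[attr] == candidate.get(attr))
    if labels_match(recorded.get("label", ""), candidate.get("label", "")):
        score += 0.3  # language understanding rescues renamed buttons
    return score

def find_element(recorded: dict, page: list) -> dict:
    """Pick the most plausible candidate on the current page."""
    return max(page, key=lambda c: score_candidate(recorded, c))
```

With this, a button recorded as "Save" whose ID later changed can still be matched to a "Submit" button on the live page, because the CSS class and the label's meaning still line up even though the ID does not.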

[00:07:52] Joe Colantonio Yeah. So the model is able to make inferences, like, I know I'm looking for a log in, but it's probably a sign up now, where it's able to know at runtime how to modify itself based on how the application may have changed.

[00:08:05] Dan Belcher Yeah. Yeah, absolutely. And it's particularly good at cases where our existing machine learning and expert systems might be able to narrow it down. Say there were five images on this page. We used to know the image by the ID that we wanted to use in the test, and that ID changed, but the image file name still means roughly the same thing. So what our existing AI will do is grab all of the images and narrow down to the set of images that are candidates. And then sometimes we can tap the Gen AI and say, all right, which image did it mean, based on the original image?

[00:08:46] Joe Colantonio All right. So I think this is the future, maybe I'm wrong: multimodal AI, where it's able to just work image-based so you don't even have to worry about selectors anymore. Am I overhyping what it already can do, or is this a roadmap you see for where we're going with this type of automation?

[00:09:00] Dan Belcher Yeah, getting back to the self-driving car scenario, I think we're going to end up with the analogy of all the sensors coming together to produce the intelligence. The visual sensor is one thing. Being able to look at the DOM is another thing. Being able to look at additional context is yet another thing. And those all feed together to drive the intelligence of the system. I don't think that visual in and of itself will be enough, but it will be a huge step forward for a lot of different use cases that are kind of hairy in test automation.

[00:09:36] Joe Colantonio Absolutely. I was just speaking with someone earlier today who wrote a book on Selenium, and one of the points that came up was why some people like Selenium: it's so flexible. You can start from scratch, create your own framework, and you know what's going on. But at the enterprise level and in different types of situations, I think vendor-based solutions are probably more appropriate, though people kind of feel like they'd lose out on that flexibility. How do you overcome that challenge? Do you ever get that kind of pushback when someone says, okay, AI, low code, no code, how do I know what it's doing? How can I trust it?

[00:10:11] Dan Belcher Yeah, I've been dealing with this in my career for 25 years. There's a lot of, I think, comfort and interest in building your own stuff. We're engineers. We like to build things. And when we develop domain expertise in certain areas, we know we can create a lot of value there. The question that I often ask is just, is this where your company should be investing that engineering talent, to your competitive advantage? Take, for example, email administration. 20 years ago, we would say, it's a very important thing for our company, we should be managing our own email. And now we don't even think about that. I do think we want to get to a world where you don't have hundreds of thousands of engineers trying to figure out how to reliably click a button in a UI, or building out frameworks that do a lot of the same things based on these open source packages. I think the levels of abstraction will rise so that we can each focus on the things that are unique to our business.

[00:11:26] Joe Colantonio Yeah, it's interesting. We started off as testers, and then we got into automation to replace testers, and then open source was going to replace vendors. And then vendors embraced open source. And it just seems like we're going back now. Also, the big thing was, oh, we're developers now. We develop code. And it kind of got away from testing almost. It almost sounds like, with AI, we're going back to: we're testers, and we're being assisted by AI, so we can focus more on the testing rather than the coding or the geeky tech aspect of it. Is that something you agree with or see?

[00:12:02] Dan Belcher I see it maybe a little bit differently, but similar. The one thing I'd say is, we saw this in other categories like observability, monitoring, logging, all that kind of stuff.

[00:12:13] Joe Colantonio Yeah.

[00:12:14] Dan Belcher I think what happened was you had vendor solutions that came in around the turn of the century. Usually, they ended up being part of HP Enterprise. But you had those solutions that were actually pretty compelling and mature, but built for a waterfall, kind of static IT type of world. And DevOps and cloud came in and blew everything up. And so now we have a new way of working, building, and shipping software. And the vendor solutions couldn't really make that transition. And so open source played a critical role in filling the gap. For the next 10 years, when you moved from waterfall to agile, Selenium was the only way to do it. But now I think innovations in AI and cloud and all these kinds of things are coming together to change that equation.

[00:13:13] Joe Colantonio All right. So that's a good point then, sometimes people equate vendor with not agile.

[00:13:18] Dan Belcher Yeah.

[00:13:19] Joe Colantonio So that's obviously a myth. So maybe you can talk a little bit more about that. You started in 2017, so you started with DevOps from the beginning, from the ground up. You built this tool, it sounds like, with DevOps in mind, with AI in mind, before a lot of other companies. So maybe talk a little bit more about that.

[00:13:34] Dan Belcher Yeah. Yeah. Well, we knew from the beginning that for automation to work in a DevOps context, it has to be reliable at scale, and fast, and fit into a team's kind of modern workflow. And to my point before about the analogy to self-driving cars, that's just a really hard problem. And it's a hard problem for us, with tens of millions of dollars of R&D investment and PhDs on the task. I can't even imagine how hard that is for a normal enterprise that has a much smaller team, and not nearly the patience to spend 7 to 10 years focusing on that problem and only that problem. So I think that's really been the key: it takes a lot of really hardcore computer science to deliver automation that's reliable at scale, when you're talking about millions of test runs a month, making sure that you're not stopping the trains, and it can deal with that kind of scale and velocity together.

[00:14:41] Joe Colantonio Absolutely, that's a good point. When I started, once again, it was just a small group of people that were part of testing. But nowadays, like you said, we need to deliver software quicker, faster, higher quality, so everyone kind of has a hand in it. That's one of the downfalls of maybe using some of these tools, because it takes a developer-tester mindset. So how can a solution that uses more of a low-code approach help someone that's more like a business owner, or get product owners involved as well, to help contribute to the whole testing lifecycle?

[00:15:13] Dan Belcher Yeah. Whether it's testers or product owners or developers, the key is that these tools help them focus on the software from a user's perspective, and not a code perspective or DOM perspective. Different companies take different approaches to that, whether it's developers or product owners or QA who kind of own validating the critical user journeys, or the critical API transactions, or what have you. But the key is, what we're seeing is, if your average team can spend 90% less time just figuring out how to automate the flow, testing the flow, they can spend the time that they reclaim focused on, well, let me make sure that I have truly reliable coverage at scale, and that it's fast, and that it's accessible, and that we're testing more scenarios, and what have you. I think that's really the power: taking the complexity out of the thing that really is adding the value for your business, which is, how do I make these API calls, or how do I drive this mobile app, or what have you, and focusing more on, what do I need to validate from a user's perspective? That's the magic.

[00:16:33] Joe Colantonio Absolutely. Obviously it's always about the end user, but there seems to be an even bigger push for the end-user experience nowadays. We talked a little bit about post-Covid and all that. Is that one of the reasons why maybe there's been an even bigger push now for developing for users?

[00:16:49] Dan Belcher Oh, for sure. I mean, look at what's happened in the financial services industry: great banks that built their brands on the customer service in the branch now get to see their customers in person a fraction of the time. And they're figuring out, how can I deliver a quality of service that supports my brand from a mobile app, like the one customers are used to when they're dealing with people in person? The bar for what that experience looks like is leaps and bounds above what we saw pre-Covid.

[00:17:24] Joe Colantonio That's a good point too, because I'd heard about mobile for a long time, but I never really saw it take off until the pandemic, when everyone had to go mobile-first. So now everyone's interested in it, and with that comes challenges. Have you seen any from your perspective, working with a lot of different companies? Are there unique challenges to mobile app testing that you really didn't see with traditional web applications, that make it more difficult?

[00:17:52] Dan Belcher Yeah. Just the process of testing new builds across operating systems and devices in mobile, the pain is incredible, because you'll find a tester who's been doing manual testing who has dozens of devices sitting on their desk. And just to get to the point where you want to validate some test case, you have to take the device and make sure you have the right operating system installed. Then you install your latest build and get that going, and then you can go through the flow for that one. But now you have all of the different iterations of the operating system and all the different versions of the devices and so forth. And so that pain, I think, is much greater than what it would be to do it in the browser, where you have the kind of abstraction that Chrome and Chromium give you away from the operating system and the device. That's been just an incredible challenge for people in the past.

[00:19:00] Joe Colantonio Now, there have been solutions to handle this. I'm just curious to know, how can you apply AI to help in mobile testing? Where would the AI piece come in, rather than just being a buzzword? Can it really help, and where would it help with mobile app testing specifically?

[00:19:13] Dan Belcher Yeah, yeah. So with AI, we see, first and foremost, that the automation challenges are even greater. I already talked about the setup before you can do the test, but the kind of underlying code, the DOM, if you will, in the mobile context, has even less information available to identify elements correctly than we see in the browser. And so AI becomes more important. And so, just as we do in the browser context, we use machine learning and probabilistic models to define the right set of selectors when we create the mobile tests. We also think that the multimodal large language models with vision are going to be more impactful on the mobile side, because, as I mentioned, there's kind of less information behind the scenes in a mobile app to drive the test than you have on the web.

[00:20:13] Joe Colantonio Nice. Obviously, AI takes computing power. Just so people know, we're not degrading open source solutions, but is that one of the reasons why you don't see a lot of open source solutions that are AI-powered, because it does take some sort of computing power behind it? Like Appium: why can't I just use Appium? I know you could probably do some sort of visual algorithm on it, but I'm not sure it could be through AI.

[00:20:37] Dan Belcher Yeah, I think of it differently: it's definitely not open source versus vendor in this context. It's more about an automation solution versus a framework. What Appium gives us is a way that we can express the things that we want to do as we execute a test. It doesn't give us the intelligence to know what those things should be. And it especially doesn't help us figure out what happens when you're actually out there in the wild. In my autonomous car analogy, if you write an Appium script, it's like your old MapQuest directions. It's, okay, here's the ideal set of things that we should do. But when you actually get into the app and something isn't doing exactly what I expected based on the underlying code, then historically, we've just failed the test. But the reality is, just changing the ID on a button doesn't change the user experience. And so what we expect is that the system, just like the user, is intelligent enough to still click the button. And those types of things do, to your point, require computing power, to be able to observe what you see as you're driving down the road, and process that information, and make decisions. And that's what intelligence really is: being able to process the data and then make decisions.

[00:22:04] Joe Colantonio Right. Speaking of those decisions, some people say, I don't want AI making those decisions. But when it does make a decision, does it let the end user know, like, hey, you expected this, we did this, is that okay? Like, how do you okay the AI to go out and do its thing?

[00:22:19] Dan Belcher Yeah. Well, I think in our world, especially in high-velocity software teams, we are going to a place where AI is going to have to be able to make those decisions, because there is just not enough time to have a human in the middle; human involvement couldn't process it fast enough. What the humans can do is give the AI the direction up front about the conditions under which it should make the decisions. In the previous analogy, you could say, don't take the toll road, right? Or, if you encounter construction, stop there. The area that I think is just fascinating right now, and we're spending a ton of time with people on it, is how do you test the AI? We have a lot of customers that are building non-deterministic AI-based systems. And they're coming to us to say, I couldn't script this if I wanted to, because I'm not getting back the same response every time I talk to the AI. I can't just say fail if it isn't exactly this. How do you help us with that?

[00:23:19] Joe Colantonio How do you help them with that? Because actually, someone asked me about a health care application where they're using an LLM, and they're like, how do we even test it?

[00:23:25] Dan Belcher Yeah.

[00:23:26] Joe Colantonio How would they test that? Because I've been asked this question a few times.

[00:23:30] Dan Belcher Yeah. Very carefully, Joe. I think the core for many use cases, and it's going to sound crazy, but it is, I believe, strongly correct: in many use cases, you have to use Gen AI to test Gen AI. I'll give you an example. Kayak has an Ask KAYAK AI agent. And what you do is you go to Ask KAYAK and you say, I'd like to go skiing in July, and Ask KAYAK will recommend places to go. And maybe you have 7 or 10 different options. ChatGPT, I think, gives them the options to convey to me. Well, those options are different every time I search for skiing in July. They might be the same, but they're in a different order, the descriptions are different, all that kind of stuff. And so the key is, I need to know, are they giving me locations that are relevant to skiing, and can you ski there in July? I can't script that. All that I can do, and you can do this in Mabl easily, all that I can do is turn around and say, ChatGPT, here's a list of responses to this question about skiing in July: are these all good places to ski in July? And if ChatGPT says yes, I pass. And then I might ask Anthropic the same question to get double confidence. And then if they say yes, I pass.
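The pattern Dan describes is often called LLM-as-judge. A minimal sketch follows; the prompt wording is illustrative, and the judge is injected as a plain function so a real model client can be plugged in, while the example below uses a stub. None of these names are Mabl or OpenAI APIs.

```python
# Hypothetical sketch of the "use Gen AI to test Gen AI" pattern: hand the
# non-deterministic output to one or more judge models and pass only if all
# of them approve. Prompt wording and function names are illustrative.

def judge_prompt(question: str, responses: list) -> str:
    """Build the yes/no question we put to each judge model."""
    joined = "\n".join(f"- {r}" for r in responses)
    return (f"A travel assistant was asked: {question!r}\n"
            f"It recommended:\n{joined}\n"
            "Answer strictly YES or NO: are all of these reasonable answers?")

def assert_llm_approves(question: str, responses: list, judges: list) -> None:
    """Pass only if every judge answers YES, giving the double confidence
    Dan mentions (ask one model, then ask a second model the same question)."""
    prompt = judge_prompt(question, responses)
    for judge in judges:
        verdict = judge(prompt).strip().upper()
        if not verdict.startswith("YES"):
            raise AssertionError(f"Judge rejected responses: {verdict}")
```

In practice each entry in `judges` would wrap a call to a model API; the test case stays stable even though the assistant's wording and ordering change on every run, because the assertion is about meaning, not exact text.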

[00:25:00] Joe Colantonio All right. So you're using ChatGPT to test ChatGPT. Is this a common use case then? Are people using Mabl for that in ways maybe you weren't expecting?

[00:25:11] Dan Belcher What I'd say is, six months ago, people talked about testing AI as a long-term problem. And then four months ago they started saying, wait a minute, this is a 2024 problem. And now I see them doing it. I wouldn't say that I have hundreds of customers doing this, but for sure we have customers who are doing it, because they have a board-level, CEO-level, down-to-the-CTO directive that they need to move full speed ahead on AI, and this is the only viable answer that they have right now.

[00:25:44] Joe Colantonio How does the test know what's true, then? If you have ChatGPT testing ChatGPT, how do you know that it's testing what you think it's testing when you set out to test it?

[00:25:54] Dan Belcher Yeah. Well, think of my example again. It's not necessarily ChatGPT; it's the output of a generative AI large language model. And you would still have a test case about what you expect the response to be. But the test case, unfortunately, has to be less deterministic, because the response is non-deterministic. So in my example with KAYAK, the test case is to verify that the location or locations that they recommend are appropriate for skiing, and to verify that those locations have snow in July. And the issue is, for me to actually have 100% certainty around that, I would need to validate potentially thousands of locations across both of those dimensions. And the large language models can do that actually pretty effortlessly.

[00:26:56] Joe Colantonio All right. So it's kind of a mindset shift, because back in the day, once again, you can see all the gray here, it was deterministic. I had a raised floor. I had servers under my control. I knew everything they were talking with. And I was able to say, okay, we tested completely, we know what's happening. And now we've moved to getting more comfortable with the non-deterministic world. We're in the cloud. We're using third-party services. Things are happening in the wild. We have no control over it; it's just the way it is. So is that almost what it is? We are moving from a deterministic kind of testing to getting more comfortable with a non-deterministic type of testing.

[00:27:29] Dan Belcher Yeah, yeah. At least for a set of the test cases, there's just no avoiding it. And it's important. In my KAYAK example, I'll say it's a beta product, but we've seen multiple hallucinations, where they'll say August is a fine time to go snowmobiling in Lapland, Finland, when in fact it's in the 60s. Or they'll say July is not a good time to ski in this area of France, even though people do ski in that area of France, because there's a big glacier, right? So there are areas where we see these inaccuracies. And I think the large language models will play an important role in helping us ensure quality, even in this strange new AI world.
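Another way to frame the mindset shift: even when the response text is non-deterministic, the test can assert deterministic properties of it rather than exact strings. A toy sketch, where the hard-coded resort set is an illustrative stand-in for the "appropriate for skiing" and "snow in July" fact checks (which, as Dan says, an LLM could perform at scale):

```python
# Toy sketch of property-based checks over a non-deterministic response:
# assert what every valid answer must satisfy, not what it must literally say.
# The resort set is illustrative stand-in data, not a real fact source.

SKI_RESORTS_WITH_JULY_SNOW = {"Zermatt", "Portillo", "Mt. Hutt"}

def valid_recommendation(place: str) -> bool:
    """Property check: is this a place where you can actually ski in July?"""
    return place in SKI_RESORTS_WITH_JULY_SNOW

def failing_recommendations(places: list) -> list:
    """Return the recommendations that violate the properties;
    the test passes only when this list is empty."""
    return [p for p in places if not valid_recommendation(p)]
```

The same assertion then holds whether the model returns three resorts or ten, in any order, with any phrasing, which is exactly what an exact-match script cannot do.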

[00:28:16] Joe Colantonio Yeah, absolutely. Going back to mobile testing: a lot of times, once it goes into production, it's usually out of your hands. But now I've been seeing AI that's able to kind of predict user behavior, or in real time say, oh, this feature that you just released is causing issues, let's roll it back automatically. Do you see that playing a part in being able to improve performance and end-user interactions?

[00:28:40] Dan Belcher Yeah, I see it coming. And I do think that in the AI realm, we're at a moment where the world is going to shift a little bit, from using Gen AI to get information to using Gen AI coupled with automation to drive actions. And I think test automation is the best example of that I know right now. But I think release pipelines are really interesting there as well. We need to get a little bit better at the AI understanding the risk and change in releases, to be able to make recommendations like, we should roll this back, or stop the release. But once those recommendations are there, coupling them up with the automation to say, proceed with the deployment, or roll back this release, isn't even the hard part, I think. That machinery is not as complicated as the machinery we're talking about in test automation. It's actually the AI techniques that I think aren't quite there yet.

[00:29:39] Joe Colantonio Gotcha. Very cool. So it sounds like two things: open source coexisting with low code, and also deterministic testing coexisting with non-deterministic testing.

[00:29:50] Dan Belcher Yeah. Yeah. I think they're both really important. I have to say, Appium, Playwright, and Postman play a very, very important role in enabling the automation of browsers and mobile apps and APIs, but they are frameworks. Delivering an automation solution requires a whole stack of other data, intelligence, infrastructure, and so forth. The question is, do you want to build that and manage it on your own, or do you want to use a packaged or SaaS product like Mabl?

[00:30:27] Joe Colantonio Absolutely. Yeah. So speaking of SaaS, when you first rolled out, I think it was just for web apps. Did you make an announcement recently that you now fully support mobile as well?

[00:30:35] Dan Belcher Today.

[00:30:37] Joe Colantonio Wow. Nice.

[00:30:37] Dan Belcher Today it's general availability of Mabl for mobile. It's been an incredible journey to get here. And so now our customers can validate native iOS and Android apps without writing code, alongside their browser and API tests. And there's a lot of great interaction between people doing API tests and mobile tests together as part of the same suites and plans and so forth.

[00:31:06] Joe Colantonio How does that work then? Is Mabl able to know you're making an API call? Like, how does integrating mobile with the API together work?

[00:31:14] Dan Belcher Yeah. There are a bunch of different use cases. One is, I have a set of services that my mobile app interacts with. And if I have API tests running alongside my mobile tests for those, that makes it really easy to isolate issues when there are failures. And I can also isolate latency, so I can see if my API latency has changed, and what impact that has on the end-user experience. The other side of that, of course, is it is almost always more efficient to use an API test to do setup and teardown: let me reset some data, or ensure that all the changes that I just made in mobile were effectively stored in the backend, and that sort of thing. In both of these cases, to me, the exciting thing is we're bringing a lot of people into automation that haven't been able to participate historically. So creating API tests without writing code, going from manual to automated on the mobile side without figuring out all the complexity of that, that's pretty inspiring stuff.
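The setup/verification/teardown pattern Dan describes can be sketched in pseudocode form. Everything here is hypothetical: `ApiClient` is a toy in-memory stand-in for a real REST client, and `run_mobile_flow` stands in for the UI steps a real test would drive; none of it is a Mabl API.

```python
# Hypothetical sketch of API tests wrapped around a mobile UI flow:
# fast API calls do the setup, backend verification, and teardown,
# while the slow UI automation covers only the user-facing steps.

class ApiClient:
    """Toy in-memory backend standing in for a real REST client."""
    def __init__(self):
        self.store = {}
    def seed(self, key, value):    # setup: create test data via the API
        self.store[key] = value
    def fetch(self, key):          # verification: read backend state
        return self.store.get(key)
    def delete(self, key):         # teardown: reset the data
        self.store.pop(key, None)

def run_mobile_flow(api, key, new_value):
    """Stand-in for the UI steps; a real test would drive the app here."""
    api.store[key] = new_value  # pretend the app wrote through to the backend

def test_profile_update():
    api = ApiClient()
    api.seed("user:42:name", "Old Name")              # API setup
    run_mobile_flow(api, "user:42:name", "New Name")  # mobile UI flow
    assert api.fetch("user:42:name") == "New Name"    # API verification
    api.delete("user:42:name")                        # API teardown
```

The design point is the one Dan makes: the API legs are cheap and reliable, so using them for setup, backend verification, and cleanup keeps the fragile, expensive part of the test (the UI flow) as short as possible.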

[00:32:19] Joe Colantonio I know some teams have a test suite that runs against a web browser, but it's the same kind of thing on a mobile device. If someone's already using Mabl and has a web test suite, and now they need to run against a mobile device, is there some sort of conversion, or is that still too early with this new release?

[00:32:35] Dan Belcher Yeah, it's still early for that. A lot of our users' journeys are pretty different between mobile and web at the very highest level. We'd like traceability back to feature areas, but...

[00:32:48] Joe Colantonio Right, right. But you're using AI. You can figure that out.

[00:32:52] Dan Belcher Well, JetBlue is a great example. A lot of them have very consistent experiences between web and mobile. But that hasn't been our focus. The one other area that I think is a really exciting focus is with Gen AI, not just testing Gen AI, but these multimodal models in Gen AI make a bunch of capabilities possible that we never thought would be possible in an automated way. You can imagine, since I mentioned JetBlue, where they've launched a bunch of new international routes and localization for those locations, it will soon be possible for me to say, when I'm booking a flight to France, make sure that all the text on this page is in French and none of it's in English. And that could be one prompt, not a thousand rows of test data.
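The single-prompt localization check Dan envisions might look something like the sketch below. This is purely illustrative: the prompt wording is hypothetical, and the model call is a naive stub standing in for a real multimodal LLM integration.

```python
# Hypothetical sketch: a one-prompt localization assertion. A real version
# would send the page text (or a screenshot) to a multimodal model; here
# the model is stubbed so the structure of the check is runnable.

def build_localization_prompt(page_text, expected_language="French"):
    # The single prompt replacing "a thousand rows of test data".
    return (
        f"You are validating a localized page. Expected language: {expected_language}.\n"
        f"Reply FAIL followed by the offending phrases if any text is in English; "
        f"otherwise reply PASS.\n\nPage text:\n{page_text}"
    )

def fake_model(prompt):
    # Stand-in for an LLM: naively flags known English phrases for the demo.
    page = prompt.split("Page text:\n", 1)[1]
    english = [w for w in ("Book now", "Search flights") if w in page]
    return "FAIL " + "; ".join(english) if english else "PASS"

def check_localization(page_text, model=fake_model):
    """Return True if the (stubbed) model judges the page fully localized."""
    return model(build_localization_prompt(page_text)).startswith("PASS")
```

The test's intent lives in one natural-language prompt, and the model does the linguistic judgment that would otherwise require exhaustive per-string test data.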

[00:33:49] Joe Colantonio Oh my gosh! I used to work for a health care company, and we served different areas in different countries. We'd have to have an Excel spreadsheet certified by someone saying, yes, this is how it should look, and then we'd have to change the language on our machines and check it ourselves. So that's an excellent use case. That is a game changer, I would think.

[00:34:07] Dan Belcher Yeah. There's a laundry list of those things where integrating Gen AI in the right way will make possible a lot of things that, if you had asked me two years ago, would this be possible with automation in the next two years, I'd have said no way. And we're weeks away from that in a bunch of different cases.

[00:34:24] Joe Colantonio All right. So we talked about the visual aspect of AI understanding what fields and buttons are. Is it also context aware? Like, hey, this is the login screen, I know what kind of data I need to use to log in or to add a new patient. Are we able to get to the place where it actually knows where it is in the application and what kind of data to use, or are we still having to go in and manually feed it all the data?

[00:34:46] Dan Belcher Yeah, I'd say that the systems are getting increasingly powerful in that area, especially with large language models understanding the context and providing contextually relevant input. You can do that today with Mabl through an API-level integration with some of these models. But on the data topic, the one piece of advice that I'd have for practitioners, and especially leaders, is to think about what you can do with the data from testing. You have access now to all of these screenshots, you have the DOM of your app over time, you have the latency information for all of your pages. A good vendor like Mabl will make all of that data available to you, and you can think about how you can add value to your team's efforts through analysis of that data, even outside of the automation tool itself. We're really excited about Gen AI solutions in that realm.

[00:35:44] Joe Colantonio Can you give some examples? Because people might be wondering what that means. You have all this data, and you can use AI to interpret it and give you insights about other things. For example?

[00:35:57] Dan Belcher Yeah. Yeah. I think one great example is, let's correlate the information on where defects are impacting end users. I know where tests are failing, I know what those pages are, and I have my defect information. Let's join those together to provide some insights around where we need more control or training or what have you in our team's efforts. Another example could be, in general, are we getting better or worse over a longer period of time in terms of latency? The testing tools don't always have that full picture of what latency looks like. But if you correlate your observability data and your testing data, you can see things like, where exactly did we get slower, and in which systems? I think the key is understanding that if we're doing the right job, the testing agents that are out there watching your app are observing all of these things: what does it look like, is it fast, is it accessible, and so forth. And then you have system-level agents that are observing the database calls, API calls, latency, and so forth. We should figure out a future where we can bring that data together and help you understand your team's performance or your app's performance at a much higher level.
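The first join Dan describes, correlating test failures with defect records by page, can be sketched in a few lines. The field names and ranking heuristic here are illustrative assumptions, not any particular tool's schema.

```python
# Hypothetical sketch: join test-failure records with defect records by
# page to rank where problems cluster. Input schema ({"page": ...}) and
# the simple combined-count ranking are illustrative only.

from collections import Counter

def failure_hotspots(test_failures, defects):
    """Rank pages by combined test-failure and defect counts, descending."""
    fail_counts = Counter(f["page"] for f in test_failures)
    defect_counts = Counter(d["page"] for d in defects)
    pages = set(fail_counts) | set(defect_counts)
    return sorted(
        ({"page": p,
          "test_failures": fail_counts[p],
          "defects": defect_counts[p]} for p in pages),
        key=lambda row: row["test_failures"] + row["defects"],
        reverse=True,
    )
```

Even a join this simple surfaces the insight Dan is after: which feature areas need more attention, training, or test coverage.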

[00:37:24] Joe Colantonio So it's almost like you're able to tap into OpenTelemetry, and you find not only that it failed, but that it failed because of this query. Then: let me automatically fix it for you, rerun the test, good to go. Maybe I'm going crazy, but do you see us getting to that point, maybe at a basic level?

[00:37:43] Dan Belcher Yeah, let me maybe clean up my confusing answer on that. An example of where I think we can do a better job of making use of the test data is when I'm running a performance test, I can see performance from an end user's perspective, from the test's perspective, as I'm executing my test. And then you have your APM tool over here that shows latency from the server side and where the bottlenecks are in your distributed system, at the database or API tier or what have you. We haven't done a great job of bringing that data together to give you that end-to-end view from the load test all the way down through the observability system to the logs, so that when there are failing transactions, you can actually see the relevant log in detail. That's one great example where the data is out there; we just need to do a better job of piecing it together between these different tools.
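The end-to-end view Dan describes amounts to joining client-side results with server-side traces on a shared trace ID. A minimal sketch, assuming hypothetical record schemas (real OpenTelemetry and APM data are richer than this):

```python
# Hypothetical sketch: enrich failing load-test transactions with their
# server-side trace and logs, joined on a shared trace ID. All field
# names are illustrative assumptions.

def correlate_failures(load_test_results, server_traces):
    """Return failing transactions enriched with server-side detail."""
    traces_by_id = {t["trace_id"]: t for t in server_traces}
    enriched = []
    for txn in load_test_results:
        if txn["status"] != "fail":
            continue  # only failing transactions need drill-down
        trace = traces_by_id.get(txn["trace_id"])
        enriched.append({
            "transaction": txn["name"],
            "client_latency_ms": txn["latency_ms"],       # end-user view
            "bottleneck": trace["slowest_span"] if trace else "unknown",
            "logs": trace["logs"] if trace else [],        # server view
        })
    return enriched
```

The design choice is simply to treat the trace ID as the join key, so each failing transaction carries both the end-user symptom and the server-side cause in one record.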

[00:38:43] Joe Colantonio All right, I'm going to wrap it up. But speaking of how far we've already leaped in two years, is AI appropriately hyped, overhyped, or underhyped when you look at the next five years?

[00:38:52] Dan Belcher One thing that I think is overhyped is test generation from plain language. People are using AI to output scripts that are going to be just as flaky and problematic as the ones we've seen historically. But the impact of Gen AI on the ability to express types of automated tests that we've never been able to do before, I think that's way underhyped. And the importance of AI in testing AI, I think, is equally underhyped. So I'm really excited about getting people hyped up a little bit on those AI capabilities.

[00:39:28] Joe Colantonio I'm hyped. Really excited, Dan. I'm sure you're going to have some new adventures and new developments over the years, and we'd love to have you back. But before we go, is there one piece of actionable advice you can give to someone to help them with their AI mobile automation testing efforts, and what's the best way to find and contact you and learn more about Mabl?

[00:39:46] Dan Belcher I'd say check out Mabl and sign up for a trial. I think you'll find it very easy to get started. In particular, check out the new native mobile app testing capabilities and the Gen AI features. I think you'll get really excited about what's possible using Gen AI in testing.

[00:40:06] Thanks again for your automation awesomeness. Links to everything we covered in this episode are available. And if the show has helped you in any way, why not rate it and review it in iTunes? Reviews really help in the rankings of the show, and I read each and every one of them. So that's it for this episode of the Test Guild Automation Podcast. I'm Joe, and my mission is to help you succeed with creating end-to-end, full-stack automation awesomeness. As always, test everything and keep the good. Cheers.

[00:40:40] Hey, thanks again for listening. If you're not already part of our awesome community of 27,000 of the smartest testers, DevOps, and automation professionals in the world, we'd love to have you join. And if you're in the DevOps, automation, or software testing space, or you're a test tool provider and want to offer real-world value that can improve skills or solve a problem for the Guild community, I'd love to hear from you. Head on over and let's make it happen.
