About this DevOps Toolchain Episode:
Today, we are honored to be in conversation with Eran Grabiner, a seasoned professional in the field of Product Management, currently serving as a Director at SmartBear. With his rich experience, including serving as CEO of the observability startup Aspecto, Eran brings a wealth of knowledge and insights to our discussion.
Learn more about Observability Meets AI: https://testguild.me/bugsnagai
In this insightful conversation, we dive deep into observability, exploring how developers use different tools and types of telemetry to monitor software behavior, the role AI plays in enhancing these processes, and how the landscape is evolving with the integration of advanced technologies.
Eran provides a glimpse into the future of observability, where AI-driven systems could revolutionize data collection and storage, potentially leading to significant cost reductions and efficiency improvements. He also introduces the intriguing concept of an AI observability copilot, a tool that could assist developers in complex tasks like debugging, all while maintaining a conversational interface. However, Eran also underlines the challenges that come with such advancements, such as data exposure and the need for context and long-term memory in AI models.
Throughout the episode, we emphasize AI's transformative power in development, its implications for developers' future roles, and the necessary guardrails to ensure data integrity and security.
Join us as we delve into these topics, navigating the pivotal shifts in software development and observability with expert insights from Eran Grabiner.
Try out SmartBear's BugSnag for free today. No credit card required. https://links.testguild.com/bugsnag
TestGuild DevOps Toolchain Exclusive Sponsor
SmartBear’s BugSnag: Get real-time data on real-user experiences – really.
Latency is the silent killer of apps. It’s frustrating for the user, and under the radar for you. It’s easily overlooked by standard error monitoring. But now SmartBear's BugSnag, an all-in-one observability solution, has its own performance monitoring feature: Real User Monitoring.
It detects and reports real-user performance data – in real time – so you can rapidly identify lags. Plus gives you the context to fix them.
Try out SmartBear's BugSnag for free today. No credit card required.
About Eran Grabiner
Eran Grabiner is the Director of Product Management at SmartBear, where he works on observability initiatives. His journey with SmartBear began after the acquisition of his observability startup, Aspecto, where he served as CEO. With a background that includes diverse roles in cloud-focused startups, Eran holds MSc degrees in both Physics and Electrical Engineering.
Connect with Eran Grabiner
- Company: SmartBear
- Twitter: https://twitter.com/GrabinerEran
- LinkedIn: https://www.linkedin.com/in/eran-grabiner-98760587
What is Observability?
Observability, in the context of software, is the ability to comprehend how it is functioning by examining external metrics. To draw a parallel, just as checking a person's vital signs provides insights into their health, collecting data on an application's performance enables developers to identify and resolve issues, thereby improving the overall performance of the application. SmartBear's focus on observability is a testament to the growing responsibility of developers to ensure the smooth operation of their applications in production.
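To make that concrete, here is a minimal sketch of what collecting such data can look like in code, using the open-source OpenTelemetry Python SDK (the vendor-neutral standard in this space). The service and span names below are illustrative assumptions, not details from the episode.

```python
# Minimal sketch: instrumenting one operation with the OpenTelemetry
# Python SDK. Service and span names are illustrative assumptions.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Wire up a tracer that prints spans to the console; a real setup would
# export to an observability backend instead.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout-service")

with tracer.start_as_current_span("process_order") as span:
    span.set_attribute("order.item_count", 3)  # one external signal
    # ... business logic runs here; duration and any errors are recorded
    # automatically when the span ends: the application's "vital signs".
```

Each recorded span is one of those vital signs, a small external measurement from which developers infer the system's internal health.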
The Rise of AI in Observability
Eran believes that AI will have both short-term and long-term effects on observability practices:
- Short-term: AI will enhance existing observability products by making data collection and storage more efficient, reducing organizations' costs (see the sampling sketch after this list).
- Long-term: Developers will shift from writing code to composing and validating AI-generated code blocks as software development evolves.
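That short-term efficiency work builds on decisions observability pipelines already make about what data to keep. Below is a rough sketch, assuming the OpenTelemetry Python SDK: today this is typically a static, hand-set ratio, whereas the AI-driven approach described above would tune such decisions dynamically based on which data turns out to be redundant.

```python
# Sketch of today's static approach: keep a fixed fraction of traces.
# The 10% ratio is an arbitrary illustration; the AI-driven systems
# discussed above would instead decide dynamically what to keep,
# compress, or move to cheaper storage.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Sample roughly 10% of new traces; child spans follow their parent's
# decision so distributed traces stay complete.
sampler = ParentBased(root=TraceIdRatioBased(0.1))
trace.set_tracer_provider(TracerProvider(sampler=sampler))
```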
Observability Copilot: Generative AI for Debugging
In the near term, AI will act as a “copilot” for developers, analyzing data and providing insights to help debug production issues more quickly. By feeding an AI model with logs, code history, infrastructure details, and other relevant data, developers can receive accurate and actionable recommendations in minutes rather than spending hours investigating independently.
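As a purely hypothetical sketch of what feeding those signals to a model might look like, the snippet below bundles them into a single prompt. Every name here (build_debug_prompt, the context fields) is invented for illustration and is not any vendor's actual API.

```python
# Hypothetical sketch only: bundling debugging signals into one prompt.
# All names are invented for illustration, not taken from a real product.
import json

def build_debug_prompt(error: str, logs: list[str],
                       recent_commits: list[str], infra: dict) -> str:
    """Combine logs, code history, and infrastructure details into a prompt."""
    context = {
        "error": error,
        "recent_logs": logs[-50:],            # tail only: full logs rarely fit a context window
        "recent_commits": recent_commits[:10],
        "infrastructure": infra,
    }
    return (
        "You are an observability copilot. Given the telemetry below, "
        "suggest the most likely root cause and where to look next.\n"
        + json.dumps(context, indent=2)
    )
```

Sent to any chat-capable model, a prompt like this is the mechanical core of the copilot interaction; the hard parts Eran raises later (data exposure, long-term context) sit around it.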
Challenges and Opportunities
Implementing AI in observability presents data security, privacy, and validation challenges. However, Eran believes these obstacles can be overcome through sophisticated data-sharing techniques and new cybersecurity solutions.
The most compelling advantage of AI in observability is not just its ability to reduce the time and stress associated with resolving production issues, but its potential to revolutionize the field by mitigating human error, the root of most major production incidents.
Conclusion
As AI transforms software development and observability practices, developers must adapt their skills and embrace new tools that enhance efficiency and productivity. SmartBear's Eran Grabiner envisions a future where AI copilots work alongside developers to create more reliable, performant applications with fewer production issues.
Rate and Review TestGuild DevOps Toolchain Podcast
Thanks again for listening to the show. If it has helped you in any way, shape or form, please share it using the social media buttons you see on the page. Additionally, reviews for the podcast on iTunes are extremely helpful and greatly appreciated! They do matter in the rankings of the show and I read each and every one of them.
[00:00:01] Get ready to discover some of the most actionable DevOps techniques and tooling, including performance and reliability for some of the world's smartest engineers. Hey, I'm Joe Colantonio, host of the DevOps Toolchain Podcast and my goal is to help you create DevOps toolchain awesomeness.
[00:00:18] Hey, welcome to the show. Really excited about today's topic because we're going to be exploring everything about AI observability with Eran. If you don't know, Eran is the Director of Product Management at SmartBear, where he works on observability initiatives. He really knows his stuff. His journey with SmartBear began after the acquisition of his observability startup, Aspecto, a really cool solution you definitely need to check out. We'll have a link for it in the show notes as well. At Aspecto, he served as CEO. He has a background that includes a bunch of different roles at cloud-focused startups, and he has degrees in physics and electrical engineering, so he really knows his stuff. If you want to know anything about AI and how it's going to impact observability and DevOps, you don't want to miss this episode. Check it out.
[00:01:02] Hey, if your app is slow, it could be worse than an error. It could be frustrating. And one thing I've learned over my 25 years in the industry is that frustrated users don't last long. But since slow performance isn't sudden, it's hard for standard error monitoring tools to catch. That's why I think you should check out BugSnag, an all-in-one observability solution that has a way to automatically watch for these issues: real user monitoring. It checks and reports real-user performance data in real time so you can quickly identify lags. Plus, you can get the context of where the lags are and how to fix them. Don't rely on frustrated user feedback. Find out for yourself. Go to bugsnag.com and try it for free. No credit card required. Check it out. Let me know what you think.
[00:01:54] Joe Colantonio Hey, Eran. Welcome to the Guild.
[00:01:57] Eran Grabiner Hey, Joe. How are you? Thank you for having me.
[00:01:59] Joe Colantonio Great. Great to have you. Really excited about this topic. Like I said, before we dive into AI and get into the nitty gritty, just in case, for anyone that doesn't know, would you explain what observability is?
[00:02:12] Eran Grabiner Yeah. So obviously, it's a wide term, but for me, observability is basically your ability to understand how your software is behaving just by looking at external metrics. You can think about it as when we are checking if a person is okay, health-wise, we take different measurements, like their temperature, in order to understand what's going on inside. In the same way, in observability, we are collecting different types of data in order to understand what's going on inside our application, inside our system, to be able to debug it, improve it, and so on and so forth.
[00:02:52] Joe Colantonio Nice. What goes into observability? Like I said, it must be a hot trend, because SmartBear acquired you, I don't know how long ago.
[00:03:00] Eran Grabiner Yeah.
[00:03:03] Joe Colantonio Why do you think observability was so important that a huge company like SmartBear would then say, we need to invest in this because this is something that's going to be all the rage?
[00:03:11] Eran Grabiner I think SmartBear, which has been around for some time now, is a company that was always focused on delivering value for developers and developer tools across the SDLC. Obviously, there are a bunch of different product lines within SmartBear, and observability is the newest one. It didn't start with Aspecto; there were more acquisitions that happened before that, companies like BugSnag and so on. I think what has changed in the last couple of years, that maybe made this space more interesting for SmartBear, is the fact that developers became much more responsible for how their application is actually behaving in production, and more responsible for taking care of issues. In the past, things were much more separated, and we had the ops folks mostly focusing on production. Nowadays, we see more and more developer ownership. We see that software is becoming much more complex, much more distributed. And we saw a lot of changes in the last couple of years in the type of tooling. I think a company that is focusing on developers would definitely want to expand the developer experience into the production environments as well.
[00:04:17] Joe Colantonio When we talk about observability, we mostly talk about production. Why is that a change? Is it because, back in the day, we had everything under our control, and now we're in the wild, using third-party services we may not even know about? Why the rise of observability in production, then?
[00:04:32] Eran Grabiner Yeah. I think, for me personally, you shouldn't limit observability to just production, but production is definitely the focus, because eventually the idea is to make sure your software is behaving as you planned, making sure that you know about issues as fast as you can, and that you can fix those issues as fast as you can. Even if you're doing something in pre-production, the motivation is to have better software in production. So it is around that. And I think that in the last couple of years there were changes, and it very much connects to AI, because I think AI is a big one, but just another change in this chain of changes. Software became, as I said, much more distributed, with more third-party tools and managed services, using different types of technologies. The velocity of software delivery became much faster. It's not like every year we're releasing a new version of our product; it is happening on a daily basis. And lastly, I would say the way users are interacting with software changed a lot, just because of cloud computing and mobile and so on and so forth. We are seeing much more user-intensive products out there, and the way we are interacting with software changed. This has led to changes in how you actually maintain the software and manage it.
[00:05:51] Joe Colantonio Nice. Let's get to the meat of the interview: AI. Everyone's talking about AI. I hear a lot about it as it relates to functional automation, but I want to talk observability, with all the data points it's collecting. It's probably more efficient, I don't know. Are we talking about AI or machine learning, or are they the same?
[00:06:09] Eran Grabiner Yeah. So I think, again, there are different ideas obviously on this topic, but I think it is part of the same family of technologies. Eventually, if you want to put a headline on top of it: software that can basically do tasks instead of us humans, and can do it in a way that is contextual and intentional, that is able to draw conclusions and act upon them, not just fulfill a very basic task. For me, machine learning is just another method to achieve AI and those types of capabilities. And when we look at observability and AI, I personally think, and maybe a disclaimer here, I generally think that we humans are very bad at predicting the future. The further we look into the future, the less accurate we usually are in predicting it. We should be very, I would say, modest when we are trying to assess how things will look, near-future and long-term. But I do think there are some vectors and changes that are already happening and that you can already see out there. For me, you can definitely divide this conversation into short-term things that are already happening and how they affect how we are currently doing observability, and more long-term effects on how AI is going to affect overall software development and what developers are doing. If we look at the past, every evolution or revolution like that, when we moved from machine language into more object-oriented and into modern software, each one of these changes really changed how we look at software and monitor it, how we make sure our software is doing what we expect. Obviously, in the long term, AI is also going to change software and how we develop code. And this will lead to, I would say, a second degree of changes after the changes that we might see in the very near future.
[00:08:14] Joe Colantonio Absolutely. I know you can't make predictions, but I would think as a vendor you work with a lot of customers, you see a lot of scenarios, so you may have a better pulse on the heartbeat of it. Dumb question right off the bat: are all developers using observability? And if not, will AI observability just take care of it automatically for them, where they don't have to worry about implementing it?
[00:08:37] Eran Grabiner I think different types of developers are obviously using different tools and have different needs. But at the end of the day, if you look at observability in its broader sense, I think that any developer is using some sort of observability. You have code. This code is running out there. This code has an issue. And now you're trying to figure out what happened. Even just using a simple debugger, or looking at errors and logs that you created for yourself, this is all a form of observability. Obviously, different engineers, depending on the type of product they are working on, and whether they are front-end focused, back-end engineers, or engineers more focused on data and so on, will use different tooling. But I think each one of them will use something. To the first part of your question: I think we can try to make predictions, and maybe in a year or five years I'll be here again and we'll revisit them. As I said, I'm trying to divide it into short-term and mid-to-long-term. In the short term, we are definitely seeing multiple trends that are happening with products already, and obviously at SmartBear we are thinking about this a lot as well. This is where I think AI, or generative AI capabilities, are looking at taking stuff that we are currently doing with observability and just making it better or more efficient. Broadly, when you look at observability, it's about collecting the data, massaging it a little bit, then storing the data, and then serving the data to the developer. For the first parts, collecting and storing the data, the direction we're looking at as an industry is mostly about efficiency. A big problem in observability relates to cost and performance: we're just collecting more and more enormous amounts of data, and it's very costly. And it doesn't make any sense, because at the end of the day the developer is looking at a very narrow layer of this data that we are collecting. I think AI can be very helpful, and is already starting to be helpful, in making decisions about what data to save and what not; being more automatic in terms of understanding which data is just a replica of other data and is redundant; which types of data we can store in a cheaper kind of storage compared to a more expensive, higher-performance storage; being able to transform data that is very expensive and very rich and compress it into formats that would be cheaper, while making an opinionated decision and making sure that you're choosing the right ones. When we look at data collection and sampling, we will see a bunch of stuff trying to make the existing observability products that we see out there a little bit cheaper.
[00:11:42] Joe Colantonio Is that because the computing power the model is using gets cheaper over time?
[00:11:46] Eran Grabiner In terms of cost, it's mostly nowadays around the data processing, querying, and storing of the data. If you can create different indexing, or if you can make different decisions about what data you are actually saving, you can reduce the cost. But I think this is only one part, and maybe the user will not feel it that much; maybe the CFO will feel it a little bit. The part where I think the user is going to feel much more is all the stuff that we are currently seeing out there that I think we can give the headline of an Observability Copilot: something with generative AI capabilities that basically looks at different signals of data, can draw conclusions that are very complex for humans to draw, and can also be conversational with the developer or engineer who is trying to debug the issue. If you look nowadays, or a few years back, at a bunch of developers trying to debug something together, it would usually sound like: hey, did you check that? Did you think about that? What did we release yesterday? Might it be there? Can you look at it? I think all these types of interactions that we are having when we are debugging something like a production issue are going to be transformed into an interaction with that kind of AI agent that, again, can draw conclusions faster, go and check things for us, and bring the relevant data. And I think this can be massive. Obviously, it has a lot of challenges that come with it, challenges that relate to the data itself: how do you expose this data to your vendor when some of this data is very proprietary to the company? And just making sure that the relevant data is getting there, and that the AI algorithms can actually draw a conclusion out of it, is not trivial. There is also a question of how much data you are showing. These AI tools become much more effective as you expose them to more data. For example, if I just take the logs that I have from an issue and feed these logs to an AI model, the answers I get would be much different than if I also enrich this AI model with my original code, with information about my history of software releases, with my infrastructure and code, with the additional metrics that I collected at the same time, and so on and so forth. As we enrich it with more information, we get closer to getting accurate and insightful answers from the AI model.
[00:14:28] Joe Colantonio Absolutely. I mean, I'm not doing it at this scale, but I created a Joe bot, which I trained on all my podcasts, blogs, and webinars. The more info I give it, the better it gets. I can see conversations people are having with it, and it's able to answer questions that it wasn't programmed to answer, because it's making inferences from all the different data points it has that I couldn't even think of. So is that kind of what's happening: the more data you have, the more inferences it can make, and then?
[00:14:53] Eran Grabiner Yeah.
[00:14:53] Joe Colantonio I kind of.
[00:14:54] Eran Grabiner Yeah, exactly. It's the more data. And another thing that is a big issue in the AI industry is having context, and long context: how this model can actually remember something that happened a few months ago or even a year ago. Usually, when there is an issue in the production environment of any company, the person who fixes stuff, and really does it fast, is the developer with the most experience in the company. He knows the code the best. He knows stuff that the team did and forgot about, and so on and so forth. I think it's going to be much more valuable if this AI agent actually has this long context: it knows everything that happened in the company throughout its history, it knows the different scenarios that happened to us. And this is not trivial at all, not just from a technological perspective and the challenges that come with that, but also, as I mentioned, the organizational, security, and other challenges that it is not yet clear how we are going to solve.
[00:15:55] Joe Colantonio Great point. How does it work in the short term, when you say generative AI, this idea of a copilot, a coworker?
[00:16:02] Eran Grabiner Yeah.
[00:16:03] Joe Colantonio So is it bubbling up insights? Is it making the changes? Or is it you saying, hey, look at this data, tell me if you see any performance issues? How does the generative AI piece work now, in the short term?
[00:16:13] Eran Grabiner Different vendors are obviously looking at different data, but I think what we mainly see out there is taking it step by step. Let's say, for example, I have an error, or I have a bunch of logs that relate to a certain issue I'm seeing. Let's take this data, maybe enrich it with some other information, feed it to this AI agent, and ask it to give us an idea of what it thinks the problem is, what could be a potential fix, or where we should look in order to figure it out. I don't think that we will get into actually fixing, actually taking the actions, anytime soon. Not that the AI models would not necessarily be able to do that; it's just that we humans will probably not feel okay with it. But definitely being a companion that looks at the same data you're looking at and is able to draw fast conclusions out of it, answer questions, even simple questions. It might be the case that the developer can answer those questions nowadays, but it's just going to take them 20 or 30 minutes to figure it out, while they can just ask this agent and it will bring the relevant data. Instead of going to another product, or sifting through other logs, or looking into the paper trail of something that happened, they can just ask this AI agent for the information and get it much faster. Something that took 30 minutes will take just 2 minutes.
[00:17:41] Joe Colantonio All right. I love this. How do we get to the long-term vision? And is this a realistic long-term vision, where...
[00:17:47] Eran Grabiner Yeah.
[00:17:48] Joe Colantonio You have observability in place, you have OpenTelemetry or whatever you're using. It's able to know what changed. It also knows your code. So it could say, all right, in production this failed, I know it has to do with, I don't know, a SQL statement, whatever it is, and it knows how to fix it, and it fixes it, or points you to the fix. Do we get there? Is that a realistic scenario that you see long term?
[00:18:09] Eran Grabiner Long term, that's an interesting question. Me personally, when I try to look at things long term, I think things will look a little bit different, because the role and the tasks that the developer takes on will change. Let's zoom out for a second. I think that every time we saw an evolution or revolution in how we code and how we deliver software, we basically saw another abstraction layer. If you start from the evolution from machine code to assembly language, from assembly language to high-level languages, from that to object-oriented programming, and from that to the modern stuff that we see today with IDEs and frameworks, each one of these steps created another abstraction layer for the developer, so they didn't need to take care of the bottom layer and could focus more and more on business logic, on what they want the software to do. In the early days, we would need to focus on the actual performance of the machine, the CPU and memory, and our tooling came from companies like Intel that gave you metrics about the actual CPU, or even physical machines that we had in order to measure how the computer was behaving. Through the years, with each new layer, we had different tooling, and even the latest evolutions that we see with cloud computing and mobile and so on brought the modern application performance monitoring tools, real user monitoring, and all those types of solutions. They basically came out to answer those evolutions. And if we're looking at AI, I think it is clear that developers will focus much less on the code itself. Even in the near future, AI will be able to generate code by itself, complete blocks of code that deliver a certain functionality, and the role of the developer will gradually transform into something like a composer who collects different blocks and needs to combine them into something that answers the product need. I would even go further and claim that the role of architects and designers is going to be much more interesting, because they can focus more on the business logic, on what they want to achieve from the system, and on high-level decisions. The role of the developer will gradually become something maybe even similar to the QA engineer, because you just need to validate that the code the AI generated is actually doing what you expected, and make sure that whenever you connect two things together, they behave as you expected; but you will less and less go into the code itself and make changes there. And yeah, I'm already talking too much about it, but we can try to think about how it will affect the kind of tooling that we have: what type of tooling do we need to validate this code, code that humans didn't write? What type of validation do you want to have on code like that? How do you do that? And so on.
[00:21:23] Joe Colantonio Love it. You heard it here first. It's the first time I've heard someone say developers will be replaced by QA; usually it's the QA people being told they'll be replaced by AI. This is awesome. I love that point of view. But yeah, it's not dissimilar to what we have now, where we have these high-level libraries we're just putting together; it's abstracting at an even higher level. Well, we're not being replaced. No one is replaced by better libraries. Hopefully that's the same here.
[00:21:46] Eran Grabiner Yeah. You can just focus on other things. And there are definitely things that it will still be very hard, definitely in the near future, for AI to complete: the innovation process of product design and architecture, figuring out what problem you would like to solve and how you're approaching it, what is the overall architecture you would like to see, where this thing is actually going to run, how users will interact with it, at least part of algorithm development, and the overall integration between different systems and different tools. Our world is becoming more and more connected. At SmartBear and BugSnag, we are helping companies trace and analyze user interactions with their systems, coming from mobile platforms and IoT platforms all the way through their backend systems. This interaction across components is still going to be very complex. Also the user experience part, the UI and those types of things, is going to be very, very challenging, because there is a lot of human experience that comes with it, really understanding how humans behave and what they expect. Obviously, AI is going to help with it, but I don't think it's going to replace humans anytime soon. It's just going to be another tool in our toolkit that helps us do things faster and better.
[00:23:10] Joe Colantonio Right. Absolutely. And in the short term we're in right now as well, I would think. Say someone has a solution...
[00:23:16] Eran Grabiner Yeah.
[00:23:17] Joe Colantonio And it uses AI. How do you get AI adoption, then? How do you get people to believe in it and trust it? I guess a lot of people are like, how am I going to trust this? I have scenarios where an AI is testing an LLM, AI testing AI, and how do you know who to trust, what's going on? How do you handle that part, without even getting to the long term?
[00:23:36] Eran Grabiner That's obviously a huge question, and not just for us developers; it's a huge question for society. We see Congress trying to ask these questions. I think that at the end of the day, there are a few key things happening here. First of all, if you are experiencing amazing value coming out of AI, humans will find ways to use it, and we are already seeing it, right? It is very unlikely that we will have something that makes our lives so much better and so much easier, and we will just make a conscious decision to avoid it completely. Maybe some people in some areas of the world, but I don't think that as a society it will happen. And then it comes to the question of how you are actually using it and what guardrails you will put in place. As I mentioned at the beginning, I think this is something organizations are going to be very concerned about, and they'll look for sophisticated ways: how do you share data without compromising it? How can you validate that a person you are interacting with, or even a machine you are interacting with, is actually who they say they are, questions of identity and so on? I think it's only going to grow our already huge cybersecurity industry. We are going to see a lot of new tooling out there dealing with those types of things: how do you validate that data is actually reliable, that an identity is authentic, how do you add signatures to these types of things? I think it is all solvable. It will take time, but I think technology can solve most of these problems.
[00:25:17] Joe Colantonio Absolutely. Okay, Eran, before we go: what is the most significant advantage of integrating AI into observability practices that you want to highlight for our listeners, and maybe the best way to find or contact you?
[00:25:28] Eran Grabiner Yeah, sure. What's going to be the most significant? No one likes to solve a production issue. It's not a task you like to do. It usually comes at a time you don't want it to come, Friday evening or in the middle of the night, and it's usually very stressful. I think AI can possibly reduce those events dramatically, and reduce the time we're spending on these events dramatically, into something that we as organizations need to invest much less in. I think most big issues that happen in our production environments are ultimately human-related: someone releases something without thinking about it, without thinking about a certain integration, making a change that they don't think is going to break something, and so on and so forth. I think AI can be very, very helpful in identifying human errors, and this is going to be very, very valuable. As for me personally, you can reach out any way you like: LinkedIn, Twitter, whatever you're using, or through SmartBear. I'm happy to share more thoughts and help folks who are thinking about these problems and looking into this area.
[00:26:45] Remember, latency is the silent killer of your app. Don't rely on frustrated user feedback. You can know exactly what's happening and how to fix it with BugSnag from SmartBear. See it for yourself. Go to BugSnag.com and try it for free. No credit card is required. Check it out. Let me know what you think.
[00:27:06] And for links to everything of value we covered in this DevOps Toolchain Show, head on over to Testguild.com/p150. So that's it for this episode of the DevOps Toolchain Show. I'm Joe, and my mission is to help you succeed in creating end-to-end, full-stack DevOps toolchain awesomeness. As always, test everything and keep the good. Cheers.
[00:27:29] Hey, thanks again for listening. If you're not already part of our awesome community of 27,000 of the smartest testers, DevOps, and automation professionals in the world, we'd love to have you join the fam at Testguild.com. And if you're in the DevOps, automation, or software testing space, or you're a test tool provider and want to offer real-world value that can improve the skills of or solve a problem for the Guild community, I'd love to hear from you. Head on over to testguild.info and let's make it happen.