Welcome to Episode 64 of TestTalks. In this episode, we'll discuss testing with Wireshark with Ross Bagurdes, author of the Pluralsight course Introduction to Wireshark. Discover how to uncover hard-to-find bugs in your application using Wireshark.
Listen now to hear how to create a collaborative, team-wide bug-feeding frenzy with Wireshark.
Wireshark is a protocol analysis tool. It's also a great way to bridge the gap between your network, developers and testers. Since the network is one of the most overlooked areas of software development, using Wireshark can help you to quickly uncover issues in your application that you may not even be aware of early in the development process.
In this episode, Ross Bagurdes shares with us how easy it is to get started with Wireshark.
Listen to the Audio
In this episode, you'll discover:
- How Wireshark uncovered a massive issue with a real-world medical device software solution.
- Why Wireshark helps teams to collaborate more effectively.
- Easy-to-understand explanations of packets, ARP and TCP
- Tips to improve your Wireshark efforts
- The best way to get started learning Wireshark.
Join the Conversation
My favorite part of doing these podcasts is participating in the conversations they provoke. Each week, I pull out one question that I'd like to get your thoughts on.
This week, it is this:
Question: How do you currently test network level issues with your application? Share your answer in the comments below.
Want to Test Talk?
If you have a question, comment, thought or concern, you can share it by clicking here. I'd love to hear from you.
Subscribe to the Podcast
If you enjoyed this podcast, please subscribe via iTunes. Click Here to Subscribe via RSS (non-iTunes feed)
Increase your Karma
If you like what you hear on TestTalks, go to iTunes, subscribe, give us a rating and hopefully a five-star review. If you’re cool enough to do that, I will give you a personal shout-out on an upcoming episode!
Read the Full Transcript
Joe: Hey Ross, welcome to Test Talks.
Ross: Hi Joe, thanks for having me on the show.
Joe: Awesome. So Ross, before we get started, could you just tell us a little bit more about yourself?
Ross: I'm currently an educator. I teach at a local technical college here in Madison, Wisconsin. I teach IT Data Networking and I also teach full-time for Pluralsight, pluralsight.com, which offers a very wide variety of software and technical training. I teach on the data networking side, namely on how data networks work, the CCNA and WireShark.
Joe: Awesome. And that's what I'd like to talk about today. You have a great intro to WireShark course on Pluralsight. For people that don't know, what is WireShark in a nutshell?
Ross: WireShark is a protocol analysis tool. That's basically it. In data networking, my experience using WireShark and managing folks that have used WireShark to troubleshoot issues in a real business environment, they tend to have a real disconnect with the software folks. And WireShark is like, from my perspective, this bridge where it can take data networking and examine the software that's running on the data network and see if the software is behaving in a way that's consistent with the way that the protocols on the data network need to operate. And very often on the application development side, those folks, and this is not meant to be insulting at all because I know nothing about application development, or very, very, very little. I know enough to be dangerous, but the folks in application development often see the network as a mystery and they're not quite sure what's happening on there. And WireShark is an opportunity to really bridge that gap. I think it's a way of once you learn how to use that tool, it's a way of looking at your software and seeing what's actually being sent out on the network. It's not that hard to learn.
Joe: Awesome. And that's a great point. So one of the questions I was going to ask, and I guess I am going to ask it, is, is there anything you see most developers and testers overlooking or taking for granted from the network side, when they're testing their applications that maybe a tool like WireShark could help them with?
Ross: What myself and fellow network engineers see often is as a developer, especially one who may have a nominal understanding of data networking, they may just grab, if they're using Java, they may just grab a Java library out of the pool that they're using, and that has the networking pieces in it. Depending upon where that Java library came from or who built it, it's going to behave on the network differently. And it may or may not be consistent with network protocols. I actually have an example of this. I spent about 7 years of my career managing a pretty large data network here in Madison, Wisconsin. It was a large healthcare group, and they used one of the most popular medical records software packages in the U.S. right now. It's very expensive software. It's very popular. And what their software was doing, is it runs on a massive UNIX server. And the UNIX server connects to a [inaudible 00:03:08] where the database sits. When a user logs into the application, a separate authentication request is sent from the medical record software to the authentication system, which in this case happened to be a Novell system for authenticating users.
And that Novell system was behind a Load Balancer. What the medical record software was doing was every time a user would authenticate to the Novell server, it would open up about a hundred to a thousand TCP requests. It would open a hundred to a thousand TCP sessions for each user that was authenticating. And then it would never shut the TCP session down. To give you guys an idea of what TCP is, it's just like making a telephone call. Telephone calls are becoming a bit antiquated, but if we think about the way you might make a phone call on a landline, we pick up the phone, we listen for the dial tone. Once we hear the dial tone, we dial the number. We wait for it to ring. After it rings, we wait for the person to answer. They'll say, “Hello.” We say, “Hello.” And now we can begin a conversation and transfer any information we want.
But if we don't do that first part correctly, if we dial the phone and then pick up the receiver and then hope that it rings, if we do any of that out of order, it's not going to work right. And then when we end our conversation on the phone, we do the same thing. We say, “Okay, I've got to go. Goodbye.” And then the other person you're talking to, if it's a graceful end of our conversation, they're going to say, “Goodbye.” Sometimes the conversation is abrupt. Maybe it's a telemarketer that calls you, so you just, boom. You hang up the phone. It's an abrupt end. Well, all these have some connection in the land of data networking and TCP. And in order for TCP to send any information across a network, it has to send this three-way handshake. It has to say, “Hey, I'm going to talk with you.” The other end says, “Okay, great, let's build the session up.” And then it sends another message back saying, “Okay, we're ready to talk.” And only after those three things happen can we begin to send the data that the application is going to use.
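The phone-call sequence Ross describes can be watched in WireShark with a few lines of Python. The following is only a minimal sketch and not something from the episode: `send_over_tcp` is a hypothetical helper, and it assumes a throwaway server on the loopback address and a small payload so that it runs in a single thread.

```python
import socket


def send_over_tcp(payload: bytes) -> bytes:
    """Send a small payload over one loopback TCP session; return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        address = server.getsockname()

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            # connect() returns only after SYN, SYN-ACK, ACK have completed.
            client.connect(address)
            connection, _peer = server.accept()
            with connection:
                client.sendall(payload)
                received = b""
                while len(received) < len(payload):
                    chunk = connection.recv(4096)
                    if not chunk:
                        break
                    received += chunk
        # Leaving each "with" block closes that socket, which sends the FIN
        # (the polite "goodbye") instead of leaving the session open.
        return received


result = send_over_tcp(b"hello")
```

Capturing on the loopback interface with the display filter `tcp` while this runs should show the three handshake packets, the data, and then the FIN exchange when the sockets close.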
Well, to go back to the story about this medical record software, what it would do is, let's say a doctor logs into the software. It would send a message back to the medical records server. The server, then, would build a TCP session with the authentication server, in this case, Novell. That TCP session wouldn't just build one. It would build a hundred of them, or a thousand of them. It was unknown how many TCP sessions each log-in session was going to get. And I have no idea why the developers created the system, but the issue that it caused was that when our network team replaced our Load Balancer, the new Load Balancer, unknowingly, had a much lower limit of maximum number of TCP connections allowed through it. We're talking about two hundred and fifty thousand TCP sessions. This is not a small number. This is an enormous number.
What happened was one day after the Load Balancer was replaced, the medical record users started complaining that they could no longer authenticate after 1:00 in the afternoon. And that's when we brought in our techs with the WireShark experience and they hooked up WireShark and looked at all the pieces of what was going on there, did some Packet captures and Packet analysis. And that's when they discovered that every log-in attempt was using a thousand TCP sessions.
Well, that's just unacceptable in data network terms. It's unnecessary, especially for a single log-in attempt. Especially if that log-in attempt is encrypted. That one attempt only needs one TCP session, maybe 2 or 3. The whole point of the story is, is that the application developers, in this case, ignored how the network protocols were supposed to work and created a situation where the network hardware isn't designed for that type of behavior and the network hardware fails in those circumstances.
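One way to spot this kind of behavior in a capture is to count how many sessions each client opens. The sketch below assumes the capture has already been reduced to hypothetical `(source_ip, tcp_flags)` pairs; that input shape and the name `count_new_sessions` are illustrative, not part of WireShark. It counts packets with SYN set and ACK clear, which is the first step of the handshake.

```python
from collections import Counter

SYN = 0x02  # TCP flag bit for SYN
ACK = 0x10  # TCP flag bit for ACK


def count_new_sessions(packets):
    """Count TCP session openings per client IP.

    packets is an iterable of (source_ip, tcp_flags) pairs, a hypothetical
    simplified form of what a capture would contain.
    """
    opened = Counter()
    for source_ip, flags in packets:
        # A SYN without ACK is the client's first handshake packet.
        if flags & SYN and not flags & ACK:
            opened[source_ip] += 1
    return opened


sample = [
    ("10.0.0.5", 0x02),  # client SYN
    ("10.0.0.9", 0x12),  # server SYN-ACK, not a new session
    ("10.0.0.5", 0x10),  # client ACK
    ("10.0.0.5", 0x02),  # a second SYN from the same client
]
sessions = count_new_sessions(sample)
```

Inside WireShark itself, the display filter `tcp.flags.syn == 1 && tcp.flags.ack == 0` isolates the same packets.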
When our network folks try to have this conversation with the application developers, both groups got a little ego sensitive and were unable to have the conversation, then. But it's an important conversation to have because if we can both use the systems correctly, both on the infrastructure side of the organization and on the application development side, we can have some really amazing applications that we can develop with that.
Joe: Awesome, Ross. I love this story and it's a real world example of how things can go wrong. And as a tester, a developer, I'll admit. I'm probably one of those people that rarely think about the network and this type of example really should open- if someone's a tester listening to this podcast, should open up their eyes and say, “Hey, I'd better add this to my checklist of things that I'm checking while I'm testing.” Because it's going to ultimately affect the customer and if it's going to affect the customer, it's something we should be testing, I think.
Ross: Absolutely. Yes.
Joe: I guess my question, then, is how much networking skills would someone need in order to start using a tool like WireShark to be able to capture and understand these types of issues?
Ross: Not a whole lot. Developers have a leg up, in my opinion. There's two basic views that I see developers attacking data networking with. And the first one is they enter the world of trying to understand how the data network works with the expectation that it's magical, mysterious and complicated. And I can tell you upfront that when you approach it that way, networking is going to look exactly like that. It's going to be mysterious, complicated and almost impossible to understand. I know that through my own personal experience of data networking.
On the other hand, if you look at the skills that application developer, application tester, you name it, on that development side, they have really good logical skills. They understand a process. They understand approach of layering and hardware abstraction and software abstraction and these different components that you use in application development all the time.
A lot of those concepts apply directly to how we pass information through the network. Additionally, as an application developer, if you're developing an app for iPhone or Android or Windows phone, for that matter, you have an understanding of client-server relationships. You know that you have to have software on the server side and software on the client side, and they communicate over this middle piece. And the middle piece we consider a cloud, and that's great.
The opportunity to understand how the data network works is pretty easy. The leap isn't that far. It just takes some patience and persistence, because networking is riddled with terrible terminology and there's a lot of folks that misuse it. It's really hard to get that base level knowledge of how to understand what words we're using in data networking that are important and which ones aren't.
Data network can mean fifty different things, and sometimes it's very precise and important, and sometimes it's very loose and general. We really need to learn when what word is important and when it isn't.
Joe: Where in the software development life cycle would you think someone would use a tool like WireShark to have uncovered this issue before it was released to a customer to have found it with their Load Balancing?
Ross: As soon as you can start testing the software on the data network, you should start using WireShark and examining how it behaves. My experience with WireShark is WireShark is going to look like a haystack. You're going to have to look through that haystack for the needle that is your software communicating on the network. And that takes some practice to get used to. But once you get used to it and once you practice it for a while, and have some of the skills that you learn in the Intro to WireShark course and some of the future WireShark courses that I have coming, once you get those skills of how to examine that haystack of information, finding that needle becomes much easier. Once you find it, the reward of that is huge. It's almost like playing a video game. And when you can find it, the next time you look for how your software is behaving on the network, it will get easier and easier and easier. And you'll understand the way your software's operating in a whole new way.
I'm not sure how many of your listeners are aware of a hacker by the name of Kevin Mitnick. Kevin is an old-time hacker from the 80's, 90's. Did a lot of phone phreaking and logging into phone switches to gain access to them. There's folks out there just like him that are looking at how those applications work on the network side. And if the application development doesn't know how it's working on the network side, that's a great opportunity for an attacker or some kind of hacker to compromise a software and make use of back doors or unintentional back doors in that software. The sooner you know how your software is behaving, the easier it's going to be to understand how it's working in the future, and you're going to understand the relationship of the little knobs and dials that the developer's changing in the application. You'll be able to see how those knobs and dials affect the performance and whatnot on the data network.
Joe: Awesome. Maybe it's something you run as a baseline as a one user, and then when you start doing your performance testing, maybe you can reference that to see, “Okay, what's different from my baseline compared to when I start ramping up users?”
Ross: Absolutely. And as far as a troubleshooting tool, it is literally the best one out there. At the healthcare organization that I worked at, we had the developers coming to my team, the data network team, to ask us to make use of our chief WireShark expert, who knew nothing about application development, or very little about application development. But he was able to use WireShark to pick out and show the developers exactly where the application was misbehaving. And typically once they had that information in WireShark, the developers quickly knew how to use that information to debug software. So yes, having that baseline of how the developer thinks the software should work and then being able to use that in the future for Load testing or troubleshooting of unusual anomalies in software.
Joe: So you brought up so many good points. The first one that comes to mind is Fiddler. I think some people might have familiarity with Fiddler. Is this the same type of tool as WireShark or does WireShark let you go even deeper than Fiddler can? Or is it just a preference? The reason why I ask is I took a Troy Hunt Pluralsight course on Hack Yourself security. And that's the only tool he basically used for security, and it's pretty crazy the types of things you could find just with that tool.
Ross: Yeah, Troy is cool. He's a developer and he's an Aussie, and I hang out with him at our conference every year. He's one of my favorite guys to chat with out there.
He's a developer that is eking in on the network side. Fiddler is much more geared toward HTTP type debugging. WireShark can do all of that and more. WireShark is going to look at your HTTP requests and it can actually analyze that information. Before I get into that, data networking is separated into layers, and up at the application layer, that's where we have a protocol like HTTP. That's where Fiddler is really good, as an application layer network debugging tool. TCP operates at the transport layer, which is a lower layer in this model. And that transport layer, that builds up that session between the two endpoints. And that's where the port numbers are going to be chosen. So in the case of an HTTP server, we're going to have port 80 there. Then it can also manage and analyze what's happening at the network layer, and that's where we're going to have our source and destination IP addresses. It can look then, also, at Ethernet, our layer 2 connection, at the data link layer, and it can examine anything happening there. We get a much deeper analysis almost all the way down to the physical wiring itself.
Joe: The traffic you're capturing, are you able to play it back?
Ross: Yeah, absolutely. As a matter of fact, there's multiple tools you can use to collect the traffic. The traffic collection is actually a separate function from the packet analysis. So WireShark is the packet analyzer. And WireShark in itself, when you install it, will ask you if you want to install and activate WinPcap drivers or libpcap drivers in the Linux world. And it'll install and activate those drivers. And those are the things that are actually doing the packet capture. On Linux systems, you can use something like tcpdump, and that will silently capture traffic and store it in a file for you. You can later go retrieve that tcpdump file, put it into WireShark, and then use WireShark as your analysis tool.
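The capture file that sits between the capture tool and the analyzer has a simple layout. As a rough sketch, the code below writes and reads the classic pcap format: a 24-byte global header followed by a 16-byte header in front of each captured packet. The names `write_pcap` and `read_pcap` are hypothetical, and the sketch assumes little-endian files with microsecond timestamps; real captures may use the other byte order, nanosecond timestamps, or the newer pcapng format, none of which are handled here.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap, microsecond timestamps
GLOBAL_HEADER = struct.Struct("<IHHiIII")  # magic, version 2.4, zone, sigfigs, snaplen, linktype
RECORD_HEADER = struct.Struct("<IIII")     # seconds, microseconds, captured length, original length
LINKTYPE_ETHERNET = 1


def write_pcap(stream, packets):
    """Write (seconds, microseconds, raw_bytes) tuples as a pcap file."""
    stream.write(GLOBAL_HEADER.pack(PCAP_MAGIC, 2, 4, 0, 0, 65535, LINKTYPE_ETHERNET))
    for seconds, micros, data in packets:
        stream.write(RECORD_HEADER.pack(seconds, micros, len(data), len(data)))
        stream.write(data)


def read_pcap(stream):
    """Return the (seconds, microseconds, raw_bytes) tuples stored in a pcap file."""
    magic, *_rest = GLOBAL_HEADER.unpack(stream.read(GLOBAL_HEADER.size))
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian microsecond pcap file")
    packets = []
    while True:
        header = stream.read(RECORD_HEADER.size)
        if len(header) < RECORD_HEADER.size:
            break  # end of file
        seconds, micros, captured_len, _original_len = RECORD_HEADER.unpack(header)
        packets.append((seconds, micros, stream.read(captured_len)))
    return packets


# Round trip through an in-memory file with two dummy 14-byte frames.
original = [(1, 500, bytes(14)), (2, 0, b"\xff" * 14)]
buffer = io.BytesIO()
write_pcap(buffer, original)
buffer.seek(0)
captured = read_pcap(buffer)
```

Because capture and analysis only share this file, a capture can be taken on a server with no GUI and opened later on a workstation.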
Joe: Let's just take it, dive down a little bit more, just to cover maybe some basic things that we haven't touched upon yet, we may have touched upon, but I just want to get some clarification on it. Your introduction to WireShark Pluralsight course actually goes over each of these points. At a high level, what is a packet?
Ross: A packet, in a general sense, a packet is just a chunk of information that's being sent from one device on the network to another device. Client to server, server to client, server to server.
Joe: When we're saying “packet capture,” all we're doing is we're recording that packet network activity that's occurring?
Ross: Correct. What's going to happen is the software is going to interface with the operating system's network stack. And the whole point of that is to take the software that's running, if we think of an HTTP server, that software, like Apache, or IIS, or, I'm sure there's other web servers out there, they're going to start up a software, but that software has to interface with the network interface card. And the network interface card has a job of sending out this packetized information in just the right size that the network can handle. The application and this network stack have to interface with each other. That begins at the application layer with the protocol, like HTTP, and then what'll happen is the network stack will then chunk up that data into manageable pieces and those manageable pieces, in a general sense, we call them packets. In a more specific sense, they have a different name at each layer of the networking model as we go down.
Joe: Okay. So when we talk about TCP, a packet may contain TCP information within it, or whatever-
Ross: Well, yeah. In this case, with TCP, TCP actually, the information that TCP uses, we call that chunk of data, we call that a segment. And that segment actually rides inside the packet, which is the envelope that holds it. The packet is the network layer component to that. That holds the IP addresses. The TCP segment will have the source and destination port number in it. The packet will have the source and destination IP address in it. And the frame will have the source and destination MAC addresses in it for Ethernet. And in networking, this is where one of the confusing pieces comes is that, we can talk about packet in a general sense, which means all of these things that I just said are packets. Or we can talk in a very specific sense if we're talking about TCP, it's not improper, necessarily, to say “TCP packet,” but it's more correct to say, “TCP segment.”
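To make the three sets of addresses concrete, here is a small sketch that pulls them out of raw bytes. On the wire the Ethernet header comes first, then the IPv4 header, then the TCP header. The function name `parse_frame` and the sample addresses are made up for illustration, the sample frame is hand-built with its checksums and remaining TCP fields left at zero, and the code assumes untagged Ethernet carrying IPv4 and TCP only.

```python
import socket
import struct


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as the usual colon-separated MAC notation."""
    return ":".join(f"{byte:02x}" for byte in raw)


def parse_frame(frame: bytes) -> dict:
    """Pull the addressing fields out of an Ethernet/IPv4/TCP frame."""
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != 0x0800:
        raise ValueError("not an IPv4 packet")
    ip_header_len = (frame[14] & 0x0F) * 4  # IHL is counted in 32-bit words
    if frame[14 + 9] != 6:                  # IPv4 protocol field, 6 means TCP
        raise ValueError("not a TCP segment")
    tcp_start = 14 + ip_header_len
    src_port, dst_port = struct.unpack("!HH", frame[tcp_start:tcp_start + 4])
    return {
        "src_mac": format_mac(src_mac),            # frame, data link layer
        "dst_mac": format_mac(dst_mac),
        "src_ip": socket.inet_ntoa(frame[26:30]),  # packet, network layer
        "dst_ip": socket.inet_ntoa(frame[30:34]),
        "src_port": src_port,                      # segment, transport layer
        "dst_port": dst_port,
    }


# Hand-built sample: 14-byte Ethernet header, 20-byte IPv4 header, 20-byte TCP header.
sample = (
    bytes.fromhex("aabbccddeeff") + bytes.fromhex("112233445566") + b"\x08\x00"
    + bytes([0x45, 0]) + struct.pack("!HHHBBH", 40, 0, 0, 64, 6, 0)
    + socket.inet_aton("192.168.10.20") + socket.inet_aton("192.168.10.1")
    + struct.pack("!HH", 49152, 80) + bytes(16)
)
parsed = parse_frame(sample)
```

WireShark's packet details pane shows the same nesting, with one expandable section per layer.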
Joe: Another term I keep hearing a lot of also with networking and with WireShark is ARP. What is ARP and how does it operate?
Ross: ARP is the great calamity of Ethernet. (laughter) It's Ethernet's best feature and its worst enemy. And what ARP does, is ARP is a way of translating between an IP address and a MAC address. In order to send out the message onto the data network, to carry on with this idea of the frame carrying a packet, the packet carrying the segment, and then that frame being sent out onto the wire, the whole point of that is that at the segment, we need a source and destination port number. We choose those locally, if we're communicating with HTTP, our destination port number is 80, and our source port number is some ephemeral, random port number. But when we send it down to the packet, the packet has to have addressing on it as well. That packet has a source and destination IP address. That source and destination IP address is going to get the packet from the local workstation through the long haul across the internet or across the network, to the server.
Then where ARP comes in is that we need to take that packet that's in it for the long haul and we need to send it out onto the local network. In order to do that, we take that packet with our source and destination IP addresses in it, and we stick it inside of a frame. That frame has to be addressed with a source MAC address and a destination MAC address. The source MAC address is easy to get. It's right on the workstation. But the destination MAC address for that frame is unknown, and it's based upon the IP address in the packet. If the destination IP address of the packet is local to the network, we're going to send out an ARP message to the local network and say, “Hey, who has the MAC address for IP address whatever it is, 192.168.10.1?” And then that device would then respond with the MAC address of that IP address.
The issue here is that when we send this message out, saying, “Who has this IP address?” everybody on the Ethernet network gets that message. The ARP message here, what it's doing, is it's saying, “Well, based on my destination IP address, if it's local, I'm going to ARP for that address locally. If the address is not local on the network, the ARP message will then ask for the default gateway instead, and the default gateway's address will then resolve to a MAC address.” Then we send our frame out onto the wire. It reaches the router. The router will take that packet out of the frame, look at its destination IP address, and then put it into a new frame with new MAC addresses on it.
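The local-versus-gateway decision can be written down in a few lines. This is a sketch of the decision only, not of ARP itself: `arp_target` is a hypothetical name, the addresses are examples, and it assumes a host with a single interface and a single default gateway.

```python
import ipaddress


def arp_target(local_interface: str, destination: str, default_gateway: str) -> str:
    """Return the IP address whose MAC address the sender has to resolve.

    local_interface is the sender's address with prefix length,
    for example "192.168.10.20/24".
    """
    interface = ipaddress.ip_interface(local_interface)
    dest = ipaddress.ip_address(destination)
    if dest in interface.network:
        # Destination is on the local network: ARP for it directly.
        return str(dest)
    # Destination is elsewhere: the frame goes to the default gateway,
    # so the gateway's MAC address is the one that is needed.
    return str(ipaddress.ip_address(default_gateway))


local_answer = arp_target("192.168.10.20/24", "192.168.10.1", "192.168.10.254")
remote_answer = arp_target("192.168.10.20/24", "8.8.8.8", "192.168.10.254")
```

In a capture, the display filter `arp` shows the resulting "who has" requests and the replies.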
This is a little difficult to explain on the radio, but I have a course that covers this. It's actually in the CCNA series. CCNA may feel like it's a bit out of scope for some folks, but if you watch my introduction to networking on there, there's one that I actually walk through what it looks like if you're sending a Ping message from one device to another device across a network, and how ARP works.
Joe: I know you do go over this a little bit in your Introduction to WireShark course on Pluralsight, and you also have many networking courses, so I'll have a link to this in the show notes at www.testtalks.com/64. Ross, also I believe you have another section called Ping Analysis. Can you tell us a little bit more about what Ping analysis is?
Ross: It's a real nice protocol for beginners to work with because we can see instant results and it's real easy to pick out Ping messages in WireShark. When we're practicing, what I tell my beginner students in WireShark is that learning how to analyze Ping in a very basic sense and examining all the different fields in the IP packet header, comparing them to the RFC, examining the ICMP fields, comparing those to the RFC, we can get a real easy way to understand how one protocol works on the internet. And if we understand how that one protocol works, we have a much better understanding of how lots more work.
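For anyone comparing the ICMP fields in WireShark against the RFC, it can help to build one Ping message by hand. The sketch below constructs an ICMP echo request as laid out in RFC 792 (type 8, code 0, checksum, identifier, sequence number) and computes the checksum the way RFC 1071 describes. The function names and sample values are illustrative, and nothing is sent on the network, since raw sockets usually need administrator rights.

```python
import struct

ICMP_ECHO_REQUEST = 8


def internet_checksum(data: bytes) -> int:
    """One's complement sum of 16-bit words, as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request with a valid checksum."""
    unchecked = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(unchecked + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


def parse_icmp(message: bytes) -> dict:
    """Split an ICMP echo message into the fields WireShark displays."""
    icmp_type, code, checksum, identifier, sequence = struct.unpack("!BBHHH", message[:8])
    return {
        "type": icmp_type,
        "code": code,
        "checksum": checksum,
        "identifier": identifier,
        "sequence": sequence,
        "payload": message[8:],
    }


message = build_echo_request(0x1234, 1, b"ping")
fields = parse_icmp(message)
```

A message with a correct checksum sums to zero when the checksum routine is run over the whole message, which is the check a receiver performs.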
Joe: Awesome. That's great advice. Along the same lines, are there any books or other courses you would recommend to someone who's trying to learn WireShark or networking?
Ross: Of course, I come at this from a different angle. I come at this from the networking side of the equation. In my opinion, to learn some of the networking protocols is a great idea. I have some great videos on the CCNA, the Cisco Certified Network Associate course. There's five or six of those courses out there. The listeners most likely do not need to listen to all of it, because a lot of it's demonstration, but a lot of the introductory stuff will show you how some of the protocols interact with each other, how the models are set up, and show you how some of the protocols actually work and has some really great animations and some demonstrations of how Ethernet actually works, how IP routing actually works, how TCP actually works. We can use this as the foundation, then, to build on everything else.
Joe: So far we talked about some benefits of using tools like WireShark and being concerned about the network. And one of them was performance, and another one was security. Are there any other areas, you think, testers, developers, need to be aware about when they're dealing with applications when they- network-wise? Or from a network perspective?
Ross: Yeah. I think it's an unbelievable bug troubleshooting tool. In those cases where a user is experiencing an unusual problem, and the application developer can't see what's happening and the user doesn't understand what's happening, but everybody involved recognizes that whatever event is actually happening, WireShark is probably the greatest tool for debugging. Even in encrypted sessions, in encrypted sessions, if you have access to the encryption keys, which, if you're troubleshooting an application issue, you should have, you would be able to just dump those encryption keys into WireShark. It can actually decrypt the traffic for you and give you unencrypted access to see what's happening on the network.
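For TLS traffic, one common way to hand WireShark what it needs is a key log file. The sketch below is an assumption about how an application under test might do that in Python, not something described in the episode: it uses the standard library's `ssl.SSLContext.keylog_filename` attribute (Python 3.8 or later, built against a recent OpenSSL), and the helper name and file path are made up. The file it produces holds per-session secrets rather than the server's private key.

```python
import os
import ssl
import tempfile


def context_with_key_log(path: str) -> ssl.SSLContext:
    """Create a client TLS context that logs session secrets to path."""
    context = ssl.create_default_context()
    # Every TLS session made with this context appends its secrets here,
    # in the NSS key log format that WireShark can read.
    context.keylog_filename = path
    return context


# Hypothetical location; use this only in test environments.
key_log_path = os.path.join(tempfile.mkdtemp(), "tls-keys.log")
context = context_with_key_log(key_log_path)
```

Pointing WireShark's TLS protocol preference "(Pre)-Master-Secret log filename" at the same file lets it decrypt sessions that were captured while the log was being written. A key log like this should never be enabled in production, because anyone holding the file can read the traffic.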
Joe: Awesome. Are there any other tools that you recommend people use when they're troubleshooting networks? Are there any modules within WireShark that you can add on to make it even more powerful?
Ross: There are. WireShark is an open-source community. First of all, I'm going to put in a plug for the WireShark conference, called SharkFest. It's a phenomenal, very small, very tight-knit group gathering of ridiculously intelligent folks, both on the software development side and on the networking side, and it's probably the greatest collaboration of those two groups that I have ever seen ever. They have a common language that they can speak with WireShark and it's pretty cool. WireShark itself is pretty fully capable. If there's a piece that you want to do more, let's say you're developing a brand new application that doesn't use a standard application layer protocol, and so you're literally writing the application layer protocol, what you could do is you could write your own dissector in WireShark, or find out if some other dev out there already wrote a dissector for something very similar that you can add onto. So you can actually build your own dissection modules for WireShark, or go out and find others. You don't really need those unless you're getting into some really specific stuff.
Other utilities that application developers should have a basic understanding of are things like netstat on Windows and Unix workstations to see what ports are open and which ones are closed. And understand how to make the relationship between the open port session, the open TCP session, and the process ID number on the workstation. I think that's a really important utility for developers to have.
Joe: Okay, Ross. Before we go, is there one piece of actual advice you could give someone to improve their network testing efforts, and let us know the best way to find or contact you.
Ross: I would say it would be to learn how to successfully use the network testing protocols like Ping and traceroute, or even Telnet, and I actually have a course coming out in a couple months. That course is going to cover exactly that. It's going to cover how we can use and analyze some of these network testing protocols and really see how powerful they are. Ping is a powerful tool, but we may not realize how powerful it actually is, and all we can do with it. I think my one piece of advice would be to just dive into those data networking courses. I try to make them as easy as possible and engaging as possible to make it accessible to everyone. I certainly hope that it helps folks out there. It's my goal to bring this idea, this WireShark idea, to everybody. It's such a great utility and I really want everybody to use it.
If folks want to contact me, I'm on Twitter and there I'm @bagurdes, that's B-A-G-U-R-D-E-S. Or they can contact me via email at email@example.com.