PRESENTER 1: Whenever we talk about or track connection states, everything is available to us. We are talking about 150 states that we are aware of while a particular wireless client gets onto the network and thereafter. But among the few key metrics that you want to see, and this helps you explain the SLEs better as well, are these connection states. What happens when a client comes onto the network?
The first thing is that it goes through association. There’s something called open association. If the access point and the client don’t agree on parameters, it doesn’t pass that open association, the open part. But for the most part, it’s just a client requesting to join an SSID. That’s the association process.
The second part is authentication. Depending on the kind of network you have, it may be a RADIUS-based SSID, which is similar to dot1x on the wired ports. You have a dot1x association, the same [INAUDIBLE] the RADIUS server responds with an access accept or an access reject based on whether you are allowed onto the network or not.
Or it could be PSK. That doesn’t have to transact with the RADIUS server. The key is on the AP itself, and you have to put the right key in. As long as the key checks out, you get in. Then there is Captive Portal: today all of you authenticated through a portal popping up on the guest SSID. That’s one of the ways.
All of those different ways make up authentication. Once the client goes through authentication, the next step is authorization, where you are given the right kind of access to the network based on whether you have a VLAN or an ACL associated with you. That’s the authorization step.
It’s important to understand that, except for Captive Portal, authentication is layer 2 authentication. That is, until you pass that step, you’re not given an IP address. Once you pass authentication, you move on to getting an address through DHCP.
From then on, it’s pretty much like the wired network. You send a DHCP discover. Eventually you get an offer, and you get an IP address as well. Once that is done and you have your default gateway, the next thing you do is ARP for your default gateway, send a DNS request, and make sure that is successful.
At this point, from the point when a client came in to associate to the network until the point where it got a DNS success, this is what we use to calculate our parameter called the time to connect. From the point where it said, I am coming into the network, all the way to the point where it’s, I’m ready to transmit traffic, this whole process, this whole time period taken is time to connect.
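As a rough illustration of how time to connect could be derived from the states above, here is a minimal sketch. The `Event` type, the state names, and the millisecond timestamps are hypothetical and chosen for illustration only; they are not the product’s actual data model.

```python
from dataclasses import dataclass

# The five states named in the talk, in layer 2 auth order.
STATES = ["association", "authorization", "dhcp", "arp", "dns"]


@dataclass
class Event:
    state: str  # one of STATES
    t_ms: int   # hypothetical timestamp, in milliseconds
    ok: bool    # True if this step succeeded


def time_to_connect(events):
    """Return (ttc_ms, failed_state) for one connection attempt.

    ttc_ms is None when a step failed or the sequence is incomplete.
    """
    for ev in events:
        if not ev.ok:
            return None, ev.state  # the connect failed at this state
    if [ev.state for ev in events] != STATES:
        return None, None  # incomplete or out-of-order sequence
    # From the start of association to DNS success.
    return events[-1].t_ms - events[0].t_ms, None


attempt = [
    Event("association", 0, True),
    Event("authorization", 40, True),
    Event("dhcp", 340, True),
    Event("arp", 350, True),
    Event("dns", 380, True),
]
print(time_to_connect(attempt))  # (380, None)
```

A failed step returns the state where the attempt stopped, which is the piece of information you would look for first when a connect does not succeed.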
If there is a failure at any point in this process, you will see successful connects fail. And it’s usually in the first two steps. It’s important to understand the state machine, because it’s the same state machine that the AP follows too. As soon as the AP comes up, it tries to get an IP address, ARP for the default gateway, resolve DNS, and call home to connect to the cloud.
And if you have seen your APs in a disconnected state, the different LED blink patterns tell you the same story. It is very important for us to understand the state machine. Now that we have seen the state machine, let’s go see this in action real quick.
So every single time you want to see what is happening on a client, there are three different pages that give you a ton of information. One is the client insights. The other is the AP insights. And the last one is the site insights. Anything that has insights attached to it is important. It gives you a lot more information on that particular device.
So every single time you click on a client– and let me go through one more time. So you go click on any client that is available on the network, I chose the first one, and then click on client insights. You will be provided with a tab of every single client event that has happened on the network.
For example, take this particular association. It goes through the process of association and authorization to the network. This one is probably a roam. That’s why it says reassociation. So let’s go to this one. You go through the process of authorization and association. Both of them have been successful.
You see a lot more detail in here as well: what happened during the process, what band it is on, which channel it is associated on. And right here, you see the AP it is associated to. The next step is the DHCP success.
Here are the timestamps at which these happened. This is where you can tell whether the DHCP server was responding slowly. So it’s 03626 and 03907. It took about 300 milliseconds for DHCP to respond and give you an IP address. If it was slower, you would see a good full 1 or 2 seconds. And that shows you a slow DHCP server response. And then, you see a gateway success and DNS success.
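The timestamp comparison can be sketched as below. The timestamp format, the example values, and the 1-second threshold are assumptions for illustration (the threshold is taken loosely from the “good full 1 or 2 seconds” remark); the dashboard’s real thresholds are not stated in the talk.

```python
from datetime import datetime

# Assumed threshold: the talk only says a slow response looks like
# "a good full 1 or 2 seconds".
SLOW_DHCP_SECONDS = 1.0

# Hypothetical timestamp format: hours:minutes:seconds.milliseconds
TS_FORMAT = "%H:%M:%S.%f"


def dhcp_latency_seconds(previous_step: str, dhcp_success: str) -> float:
    """Seconds between the previous event and the DHCP success event."""
    start = datetime.strptime(previous_step, TS_FORMAT)
    end = datetime.strptime(dhcp_success, TS_FORMAT)
    return (end - start).total_seconds()


def is_slow_dhcp(latency_seconds: float) -> bool:
    return latency_seconds >= SLOW_DHCP_SECONDS


# Illustrative values only, a little under 300 ms apart.
latency = dhcp_latency_seconds("10:15:03.626", "10:15:03.907")
print(round(latency, 3), is_slow_dhcp(latency))
```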
All five of these steps right here, association, authorization, DHCP success, ARP success, and DNS success, make up the state machine that you should be seeing for any client that is joining the network. Anything that does not succeed, you will see in the bad segment.
And you’ll have packet captures for those: if there is an auth failure, or any kind of DHCP failure, such as a DHCP abort. As for DHCP denied, if your client comes in asking for an address in a subnet that does not belong to that particular VLAN, then you see a DHCP denied.
So this is a great place for you to come and see and troubleshoot every single time. But this state machine is very important. And we have it all set.
We have a lot more events that you can see. We don’t display all 150 states here. But these are the different events that could occur while a client is coming onto the network. You don’t have to understand all of them today. But you definitely need to know these five states that we spoke about.
Any questions on the state machine itself and where to find them on the network? Go ahead.
AUDIENCE: I noticed that some failures don’t have a [INAUDIBLE]. How do we– which ones don’t reject [INAUDIBLE]?
PRESENTER 1: It depends. If it is a DHCP denied, we don’t need a packet capture. In the description, on the right-hand side, we tell you that you asked for the wrong subnet. So it depends on the kind of failure.
Some failures require further troubleshooting, where the stated reason is not enough. Sometimes we just say, oh, there was an authorization failure. But you want to know at what stage of the authorization it failed. It goes through the whole [INAUDIBLE] process, the four-way handshake, M1, M2, M3, M4. Of all these steps, was it M3 failing, M2 failing? If you need that level of detail, that’s when you download a packet capture. Otherwise, we also tell you the reason.
One of the key things that I think is very important in that state machine is the RSSI at which the client is transacting. Sometimes it’s just transacting at a very low RSSI, -85 or -86 dBm, so it missed a whole [INAUDIBLE] packet and failed a transaction. Those are things that you can infer as well as part of troubleshooting.
The reason not all authorization failures have a packet capture is that we throttle those as well. Let’s say we see a series. Some guy has a bad password. He’s going to continue to fail until the password is changed on that system. And that means he attempts to connect to an AP tens of times within a minute, because it only takes milliseconds to fail a password transaction.
PRESENTER 2: So if every minute I store 10 packet captures for every authorization failure, for a problem where we already know what is going on, it’s a waste of our bandwidth and a waste of our storage. If we see a series of authorization failures, we don’t store a packet capture every time. That’s why you may see, hey, there are some red packets here that don’t have a packet capture. That means you go either down or up, and somewhere in the sequence there is another packet capture waiting.
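One way to picture this kind of throttling is a per-client time window: store a capture for the first failure in a series and skip the repeats until the window expires. This is only a sketch under that assumption; the talk does not describe the actual throttling policy, and the class name, the 60-second window, and the client identifier below are all hypothetical.

```python
class CaptureThrottle:
    """Allow at most one stored capture per (client, failure type) per window."""

    def __init__(self, window_s: float = 60.0):
        self.window_s = window_s  # assumed window length, in seconds
        self._last_capture = {}   # (client, failure_type) -> time of last capture

    def should_capture(self, client: str, failure_type: str, now_s: float) -> bool:
        key = (client, failure_type)
        last = self._last_capture.get(key)
        if last is not None and now_s - last < self.window_s:
            return False  # repeat of a failure we already captured recently
        self._last_capture[key] = now_s
        return True


throttle = CaptureThrottle(window_s=60.0)
# A client with a bad password failing repeatedly within about a minute.
decisions = [
    throttle.should_capture("aa:bb:cc:dd:ee:ff", "auth_failure", t)
    for t in (0, 5, 10, 59, 60, 61)
]
print(decisions)  # [True, False, False, False, True, False]
```

With a scheme like this, a failure event without its own capture still has a neighbor in the same series that carries one, which matches the advice to look up or down the sequence.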
PRESENTER 1: Captive Portal is layer 3 authentication. That means there is a small reversal in the order. The authentication happens after you have received an IP address. So the client comes in. It’s happily given an IP address. It then goes through authentication, where the Captive Portal pops up. To reach any portal, you need to have an IP address, hence the order. So please remember, there’s a small order change when you’re talking about captive portals. It’s layer 3 auth versus the more secure layer 2 auth.
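The order change can be summarized in a small sketch. The function and the auth type names are hypothetical. Placing the portal authentication after ARP and DNS is an assumption; the talk only says it happens after the client has an IP address.

```python
# Auth types the talk describes as layer 2 authentication.
LAYER2_AUTH = {"dot1x", "psk"}


def connection_order(auth_type: str):
    """Return the expected order of connection steps for an auth type."""
    if auth_type in LAYER2_AUTH:
        # Layer 2 auth: no IP address until authentication passes.
        return ["association", "authentication", "dhcp", "arp", "dns"]
    if auth_type == "captive_portal":
        # Layer 3 auth: the client gets an IP address first, then the
        # portal pops up. Exact position after DHCP is assumed here.
        return ["association", "dhcp", "arp", "dns", "authentication"]
    raise ValueError(f"unknown auth type: {auth_type}")


for auth_type in ("psk", "captive_portal"):
    print(auth_type, connection_order(auth_type))
```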
So apart from that, everything else remains the same. And this is very key for us to remember in terms of explaining SLEs as to how we came up with that number as well as troubleshooting clients.