PING! Not always what you think! – Meraki Wireless troubleshooting

I’m quite fond of the Meraki dashboard. I’ve seen firsthand how it can enable lean and low-skilled IT departments to manage more of their own networks themselves. The dashboard GUI makes it easy to find status and troubleshoot at a basic level, but it’s still important to actually understand what is going on under the hood.

Here’s an example. If you’ve ever seen the Meraki dashboard, you’ve probably seen the Ping tool on every client status page. Here, a Meraki AP is successfully Ping-ing my MacBook Pro:

pign-success-invalid-ip

Pretty straightforward. Ping the client, client responds, client is online and working, right?

If you have a Meraki Security Appliance, you may stumble across this little note on the Addressing & VLANs page:

ping-is-arp

Wait… Ping is based on ARP? What happened to ICMP?

We may have jumped to conclusions here. As it turns out, Meraki is not using ICMP like most of us would assume. Here’s an example of a few PCAP frames of that same Meraki AP ARP-Ping-ing my MacBook Pro:

directed-arp

Notice this is a directed-ARP; the Meraki AP (MAC 13:da:90) is sending an ARP request to the MBP (MAC 91:75:d8) rather than sending a broadcast. That is, the Meraki AP already knows the MBP’s MAC address. But the ARP response tells the Meraki AP that the MBP is alive, and online – just like an ICMP Ping.
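For reference, the only on-the-wire difference between a classic ARP request and a directed one is the destination MAC in the Ethernet header. Here is a minimal Python sketch that builds both as raw bytes. The MAC and IP addresses are placeholders (the capture only shows the last three octets of each MAC), the target-MAC field is simply zeroed, and this is not a claim about how Meraki actually constructs its frames:

```python
import socket
import struct

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"

def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))

def arp_request(src_mac: str, src_ip: str, target_ip: str,
                dst_mac: str = BROADCAST_MAC) -> bytes:
    """Build an Ethernet frame carrying an ARP request.

    dst_mac defaults to broadcast (classic ARP); pass the target's known
    MAC to get a directed (unicast) ARP request instead.
    """
    eth = mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6, 4,         # MAC length, IPv4 address length
        1,            # opcode 1 = request
        mac_bytes(src_mac), socket.inet_aton(src_ip),
        b"\x00" * 6,  # target MAC field, zeroed in this sketch
        socket.inet_aton(target_ip),
    )
    return eth + arp

# Hypothetical addresses, for illustration only.
AP_MAC, CLIENT_MAC = "02:00:00:00:00:01", "02:00:00:00:00:02"
broadcast = arp_request(AP_MAC, "10.11.30.2", "10.11.30.50")
directed = arp_request(AP_MAC, "10.11.30.2", "10.11.30.50", dst_mac=CLIENT_MAC)
```

Both frames carry the same 28-byte ARP payload; only the first six bytes of the Ethernet header differ.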

This brings about an interesting question. We network engineers often use Ping as a way to confirm that the network is working. A successful Ping means that routing, IP addressing and the physical path are all functioning correctly at layer 3. But if we’re doing a Ping at layer 2 with ARP – would we be wrongly assuming all is well when we get a response, just like with ICMP?

There is definitely some potential to make incorrect assumptions here. In fact, even though that screenshot above of the Meraki AP Ping-ing my MBP has a loss of 0%, at the time, my MBP had an incorrect IP address and was not Ping-able by other devices in the network (well, via ICMP at least). Here’s the PCAP’d ARP frames from the same time as that Ping output:

directed-arp-wrong-ip

Almost identical, except that both the ARP request and response are from 10.11.3.1, when the subnet is actually 10.11.30.0/24. However, the client is still responding, albeit at layer 2, and that’s good enough for the Meraki AP.
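To make the gap concrete: an ARP reply only proves that something at that MAC answered. It says nothing about whether the IP makes sense for the subnet. A quick check with Python's standard `ipaddress` module, using the addresses from this capture (the function name and the "good" address are made up for illustration):

```python
import ipaddress

def ip_fits_subnet(client_ip: str, subnet: str) -> bool:
    """Return True if client_ip is an address inside subnet."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(subnet)

# The MBP's actual (wrong) address versus the real subnet.
print(ip_fits_subnet("10.11.3.1", "10.11.30.0/24"))   # False
# A hypothetical correctly addressed client.
print(ip_fits_subnet("10.11.30.1", "10.11.30.0/24"))  # True
```

The ARP-based Ping succeeds in both cases, which is exactly why it can't stand in for a layer 3 test.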

Now, I do think this is one of those things where the vendor has made an odd decision to label this as a Ping without being clear about what is actually being done, but it is after all our responsibility as the network engineer to know what we’re looking at. There are similar examples where traceroute can use UDP or ICMP, depending on the OS, and now you know that sometimes Ping is ARP instead of ICMP.

Here’s the relevant documentation:

Meraki Ping Tool

ARGGHH ARRRRRRRP!!!

I run into this issue several times a year with a certain local DSL/Fiber service provider and ASA firewalls. Sometimes it consistently occurs after an outage; other times it is seemingly random. This is the relevant debug ARP output (modified for confidentiality):

arp-req: generating request for 16.197.160.254 at interface Provider1
arp-req: request for 16.197.160.254 still pending
arp-req: generating request for 16.197.160.254 at interface Provider1
arp-req: request for 16.197.160.254 still pending
arp-req: generating request for 16.197.160.254 at interface Provider1
arp-req: request for 16.197.160.254 still pending

What makes this a REALLY bad thing is that 16.197.160.254 is the ASA’s default gateway. No internet access for you.
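If you'd rather not stare at the console waiting for this, the "still pending" messages are easy to count. A rough Python sketch follows. The regex is based only on the lines in the excerpt above, not on any documented ASA log format, and the threshold of 3 is arbitrary:

```python
import re
from collections import Counter

# Pattern taken from the debug excerpt in this post (an assumption, not a spec).
PENDING = re.compile(r"arp-req: request for (\S+) still pending")

def pending_counts(log_text: str) -> Counter:
    """Count 'still pending' messages per target IP."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PENDING.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def stuck_targets(log_text: str, threshold: int = 3) -> list:
    """Targets whose ARP request was reported pending at least `threshold` times."""
    return [ip for ip, n in pending_counts(log_text).items() if n >= threshold]

sample = """\
arp-req: generating request for 16.197.160.254 at interface Provider1
arp-req: request for 16.197.160.254 still pending
arp-req: generating request for 16.197.160.254 at interface Provider1
arp-req: request for 16.197.160.254 still pending
arp-req: generating request for 16.197.160.254 at interface Provider1
arp-req: request for 16.197.160.254 still pending
"""
print(stuck_targets(sample))  # ['16.197.160.254']
```

If the address that comes back is your default gateway, you already know how the rest of the day is going to go.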

Unfortunately, I have little (no) visibility into the provider settings, but I can hazard a guess that there is some sort of spoofing protection in place. Sometimes a reboot of the ISP modem will fix the problem, but oftentimes we have to call the ISP and, while trying to explain ARP to tier 1 support should NOT be too difficult, it is typically an exercise in frustration. Not surprisingly, we are often asked to plug a laptop directly into the modem with the ASA’s static IP programmed on its NIC, which works, and causes the ISP to cheer “NOT OUR PROBLEM!” Of course, if I plug a laptop directly into the Provider1 interface on the ASA, it gets an ARP reply right away and communication to the “gateway” also works just fine. Ultimately, I have not been able to find an easy, repeatable fix from the side I have control over (the ASA), but sometimes the ISP clears an ARP table to solve the problem.

Something similar occurs from time to time with this ISP with NAT/PAT IP addresses that are NOT the ASA’s interface address. In this case, we can PCAP traffic leaving the ASA with the appropriate IP and MAC towards the ISP, but the ISP will never forward the return traffic to the ASA – the gateway doesn’t create an ARP entry for that IP.

In this case, an easy fix is to temporarily change the ASA interface IP to match the NAT IP. This causes the ASA to generate a gratuitous ARP and suddenly the return traffic gets delivered.
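For anyone wondering why that works: a gratuitous ARP is simply an ARP packet whose sender IP and target IP are the same address, sent to the broadcast MAC so that neighbours refresh their tables. A sketch of the frame layout in Python. The MAC and IP are placeholders, this builds the request-style variant, and it is not a claim about the exact frame the ASA emits:

```python
import socket
import struct

def gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build a broadcast gratuitous ARP (request form: sender IP == target IP)."""
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, our MAC as source, ARP ethertype.
    eth = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1, 0x0800, 6, 4,    # Ethernet / IPv4 and their address lengths
        1,                  # opcode 1 = request
        hw, addr,           # sender MAC and sender IP
        b"\x00" * 6, addr,  # target MAC empty, target IP == sender IP
    )
    return eth + arp

# Hypothetical values: a locally administered MAC and a documentation-range IP.
frame = gratuitous_arp("02:00:00:00:00:01", "203.0.113.10")
```

Any device that hears this frame can learn (or relearn) which MAC owns that IP without ever having asked.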

This is certainly different from the first case, where no amount of restarting or GARP seems to convince the ISP gateway to reply to the ASA ARP request for the gateway’s MAC, but I suspect the cause may be related.

I’m currently waiting for the ISP to determine if an engineer who actually knows how to log into the gateway router and look at an ARP table does indeed exist, or whether I’m more likely to watch a unicorn run a red light on my commute home. In the meantime, if anyone can explain what this ISP is doing to cause this behaviour, I’d love to hear it.

 

Wireless Professionals Compensation Survey

Today, Wednesday, November 30th, there will be a one-day independent survey to gather information on the current state of compensation in the Wireless LAN community. We respectfully request your participation in this 90-second survey. The current results will be available to all survey takers as they complete the survey for instant feedback. Later, the complete results will be freely reported back to the entire community.
Thank you for your support and participation!
http://surveymonkey.com/r/wlccs2016 – Wireless LAN Community Compensation Benchmark

CWNE #190

cwne-brennan-martin

I woke to some exciting news last week: an email from CWNP informing me that my CWNE application had been approved and that I am now CWNE #190!

This is a huge honor, and I am humbled to be part of the CWNE community. While I like to think that I have picked up some skills over the past decade, I owe a great deal of thanks to people whom I have met over the past year who have mentored and encouraged me to engage in the Wi-Fi community.

Thanks to Keith Parsons, Blake Krone, David Westcott, Peter Mackenzie, and Martin Ericson for sharing your insights and invaluable knowledge!

Onwards…

 

Cisco Live! recap – Tuesday

I was pretty exhausted Tuesday morning. With two young kids at home, I run on a minimum of sleep already, and all of the activity of CLUS pushed me over the edge very quickly. When the alarm went off Tuesday AM, I had to make a command decision – I knew I was not going to be able to keep up the pace set at the start without recharging my batteries, so I reset my alarm and went back to sleep.

Yep, I was a rookie at CLUS this year, and I’ll try to be better prepared next time. Unfortunately, that means I had to miss my morning session “Fog Computing at the World’s Largest Copper Mine”. Which is too bad, because I spend a lot of time in mining and we have been looking into Fog Computing a lot lately.

The presentation is missing from the link above, but thankfully the session video is there. Funny that the first thing presenter Francisco Soto says is “Thanks for waking up early for this.” Now I feel terrible.

Rowell, Rob, Mitch, and I had a meeting set up before lunch with Jerry Olla from Ekahau. I was stoked to be meeting some of the Ekahau crew! Jerry sprung for coffee, and that coupled with a little extra beauty rest meant I was having a pretty good start to day #2! Thanks Jerry!

Jerry answered some of our nagging questions about using Ekahau Site Survey. ESS is indispensable in the WLAN design stage for helping to develop real requirements and turn them into actual channel plans, power settings, antenna coverages. Without a tool like this, many people are just “eyeballing” WLAN design. Jerry also showed us a neat little project he had been working on using an ODROID C2 to build a custom iPerf server.

Jerry is awesome. We may have thought that there was some potential to abuse our new insider relationship with Ekahau to get our hands on some sweet, sweet swag:

Buoyed by the warm reception from Jerry, and our unrequited love for Ekahau, we weren’t ready to part ways just yet, so we headed over to WoS to meet the rest of the Ekahau crew!

Jussi Kiviniemi is well known in the Wi-Fi community, and an excellent representative of the Ekahau team. I was excited to meet him in person; he is all class! The whole Ekahau crew just has the right approach to being part of the community. For example, Hannele managed to dig out some primo swag in return for some testimonials:

Maybe I was excited most of all for the legendary Finnish candy. I wear my Ekahau Polo to work all the time.

I may have gotten a little carried away:

Also this happened. No explanations will be forthcoming.

Big Kudos to Ekahau for their continued community support and participation. Also a well deserved shout out to Robert Bartz!

There were TONS of other badass vendors to meet!

Check out Smitty and the team at Acceltex. They’ve got a nice spread of antennas and, even better, some well-thought-out brackets and mounts for their own gear and for Cisco APs with internal and external antennas. Meru Mitch swears by these guys.

We are all pretty pumped this summer about the new Aircheck G2 from NETSCOUT, and meeting their social media manager Kendall was another highlight. Robb was doing his best to see if an Aircheck G2 could find its way to his home somehow (Kendall, he ended up paying good money for one – so you won that round!). I think Kendall probably saw more of us at CLUS than anyone would have wanted, but if anyone could handle these hosers, it was her.

That wouldn’t be the last Kendall saw of us…

OK, I did also squeeze in a session Tuesday afternoon: BRKEWN-3014 “Best practices to deploy high-availability in Wireless LAN Architectures” with Patrick Croak.

  • Patrick had some great advice regarding minimizing the number of SSIDs:

BRKEWN-3014.pdf (page 17 of 120)

  • And a good reminder about client based CCI:

Check out the presentation link above for all of the good stuff.

I was looking forward to the evening events on Tuesday, namely… the CCIE Party! Steve, Mitch and I were off to the Hard Rock Hotel for a beach party.

With Food, Music and Beer taken care of, it was back to meeting some familiar names.


Brian McGahan with INE, whose videos are in part responsible for me qualifying to attend this party. Brian was pumped about my tramp stamp.

Amy Renee is a well known ginger CCIE with a well known sparkly bat. My kids are gingers, so we cool.

I also had the pleasure of meeting and shooting the breeze with Fish herself. I’m confident that I learned something just standing near her.

I learned that Tom Hollingsworth is also a fan of the CCIE tramp stamp, and I was glad to meet one of the minds behind Tech Field Day. I’d love to participate in a TFD or better yet Mobility Field Day, so fingers crossed!

The biggest surprise though, would come from closer to home than expected, when I had the pleasure of meeting another fellow Canuck and Wifi guru @wirelessstew!
I was blown away when Stew said that, not only did he know who I was from my (very new) online presence, but that he had hoped to meet me and find out more about me and my mobility experience. Stew would turn out to be the final, and likely most influential, member of the hosers group of Canadians (Stew, Steve, and myself) and honorary Canadians (Robb and Meru Mitch).

Us Canadians were a bad influence on Mitch:

But the night was still young, so we decided to brave the Vegas heat and check out some more of the CLUS festivities. It was only logical that we take our Canadianized Meru Mitch to the Cisco Canada party. Once again there was more beer, music and food, although we never did figure this part out:

So now I have to figure out how to introduce Mitch to poutine…

TL;DR: Tuesday was the meet/make friends/network day. CLUS is an amazing opportunity to make connections and I have already benefitted from my new connections.

Also, I was glad I slept in.

Stay tuned, more shenanigans came Wednesday.

Cisco Live! recap – Monday

I had a session booked at 8am. Breakfast started at 7 and with a 30 minute walk to the conference centre, I needed to get up at 6. So much for getting some rest while my kids weren’t waking me up!

BRKEWN-2000 “Design and Deployment of Wireless LANs for real time applications” with Jerome Henry! 

This is where I would finally meet the illustrious Rowell Dionicio of Packet 6 and the Clear to Send podcast; and my Canadian brother from another country, Meru Mitch! (that’s right, I know THE Meru Mitch!). Check out Mitch’s recap of CLUS 2016 as well.

Here are a couple of things that stuck with me from Jerome’s session:

  • People use only one real time application at a time. You have a FaceTime chat, then hang up and go onto something else. It’s pretty impractical to FaceTime and have a VoIP call at the same time. It’s safe to assume that one real time app will be the only bandwidth need while in use.
  • How to do the math. Some good examples of how to actually calculate the bandwidth need in a cell.
  • What is my client device’s max EIRP. Some more great examples. This info isn’t often published by the manufacturers. An iPhone 6s for example, has a worst max EIRP in the 5 GHz band of 10.3 dBm.
  • Here’s a big one: 802.11 devices decide which transmission rate to use based on the signal strength of transmissions received from the AP.

    • Think about this one. This goes towards the concept of matching transmit power.

These four points are covered in the first 30 pages out of 191; so I highly recommend you grab the PDF and have a read through. Lots of excellent concepts explained by Jerome.
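
The EIRP and rate-selection points lend themselves to a quick back-of-the-envelope check. A small Python sketch: the 10.3 dBm figure is the iPhone 6s number from the session, while the 17 dBm AP value is only an example chosen for illustration, and the helper names are made up:

```python
def dbm_to_mw(dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)

def downlink_advantage_db(ap_eirp_dbm: float, client_eirp_dbm: float) -> float:
    """How many dB louder the AP is than the client (positive = AP louder)."""
    return ap_eirp_dbm - client_eirp_dbm

client = 10.3  # iPhone 6s worst-case max EIRP at 5 GHz, from the session
ap = 17.0      # example AP EIRP, an assumption for illustration

print(f"client: {dbm_to_mw(client):.1f} mW, AP: {dbm_to_mw(ap):.1f} mW")
print(f"AP is {downlink_advantage_db(ap, client):.1f} dB louder than the client")
```

If the client picks its rate from how well it hears the AP, a gap like that may lead it to choose a rate that its own weaker uplink can't sustain, which is the argument for matching transmit power.
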

IMG_6448

Next was off to the opening keynote with the honorary hosers, Hub Holster Robb and Meru Mitch. Lasers, dancers, and Your Time is Now – the theme for CLUS.

28,000 attendees, and don’t forget about Cisco DNA!

IMG_6449

 

With a little time after the keynote before the next session, we couldn’t resist heading to the World of Solutions (WoS).

We caught up with Matt, and I had a specific stop in mind. We quickly located the giant sign in the distance pointing the direction to the Cisco certification lounge, and made our way over. I admit I couldn’t resist leaving my compadres at the general entrance when I saw the dedicated CCIE entrance! Being my first CLUS, I couldn’t resist trying out the perks.

Note for next year: the CCIE entrance is like the side door. We ended up at the exact same place, ha! Oh well, I got my fancy CCIE ribbon for my badge, but my fingers were crossed for something else… YES. They were in the back of the lounge!

I got my first CLUS tramp stamp. Now I was really on my way to being a CLUS veteran! Thanks to Robb, Mitch and Steve for putting up with me and generally encouraging the nerdery. Moving on.

We picked up our CLUS 2016 T-shirts and got our picture taken with Elvis:

IMG_6457
@Robb_404, Elvis, @mattouellette, yours truly, and #MeruMitch

We hung around WoS a little longer and started accumulating various vendor swag. Little did I know just how much swag we would end up with. Soon it was off to another session.

BRKRST-2515 “QoS Design and Deployment for Wireless LANs” with Robert Barton

Having recently completed the CWNP CWAP course, much of this was very familiar to me so thankfully it wasn’t hard to follow. QoS is always a complex topic.

Here are a couple of things that stuck with me from Robert’s session:

  • Do not use the Wired QoS Protocol field. This caught my attention because it was a departure from Cisco’s earlier best practices which I was familiar with. Admittedly, it had been a few years since I had really reviewed this topic (see my post When a 6 means 5 – Cisco WLAN QoS), so this stood out for me.
  • A lot of work on QoS has been done in WLC code recently. Time for me to start reading.
  • Introducing Fastlane. The fruits of the Apple/Cisco partnership are ripening. Big benefits to roaming and QoS for Apple devices, often with no admin intervention or config required.
  • EasyQoS. This should be a BIG deal. See @Cantechit’s post here for an excellent overview.

This session was a real brain session, so I enjoyed the change of pace in the next session:

BRKEWN-2019 – “7 Ways to fail as a Wireless Expert” with Steven Heinsius (contender for best Twitter profile pic, btw).

This was a fun session, and there were many great takeaways. Fun = engaging, and it did a great job of reinforcing some concepts we all know well, in a way that might help us explain it to our clients and users.

IMG_6460
You really can’t argue with that.
CnHuXXHUsAAaAFP.jpg-large
  • Channels 1, 6 and 11, folks.
  • Get moving to 5 GHz.
  • Don’t use max power.
  • Cisco says use RRM. But Cisco also tells us to tune RRM. Max power should be limited to 17 dBm and min to 5 dBm.
  • Everyone loves Bad-Fi.com.
  • Go take a look at Wigle.net. You can learn a lot from this site. No more TKIP/WPA1 or WEP. Even WPA2-Passphrase is not appropriate for the enterprise.
  • No, 802.11ac is not going to give you better than 1 Gbps throughput. See Andrew von Nagy’s webinar here.
  • Look for the “Best Practices” slides in Steven’s Presentation. Good rules of thumb to keep handy.

This was a great way to end day #1. Next was back to WoS for more swag! There were so many vendors to see, and even better, the food and beer were everywhere we looked. We paused at a table at one point and were joined by a very friendly and unassuming woman looking for a spot to put down her plate. As luck would have it, we met Cisco’s social media manager Laura Babbili!
IMG_6468

How cool is that. Laura is part of the top brass of Cisco’s social media team, including @CiscoLive. We picked her brain about the inner workings of the Cisco Live twitter team, but I’m sure it’s a complete coincidence that we had several twitter pictures on the big screen at the closing keynote…

 

On our way out of WoS Robb, Meru Mitch, Steve, and I took our commemorative photo with the Cisco Live! sign:

IMG_6470.JPG

But the day doesn’t end there! (See why I am breaking these posts up by day??)

Robb and I met back up with Rowell at a lounge in the Cosmo hotel, where I had the pleasure of meeting Ryan Adzima and Richard Macintosh, two Wi-Fi wranglers who are also well known in the community. I admit I was pumped when Richard saw my twitter handle and pointed out that he had read some of my posts. It was cool to know that, even if I am embarrassing myself online, I am getting a little bit of attention in the community I am hoping to be a part of.

I’m aiming to recap the rest of the week soon! Stay tuned.

my NETSCOUT Aircheck G2 calls me slow :(

As we’re all getting used to our nifty new gadget, we’re bound to scratch our heads over a few things. My honorary Canadian friend MeruMitch and I were noting a particular output from the G2 which seemed odd.

I’m connected to my AP at a data rate of 867Mbps:

wifisignal data rate
Wifi Signal is the shizzle

And a quick Wifi Perf says I can push 115 Mbps of actual throughput (wireless to wireless over one AP):

FullSizeRender

But the G2 is adamant that my connection rate is 24 Mbps:

Screenshot0003
Um… Wha?

Well let’s think about this for a minute…

The G2 shows a “PHY Data Rate” of 780 Mbps connecting itself to that AP:

Screenshot0000.png

And if we check the G2 manual:


Ok, so that makes sense. This is what we’re expecting to see, and if you watch this field, it fluctuates a bit in the 780-867 Mbps range.

So what then is the “Connection Rate” which is stuck at 24Mbps? Again from the user guide:

Interesting! Now I bet some of you are thinking that 24Mbps must be a basic rate in this WLAN; and you would be correct!


The key here is that the G2 is looking at the data rate of individual frames, no averages, and showing us only the last frame’s rate.

So let’s grab a basic pcap and take a look. Forgive me, this pcap is of my iPhone 6S+ rather than the MBP that I started this post with, but the relevant bits are the same. The iPhone was associated to an AP with a maximum data rate of 300 Mbps, and the G2 stayed fixed on the connection rate of 24 Mbps.

This pcap is of my iPhone doing a speed test, and I’ve filtered it to only show frames from the client, based on that definition from the G2 guide above.

 

Most of the data, by Bytes, is in data packets:

TopProtobyBytes

There are a total of 22,508 Bytes sent by my iPhone, and 8,510 Bytes are data packets using the 300 Mbps data rate (38% of the captured Bytes, matching the pic above). 33.5% of the Bytes are sent at 24Mbps… surely the G2 should have, even briefly, shown something at 300Mbps?

Packet Bytes

But the G2 isn’t looking at Bytes… it’s looking frame by frame! Look at the data rates of each frame:

packets

Better yet, let’s change the previous output to see the numbers for each frame:

Packet Numbers

OK, so my iPhone sent 87 frames at 300 Mbps and 300 at 24 Mbps, clearly the majority of frames. This is a good explanation for why the G2 appears to say that the connection rate is always 24 Mbps.
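The bytes-versus-frames distinction is easy to reproduce. Here is a small Python sketch with a made-up capture (not my actual pcap): a few large data frames at 300 Mbps surrounded by small frames at 24 Mbps. The function name and the frame sizes are purely illustrative:

```python
from collections import defaultdict

def shares(frames):
    """Return (share_by_frames, share_by_bytes) per data rate.

    frames is a non-empty list of (rate_mbps, length_bytes) tuples
    in capture order.
    """
    count = defaultdict(int)
    size = defaultdict(int)
    for rate, length in frames:
        count[rate] += 1
        size[rate] += length
    total_frames = len(frames)
    total_bytes = sum(size.values())
    by_frames = {rate: n / total_frames for rate, n in count.items()}
    by_bytes = {rate: b / total_bytes for rate, b in size.items()}
    return by_frames, by_bytes

# Hypothetical capture: small frames at 24 Mbps around large data frames at 300 Mbps.
frames = [(24, 20), (300, 1500), (24, 14), (24, 20), (300, 1500), (24, 14), (24, 14)]

by_frames, by_bytes = shares(frames)
last_rate = frames[-1][0]  # what a "last frame only" display would show

print(f"24 Mbps share of frames: {by_frames[24]:.0%}")
print(f"300 Mbps share of bytes: {by_bytes[300]:.0%}")
print(f"last frame rate: {last_rate} Mbps")
```

Even when nearly all of the bytes move at the high rate, most of the frames (and very likely the last one) are at the basic rate, so a field showing only the latest frame will sit at 24 Mbps most of the time.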

It goes to show how important it is to know how to interpret what our tools are telling us. Note that only data frames in my capture were sent at the 300Mbps data rate. The control frames and some of the data frames are all sent at 12/24Mbps – our basic rates. Which is a nice plug for CWAP!

Now I’m not really sure how useful the “connection rate” field really is in its current form. Showing the last frame rate seems less useful than showing the “PHY data rate” that the G2 uses when it connects to an AP itself.

I’d love to hear what anyone else thinks of the “connection rate” field…