Over the last 12 months since we’ve installed our J9051A ZL Wireless Edge Services Module (WESM), we’ve had some intermittent issues with some of our wireless notebooks causing the WESM to freak out and run at 100% CPU and boot all of our wireless stations of the Wireless network. These notebooks work fine on the WLAN 99% of the time but every now and then they freak out and cause problems with the TKIP integrity check.
The notebooks that we’ve had trouble with have had either an Intel or Broadcom wireless NIC:
- Acer 3230 with WLAN: integrated Intel® PRO/Wireless 2200ABG
- HP 1100 Tablet with WLAN: Intel PRO/Wireless LAN 2100 3B Mini PCI
- Motion Computing Tablet with WLAN: Broadcom Wireless
We have many other 3230’s on the WLAN and one other Motion tablet that are identical to the machines that we’ve had issues with, but no other machines have caused the TKIP failure. We’ve updated drivers for the wireless NICs and installed all Windows updates and still haven’t been able to correct the problem?
When the TKIP failure occurs the CPU on the WESM hits 100%, see below, while it tries to perform the TKIP Countermeasures. TechDuke has a great explanation of the TKIP Message Integrity Check (MIC), and explains that when a wireless station fails the MIC, or Michael, hash check twice within 60 seconds then all wireless stations are booted off the wireless network for a minute and forced to reconnect/re-authenticate. Zack de la Rocha was way ahead of his time when he wrote Mic Check, he explains the MIC failure perfectly…
Rage Against The Machine – Mic Check (The Battle of Los Angeles (1999))
Oh Wait a minute now
Ha ha ha
Wait a Minute Now
The Diagnostic page on the WESM showing that the CPU had been running at 100%, as soon as we disabled Dial-in access in Active Directory for the offending notebook we broke the Radius authentication and the WESM went back to running as normal.
Wireless Edge Service Log
1: Feb 06 11:27:46 2009: %CC-4-TKIPCNTRMEASSTART: TKIP countermeasures started on wlan 1
2: Feb 06 11:28:15 2009: %MGMT-4-OTHERREQQUED: request queued in delegated requests
3: Feb 06 11:28:46 2009: %CC-4-TKIPCNTRMEASEND: TKIP countermeasures ended on wlan 1
4: Feb 06 11:28:47 2009: %CC-4-TKIPMICCHECKFAIL: TKIP message integrity check failed in frame on wlan 1
5: Feb 06 11:28:47 2009: %KERN-3-ERR: mic check failure <00-13-CE-04-FB-4A>. (pkt_len 360 prio: 0) rx: <2B-2B-02-DC-00-FF> calc: <A7-CE-2B-B8-45-C6>.
6: Feb 06 11:28:51 2009: %CC-4-TKIPMICCHECKFAIL: TKIP message integrity check failed in frame on wlan 1
Checking the message log on the WESM, we could identify which machine was failing the MIC and which WLAN they were connected to. Line 5 shows the MIC check failure and the MAC address of the offending machine.
We traced the offending MAC address back to its owner via the DHCP console and disabled Dial-in access for the computer and on the user account of its owner. This causes WLAN Radius authentication to fail the EAP-TLS auth because a valid certificate and dial-in access are required for access to that particular WLAN.
Currently we’re still running the original firmware wt.01.03 that came with the ZL module, but will update to wt.01.15 shortly and test these machines on the WLAN to see if the updated firmware can handle integrity check failure with a little more grace than the original firmware.