Aastra phones rebooting after DHCP lease changes

Working at a service provider for hundreds of VoIP customers with thousands of phones means you’re going to encounter problems that make you scratch your head. Every now and then one of those problems gets escalated to me and I’m left figuring out why something is happening. Sometimes that means running packet captures on all sorts of equipment, installing in-line network taps to grab traffic off the wire, or trudging through debug logs. Every now and then I get lucky, figure out the problem before we get started and look like a wizard. More often though, I’m left just as confused as everyone else.

This problem has come to me in a number of iterations, but it’s always intermittent, unexplained phone reboots that appear to have no rhyme or reason. Mid-call or sitting untouched, whatever the usage scenario, the customer was complaining that their phones were rebooting, affecting multiple Aastra models. The last time this was happening with noticeable frequency was several months back, and we thought we nabbed the problem when a provisioning error related to an access control list (ACL) on a switch was fixed. Phones stopped rebooting, the ticket was closed and the customer was happy.

Fast forward to the other day when the problem reared its ugly head again after multiple phones in their office rebooted unexpectedly. The ACL was still in place, so this was something new. What could it be this time?

Digging in

When I was working on this issue last time, I pushed out a temporary config to their phones that enabled remote syslogging of debug information. The idea was that the phones would spit out some sort of useful log line prior to their reboot, like memory being exhausted or an error code that would lead us to understand the problem. Around the same time as that, I discovered the ACL issue, so those logs were never needed. I decided to leave the syslogging in place as a precaution though, just in case unexplained reboots occurred in the near future. Now that the problem was happening again, I finally had information I hoped would lead me to an answer.

Determining any weird problem is never that simple however, and the phones didn’t log any information prior to their reboot, giving me nothing to go on. The Aastra phones offer very detailed log levels, ranging from just a trickle of debug information all the way up to a full on tsunami of completely indecipherable garbage. Apparently whatever was happening would not be exposed at the current level, but it did show us that more phones than expected had rebooted, but only once.

What now?

It turned out that our network guys had performed maintenance on their switch the night before, and while correlation is not causation, it was at least a good place to start investigating next. I caught a lucky break as looking at the switch logs gave us our next clue, DHCP address conflicts.

Normally when a DHCP client’s lease expires it just renews the lease for its current IP, and everything hums along happily. In this case though, you can see a client declining address after address. We can only assume from this that the client was denied renewing its current lease, and was unhappy with the IPs offered due to conflicts. Why would the switch’s DHCP agent do this though? They maintain a database of current bindings to prevent giving out IPs that are already leased. As it turns out though, due to incomplete configuration of DHCP on the switch, the reload after maintenance wiped the binding database clean, and it was clueless about existing leases. When a client’s lease timer expired and it sought a renewal, it was being given a new IP. Could this change cause Aastra phones to reboot?

Confirming the problem

The case was easy enough to test in our lab which already had one of the same model phones, an Aastra 57i, addressed via DHCP. The test plan was to decrease the lease time to the shortest interval (one minute), then deny leasing its IP, and finally remove the existing binding to cause a new lease to be generated.

After rebooting the test phone to make it aware of the shortened lease time, within a minute the switch said that it could no longer lease the IP it wanted and voila! The phone’s screen lit up, posted a quick message that its IP had changed, and it promptly rebooted! The problem was confirmed, the mystery is solved, but the result is awful. No poltergeists here, just an annoying network implementation. The customer also has Aastra 6731i phones, but one was not available in the lab to test against.

Solving it for good

In the end, the problem can be solved a couple different ways. The better long term solution is to centralize DHCP to dedicated DHCP servers, rather than relying on Cisco’s DHCP agents, and that is one of the long term goals for our VoIP network. The other method of attacking the problem is configuring a DHCP database agent, allowing the switch to store its DHCP binding database on a remote FTP or TFTP server, or even locally on flash.

This is the simplest way for us to ensure the binding database is maintained between reloads in the short term, meaning we can continue with switch maintenance without this kind of disruption to our clients. With any luck, maybe this will be the last I hear of rebooting phones!

One thought on “Aastra phones rebooting after DHCP lease changes

  1. Ahmed

    Thanks for sharing. will try out your suggestion and let you know if my similar issue is resolved. Sounds pretty promising from your comments. God bless!

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *