
A likely problem in any multi-tenant network.

In any local area network, there is generally one and only one DHCP server. Dynamic Host Configuration Protocol (DHCP) is a network protocol that enables a server to automatically assign a unique IP address to any local network computer from a defined range of numbers (i.e., a scope or subnet) configured for a given network.

For example, when a computer starts up on a local area network, the router, which typically acts as the DHCP server, gives it a unique IP address so it can reach other network resources and the internet. If you introduce a second DHCP server on the network, you wreak havoc on every computer trying to get an IP address. With multiple DHCP servers, different computers get addresses in unrelated subnets: some get a 192.168.1.X address, others get 192.168.2.X, and still others get 10.1.10.X. Each machine takes its address from whichever DHCP server responds fastest. However, there is one and only one gateway, and it brokers all traffic leaving the network. A computer that lands on a different subnet cannot reach that gateway, so it cannot reach the internet.
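To make the subnet problem concrete, here is a minimal sketch using Python's standard ipaddress module. The gateway address 10.1.10.1 and the /24 prefix length are assumptions for illustration; the story above only tells us the primary network is 10.1.10.X.

```python
import ipaddress

# Assumption: the real gateway sits at 10.1.10.1 (hypothetical address).
GATEWAY = ipaddress.ip_address("10.1.10.1")

def can_reach_gateway(leased_ip: str, prefix_len: int = 24) -> bool:
    """Return True if the leased address is on the same subnet as the gateway.

    Assumption: every DHCP server hands out a /24 mask.
    """
    iface = ipaddress.ip_interface(f"{leased_ip}/{prefix_len}")
    return GATEWAY in iface.network

# One lease from the legitimate server, two from competing servers.
for ip in ["10.1.10.57", "192.168.1.23", "192.168.2.40"]:
    status = "ok" if can_reach_gateway(ip) else "stranded on a foreign subnet"
    print(f"{ip}: {status}")
```

Only the machine holding the 10.1.10.X lease passes the check. The other two have perfectly valid-looking addresses, which is exactly why this problem is easy to miss.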

We have a residential client that provides shared internet access to each tenant in a multi-tenant facility. Two tenants who were moving in decided to add their own routers to the network to provide wireless internet access to all the computers in their units. The problem is that they connected the wrong network interface of their routers (most likely a LAN port instead of the WAN port) to the building network connection. This created multiple DHCP servers on the same network. So, when some residents many floors away went to access the internet, they were greeted with a "page not found" error, only because a rogue DHCP server had assigned them an incorrect IP address (outside the range of their primary gateway).

We were notified that the internet was down; however, our internet monitoring software showed that the internet was up. We saw no problem with the internet connection, and our monitoring servers would have notified us of the slightest outage. Further investigation revealed that when we unplugged the main network switch from the internet router, we were still being assigned an IP address. That should never happen. Voila! A rogue DHCP server! Now we just had to identify which of the 50 different units housed the rogue router. We isolated one of the routers by trial and error, unplugging various connections until pings to the culprit IP address failed. However, upon further diagnosis, we found a second rogue router. So now we had one rogue DHCP server on 192.168.0.X and another on 192.168.2.X, while the primary network was on 10.1.10.X!
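The "who is handing out addresses?" question can also be asked directly: broadcast a single DHCPDISCOVER and list every server that answers with an offer. The sketch below is not the tool we used. It is a minimal illustration of the packet layout from RFC 2131, and the function names (build_discover, parse_reply, find_dhcp_servers) are made up for this example. It assumes a single-homed machine, permission to bind UDP port 68 (usually root), and no other DHCP client holding that port.

```python
import os
import socket
import struct
import time

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of DHCP options (RFC 2131)

def build_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER with the broadcast flag set."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,        # op: BOOTREQUEST
        1,        # htype: Ethernet
        6,        # hlen: MAC address length
        0,        # hops
        xid,      # transaction id, echoed back by servers
        0,        # secs
        0x8000,   # flags: ask servers to reply by broadcast
        b"\0" * 4, b"\0" * 4, b"\0" * 4, b"\0" * 4,  # ciaddr, yiaddr, siaddr, giaddr
        mac,      # chaddr (struct pads it to 16 bytes)
        b"", b"", # sname, file (zero-filled)
    )
    options = bytes([53, 1, 1, 255])  # option 53 = DHCPDISCOVER, then END
    return header + MAGIC_COOKIE + options

def parse_reply(packet: bytes):
    """Return (xid, offered_ip, server_id) for a DHCPOFFER, else None."""
    if len(packet) < 240 or packet[236:240] != MAGIC_COOKIE:
        return None
    xid = struct.unpack("!I", packet[4:8])[0]
    offered_ip = socket.inet_ntoa(packet[16:20])  # yiaddr field
    msg_type, server_id = None, None
    i = 240
    while i < len(packet):
        code = packet[i]
        if code == 255:          # END option
            break
        if code == 0:            # PAD option has no length byte
            i += 1
            continue
        if i + 1 >= len(packet):
            break
        length = packet[i + 1]
        value = packet[i + 2:i + 2 + length]
        if code == 53 and len(value) == 1:
            msg_type = value[0]
        elif code == 54 and len(value) == 4:   # server identifier
            server_id = socket.inet_ntoa(value)
        i += 2 + length
    if msg_type != 2:            # 2 = DHCPOFFER
        return None
    return xid, offered_ip, server_id

def find_dhcp_servers(timeout: float = 5.0) -> dict:
    """Broadcast one DISCOVER; return {server_id: offered_ip} for each answer."""
    xid = struct.unpack("!I", os.urandom(4))[0]
    mac = b"\x02" + os.urandom(5)  # random locally administered MAC
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind(("", 68))
    servers = {}
    try:
        sock.sendto(build_discover(mac, xid), ("255.255.255.255", 67))
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                data, _ = sock.recvfrom(4096)
            except socket.timeout:
                break
            reply = parse_reply(data)
            if reply and reply[0] == xid:
                servers[reply[2]] = reply[1]
    finally:
        sock.close()
    return servers

# Usage (run as root on the affected LAN):
#   for server, offer in find_dhcp_servers().items():
#       print(f"DHCP server {server} offered {offer}")
```

On a healthy network the result holds one entry. In our case it would have been expected to show servers on 192.168.0.X and 192.168.2.X alongside the legitimate one on 10.1.10.X. Knowing the rogue server's IP gives you something to ping while unplugging connections.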

At this point, we configured a diagnostic machine inside their LAN with IP aliases on each subnet. We then accessed the config page of the 192.168.0.1 router to disable DHCP and Wi-Fi access on that subnet. We then did the same for 192.168.2.X, and “presto chango”, we finally had access to the true subnet of 10.1.10.X. We had each of the tenants reboot their computers and access points, and their internet connectivity was restored.
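For readers who have not set up IP aliases before, the idea is to give one network interface an extra address inside each rogue subnet so the machine can talk to each router directly. The sketch below only prints Linux iproute2 commands and does not change any configuration. The interface name eth0, the /24 masks, the host number .250, and the 192.168.2.1 router address are all assumptions for illustration.

```python
import ipaddress

def alias_commands(gateways, iface="eth0", host_offset=250):
    """Return one `ip addr add` command per rogue subnet.

    Assumptions: each rogue router uses a /24 network, and the chosen
    host number (.250 by default) is not already leased to someone else.
    """
    cmds = []
    for gw in gateways:
        # strict=False lets us derive the network from a host address.
        net = ipaddress.ip_network(f"{gw}/24", strict=False)
        alias = net.network_address + host_offset
        cmds.append(f"ip addr add {alias}/{net.prefixlen} dev {iface}")
    return cmds

# 192.168.0.1 comes from the story; 192.168.2.1 is a guess at the second router.
for cmd in alias_commands(["192.168.0.1", "192.168.2.1"]):
    print(cmd)
```

With an alias in place on each subnet, the router's config page is reachable from a browser on the diagnostic machine even though its primary address is still on 10.1.10.X.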

We then notified all the tenants that connectivity was restored, with the exception of two tenants. Once the two offending tenants contacted us, we reconfigured their routers, re-enabled their Wi-Fi, and saved the day.

The beauty of all this was that we managed to do it while 60 miles away from our office. Other than the initial plugging and unplugging of data jacks, we were able to accomplish the balance of the diagnosis remotely.