In this article, Rory Conaway describes a nightmare of a project and the lessons learned.
Many WISPs do one-off wireless projects. As consultants, we get referrals for many projects from a variety of sources. I have partnered or been lead on many projects, some of which I’ve written about here, others that have been profiled in magazine articles, and some of which I’d rather put in the category of “what was I thinking”. These projects are a great way to add to the bottom line, build reputation and experience, work with new products, and get away from users who think their sole and final purpose in life is to send me speedtest screenshots 6 times a day. When someone refers a project to us, we do everything possible to make that referring company or client look good on the project. However, sometimes these projects are engineered incorrectly and there is no way in good conscience to complete the installation. Even worse, if you complete the installation and it doesn’t work or work well, you get blamed. This damages your reputation with the client or the referring company, meaning no more referrals. It’s a total lose-lose situation that reminds me of the decision you have to make when the doctor asks you to choose between a laxative and a suppository.
Normally when a link has to go 800 feet outdoors, it’s a slam dunk. I always assume that if I can hit a golf ball farther than my connection, pretty much any radio is going to work, especially if I’m using 23dBi panel antennas. The project I will discuss in this article involved Vegas, partying, a family visit, and oh yeah, a couple of hours plugging in some high-end indoor APs for the link as a side project. However, when the scouts fail to mention that the play coach is Buddy Ryan (a very, very sad excuse for an NFL coach), Shaq is guarding the hoop and you have to play with cement gym shoes, that’s going to be a problem (Yeah, I know, two different sports, I just couldn’t think of any basketball coaches as bad as Buddy). My mistake was not asking for that information before we got there. The assumption was that the SME for the manufacturer (and he is highly experienced and respected in his company and with his clients in the networking field) had experience in this type of deployment, that the equipment had been used and tested for this application, that he had done proper due diligence and a site survey, and that everyone actually had the customer’s best interest in mind. This was a mistake that I won’t ever let happen again. “Trust but Verify” is currently getting painted all over the walls in my office for future projects so this never occurs again.
In this project, we were asked to install a PTP link to replace some “failed” (a loose term that apparently means different things to different people) Cisco outdoor radios. The plan, as designed by the SME, was to use indoor APs from another manufacturer with 100 feet of coax, lightning protectors, and 23dBi outdoor antennas. Since the only cable the manufacturer sells for this type of deployment that is FCC approved is LMR-600, cable loss at 5.8GHz isn’t a big problem. Times Microwave has the loss at 7.3dB per 100 feet at 5.8GHz. The APs were 17dBm maximum output. You would think that would be simple enough. As it turns out, the next time you have to choose between installing equipment that an SME with limited or no outdoor RF experience designed without ever going onsite or doing an RF site survey, let alone analyzing why the previous equipment wasn’t working, and having your chest hair ripped off with a pair of tweezers one hair at a time, take the second, less painful option. And triple your budgeted time on the project, since you are going to have to do all the work that the SME should have done before selling thousands of dollars of equipment. We budgeted one day. We should have just rented a house and had our licenses changed to Nevada residency.
Day 1, we get up there to start the install. First problem that nobody mentioned: the network administrator at the facility is on vacation. The desktop support guy has no idea what’s going on, but he does have keys, points us to where everything is installed, and fails to mention that he is leaving right on the dot at quitting time with no notice to us. This little detail resulted in enough extra roof time to cook a turkey.
However, this was a great setup with boxes of equipment, a nice room to stage everything with tables, a water fountain in the next room and a private bathroom 3 feet away. We are styling and I’m already thinking through my blackjack strategy for later that night. After inventorying everything, we go up to the roof, see the old equipment up there and notice that the 75-ohm cable that the Cisco radios used is pulled through a 1.5-inch conduit with a 180-degree bend at the top with the cap. Then we looked at the LMR-600 cable with pre-installed N-connectors that the manufacturer supplied and the 2 dual-polarity antennas that we needed to connect. Then we looked at the 1.5-inch pipe again. And let me not forget to mention that there are 90-degree bends in the pipe along the way. Since we don’t have a proctologist on staff, there was no way a single LMR-600 cable was going through that pipe, let alone 4 with connectors already on them. LMR-400 would also be pretty close to impossible. Since the run is probably about 80 feet, LMR-195 was out of the question too, due to signal loss. The only option left was trying to use LMR-240, which we don’t carry and which has a signal loss of 20.4dB per 100 feet at 5.8GHz. Not only is that borderline, but we had to put in lightning protectors at the top, meaning our loss was going to be about 16-18dB plus connectors. I estimated a total loss of about 20dB.
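To put numbers on why the cable swap hurt, here is a rough sketch of the arithmetic in Python. It uses only the figures quoted in this article: the per-100-foot loss values at 5.8GHz, the 80-foot run, the 23dBi antennas and my 20dB total-loss estimate. It also assumes the APs' quoted maximum output of 17 is transmit power in dBm. The function and variable names are made up for illustration, and this is a back-of-the-envelope check, not a substitute for a real link budget.

```python
# Attenuation in dB per 100 ft at 5.8 GHz, as quoted in the article.
LOSS_DB_PER_100FT = {"LMR-600": 7.3, "LMR-240": 20.4}


def cable_loss_db(cable, length_ft):
    """Loss of a coax run, scaled linearly from the per-100-ft figure."""
    return LOSS_DB_PER_100FT[cable] * length_ft / 100.0


def eirp_dbm(tx_dbm, total_loss_db, antenna_gain_dbi):
    """EIRP = transmit power - feed-line losses + antenna gain."""
    return tx_dbm - total_loss_db + antenna_gain_dbi


TX_DBM = 17.0    # assumption: AP maximum output is 17 dBm
ANT_DBI = 23.0   # panel antenna gain from the article

# Design as sold: 100 ft of LMR-600, ignoring protectors and connectors.
planned_loss = cable_loss_db("LMR-600", 100)
planned_eirp = eirp_dbm(TX_DBM, planned_loss, ANT_DBI)

# What fit in the conduit: about 80 ft of LMR-240.
lmr240_loss = cable_loss_db("LMR-240", 80)

# Author's estimate once lightning protectors and connectors are added.
ESTIMATED_TOTAL_LOSS_DB = 20.0
actual_eirp = eirp_dbm(TX_DBM, ESTIMATED_TOTAL_LOSS_DB, ANT_DBI)

print(f"Planned: {planned_loss:.1f} dB loss, {planned_eirp:.1f} dBm EIRP")
print(f"LMR-240 cable alone: {lmr240_loss:.1f} dB loss")
print(f"Estimated actual: {actual_eirp:.1f} dBm EIRP")
```

Under those assumptions the feed line eats more than the radio puts out, so the 23dBi antenna ends up radiating roughly what a bare consumer AP would.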
Since we couldn’t install the new radios and we were told everything was down and this was a critical emergency, we decided to program a pair of NSM5s to at least establish a temporary link until the LMR-240 came in and we could do the job. That’s when the old adage of “no good deed ever goes unpunished” comes into play. The hospital cabler had his guys put in some Cat-5 while we pulled the Cisco equipment down and swapped in the NSM5s. When we tried to come down off the roof, our contact in the IT department (remember that I mentioned he is as punctual as an Italian train schedule under Mussolini) was gone, and he hadn’t notified maintenance that we were on the roof. It was about 108 degrees ambient and probably closer to 125 on that roof. That’s when we discovered that maintenance had locked the roof exit door behind us and our IT contact had left without so much as a heads-up that two schmucks were cooking up there. Since now we couldn’t test the link because we had no access to the data center and we had to order LMR-240 anyway, we went back home to Phoenix that night. No blackjack, and I was more than a little miffed, as well as sunburned.
When the cabling and connectors came in, we headed back up to Vegas to try and finish while again estimating one day. That’s when we ran into the fact that the AP manufacturer had never tested these radios in a PTP stand-alone WDS environment. In fact, the APs were rebooting every 30 seconds when we first plugged them in because they didn’t have a DHCP address from a central controller. Apparently stand-alone means different things to different manufacturers also. After wasting an hour of playing who’s faster, us or the reboot function, we contacted tech support. 4 hours and 2 manufacturer support technicians later, neither of whom had ever worked on the radios, we at least got them to stop rebooting. We also discovered one of the four APs was bad. After wasting another hour trying to get the WDS link up, we tried to get tech support back but it was too late in the day. We were stuck in Vegas for another day. Not a bad place to be if you think you will finish the next day. As I was to learn the next day, that was not to be the case.
The next morning, we try to set the APs up in WDS mode. After wasting another hour realizing that this firmware was written by engineers that probably hadn’t seen a GUI from a competitor since 2003, we got the manufacturer’s technical support team back on the phone. Two more new support technicians, neither of whom had set these up in a PTP link as stand-alone devices without a centralized controller, worked on this with us for another 2-plus hours with no luck. Here is a clue for manufacturers: if your wireless technical support staff, supposedly the ones most qualified and trained on your product, working with field installers who have hundreds or thousands of installations under their belt across multiple manufacturers, can’t make 2 APs sitting right next to each other on a table connect over WDS after several hours, “YOUR GUI INTERFACE IS PATHETIC”. By lunch time, we both gave up and they said goodbye. 10 minutes after hanging up with us, the magical WDS link mysteriously comes up. What I glossed over was the fact that in addition to the 30-second reset problem that we originally experienced, a couple of times the radios would lock up on a software reboot and had to be power cycled. Sometimes just changing a field would totally lock up the AP and we had to reset it back to factory settings and start over. Oh yeah, this is the product I want to put up in a mission critical environment. 3 wasted days on this project and I haven’t even covered how our temporary NanoStation M5s were doing.
While all this was going on, the person who uses this link comes in and shares the actual user experience data that we should have gotten from the beginning. It’s amazing to me that nobody ever listens to the end user. Apparently the Cisco radios never went down, they just had problems handling the VoIP traffic. They worked fairly well for data but there were delays. In addition, the temporary NSM5s we put up were garbling the phone calls, although the data transport seemed to be faster. Ahh, the elusive site survey that should have been done from the beginning now comes back and bites me.
We start checking things out with the NSM5s set for the 5.8GHz band. It’s basically toast, even at 800 feet. The noise level was at -52dBm up to 5.8GHz and it’s not much better up to the top of the band. Even though the NSM5s are connecting perfectly (we adjusted power levels to get receiver signal levels from -40dBm to -55dBm) with 98% CCQ levels and 130Mbps on both polarities, VoIP calls were still garbled. Random testing with various channel sizes and even invoking the mighty AirMax couldn’t get rid of the garbling. It was time to start hanging out in the DFS channels. Since we were only about 4 miles from the airport, 5.6GHz wasn’t an option. I’ve seen issues with DFS alerts as far down as 5.4GHz when you are that close, so hanging in 5.2-5.3GHz was going to be the only option. A check with AirView showed no noise down there so things were looking up. I’m not a big fan of using DFS channels for any VoIP mission critical applications for obvious reasons, but when your options are this and none, you take what you can get.
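For anyone wondering why 5.8GHz was toast on an 800-foot shot, the short Python sketch below runs the two relevant calculations: free-space path loss over the link and the signal-to-noise margin against the noise reading. It treats the quoted noise and receive levels as negative dBm values (the usual convention, and an assumption here), and it treats the noise reading as a steady level, which real interference rarely is. The function names are made up for illustration.

```python
import math


def fspl_db(distance_ft, freq_mhz):
    """Free-space path loss in dB (standard formula for km and MHz)."""
    distance_km = distance_ft * 0.3048 / 1000.0
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz) + 32.44


def snr_db(signal_dbm, noise_dbm):
    """Signal-to-noise ratio in dB, both inputs in dBm."""
    return signal_dbm - noise_dbm


NOISE_DBM = -52.0  # assumption: the quoted noise level, taken as negative dBm

path_loss = fspl_db(800, 5800)         # 800 ft at 5.8 GHz
strong_snr = snr_db(-40.0, NOISE_DBM)  # hottest receive level we set
weak_snr = snr_db(-55.0, NOISE_DBM)    # lowest receive level we set

print(f"Free-space path loss: {path_loss:.1f} dB")
print(f"SNR at -40 dBm receive: {strong_snr:.0f} dB")
print(f"SNR at -55 dBm receive: {weak_snr:.0f} dB")
```

The path loss comes out at roughly 95dB, which is nothing for this class of radio, so distance was never the issue. The margin over the noise is: at best about 12dB, and at the low end the signal sits below the noise reading, which fits the picture of a link that looks fine on the status page and still garbles voice.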
Setting the radios up in that band seemed to fix the problem for a short time. Then ping times started jumping again along with the voice garbling. We also saw a single polarity randomly drop in modulation periodically. Checking with AirView again showed nothing. One note here: turn off all multicast options in the NanoStation M5s because they will cause random ping time issues. After running different channels for several minutes each, things seemed to settle down and the voice quality seemed stable again.
Getting back to the installation, when we made the decision to pull the plug on the project, the APs still hadn’t connected. The techs on the phone understood this wasn’t the best solution and they agreed we needed to find another. It was also 2:00pm on day 3 and we hadn’t even started the installation. We told the client that the best option was a 24GHz AirFiber solution, confirmed that with the SME who designed this project (he verbally agreed with us), and went home. There was no way I was going to put our name on a project with this AP in a high-noise, mission critical application. It apparently hadn’t been done before with this equipment, we barely got it functioning, the firmware clearly had bugs, the EIRP at the antenna was going to be too low, and there were unknown interference issues we think we might have gotten around, at least while we were there. This was enough information to make me pucker hard if this thing went up.
Apparently though, that isn’t the end of the story. The SME simply doesn’t want to believe that this won’t work. He also doesn’t want to refund $15K of equipment and have to explain his lack of analysis before selling it. So instead of doing what is right for the customer, he is going to force this system up, leave the client to maintain it, and hope it actually works in a mission critical environment. Since hope is not a strategy and apparently actual RF engineering and field testing don’t mean much to the SME, the customer is going to get hosed and we just have to walk away. The problem is that everyone, including us, assumed the SME was ethical and honest (which he had been prior to this project). When this installation fails, as it will, wireless will get another black mark and we will have burned a large manufacturing referral client. The ethical dilemma that remains is: who tells the end user they are getting a poorly engineered system from people that they trusted?
The only things we can take from this experience are the lessons we learned:
- Take a deposit on these types of projects, at least for a site survey if the referring company doesn’t have qualified staff.
- Site-survey and inventory everything that was already purchased and on-site.
- When you have to re-design anything because of someone else’s assumptions and your spidey-sense starts tingling, stop, get everyone in a room to explain the situation and be prepared to walk away.
- Don’t assume everyone, especially the SME or the company that sold the equipment, has the client’s best interest at heart or is ethical. Trust but verify, especially if you have worked with the SME for a long time. People change.
Lessons learned this way are always the ones we don’t forget.